
Effective, design-independent XML keyword search

Title: Effective, design-independent XML keyword search
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Termehchy, A., and M. Winslett
Tertiary Authors: Cheung, D., I-Y. Song, W. Chu, X. Hu, and J. Lin
Conference Name: Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM’09)
Date Published: 11/2009
Publisher: ACM Press
Conference Location: Hong Kong, China
ISBN Number: 9781605585123

Keyword search techniques that take advantage of XML structure make it very easy for ordinary users to query XML databases, but current approaches to processing these queries rely on intuitively appealing heuristics that are ultimately ad hoc. These approaches often retrieve irrelevant answers, overlook relevant answers, and cannot rank answers appropriately. To address these problems for data-centric XML, we propose coherency ranking (CR), a domain- and database design-independent ranking method for XML keyword queries that is based on an extension of the concept of mutual information. With CR, the results of a keyword query are invariant under schema reorganization. We analyze how previous approaches to XML keyword search approximate CR, and present efficient algorithms to perform CR. Our empirical evaluation with 65 user-supplied queries over two real-world XML data sets shows that CR has better precision and recall and provides better ranking than all previous approaches.
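
As background for the abstract's reference to mutual information, the following minimal Python sketch computes ordinary (pairwise) mutual information between the values of two fields. It is not the paper's coherency ranking, which the abstract describes as an extension of this concept; the function name `mutual_information` and the toy (title, author) pairs are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Return I(X; Y) in bits for a list of observed (x, y) value pairs.

    Illustrative only: this is standard mutual information, not the
    extended measure used by coherency ranking in the paper.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical toy data: values of two fields that co-occur in a data set.
correlated = [("t1", "a1"), ("t1", "a1"), ("t2", "a2"), ("t2", "a2")]
independent = [("t1", "a1"), ("t1", "a2"), ("t2", "a1"), ("t2", "a2")]

mi_correlated = mutual_information(correlated)
mi_independent = mutual_information(independent)
```

In this toy data, the first list pairs each title with exactly one author, so the two fields are strongly related and their mutual information is high; in the second list every title appears with every author equally often, so the fields carry no information about each other.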