|
Abstract : |
Modern digital libraries require user-friendly yet responsive access to a rapidly growing, heterogeneous, and distributed collection of information sources. However, the increasing volume and diversity of digital information available online have led to a growing problem that conventional data management systems do not face, namely finding which information sources, out of many candidate choices, are the most relevant to answer a given user query. We refer to this problem as the query routing problem. We introduce in this paper the notation and issues of query routing, and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. A key idea is to create and maintain user query profiles and source capability profiles independently, and to provide algorithms that use both kinds of profiles to dynamically discover the information sources relevant to a given query. These include mechanisms for interleaving query routing with query parallelization and query execution, so that pruning continues at run-time. Compared with the keyword-based indexing techniques adopted in most search engines and software, our approach offers fine-granularity interest matching, and is thus more powerful and effective for handling queries with complex conditions. |