Discovering relevant scientific literature on the Web


Author(s) : C. Lee Giles, Steve Lawrence, Kurt D. Bollacker
Publisher : N/A
Publication Date : 2000
ISSN : N/A
Abstract : boon to scientific publication. It lets researchers disseminate their reports faster and at lower cost than ever before, greatly increasing the number and diversity of easily available publications. At the same time, however, the acceleration of publication has increased the perceived information overload for researchers attempting to keep abreast of relevant research in rapidly advancing fields. Scientific literature on the Web makes up a massive, noisy, disorganized database. Unlike large, single-source databases such as a corporate customer database, the Web database draws from many sources, each with its own organization. Also, owing to its diversity, most records in this database are irrelevant to an individual researcher. Furthermore, the database is constantly growing in content and changing in organization. All these characteristics make the Web a difficult domain for knowledge discovery. To quickly and easily gather useful knowledge from such a database, users need the help of an information-filtering system that automatically extracts only relevant records as they appear in a stream of incoming records [1]. To this end, we have developed the CiteSeer digital library system [2]. CiteSeer, a custom-digital-library generator, performs,