A conceptual-modeling approach to extracting data from the web


Author(s) : 0, 
Publisher : N/A
Publication Date : 1998
ISSN : N/A
Abstract : Electronically available data on the Web is exploding at an ever-increasing pace. Much of this data is unstructured, which makes searching hard and traditional database querying impossible. Many Web documents, however, contain an abundance of recognizable constants that together describe the essence of a document's content. For these kinds of data-rich documents (e.g., advertisements, movie reviews, weather reports, travel information, sports summaries, financial statements, obituaries, and many others) we can apply a conceptual-modeling approach to extract and structure data. The approach is based on an ontology (a conceptual model instance) that describes the data of interest, including relationships, lexical appearance, and context,
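
As a rough illustration of the idea in the abstract, the sketch below pairs each item of interest with a pattern for its lexical appearance and a few context keywords, then pulls matching constants out of unstructured text. It is not the paper's implementation: the `ObjectSet` class, the `extract` function, the character-window heuristic, and the sample obituary text are all hypothetical, and relationships between object sets are left out for brevity.

```python
import re
from dataclasses import dataclass

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")


@dataclass
class ObjectSet:
    """One item of interest in a toy ontology (hypothetical structure)."""
    name: str
    value_pattern: str        # lexical appearance of the constant
    context_keywords: tuple   # words expected near a valid occurrence
    window: int = 15          # characters of context checked on each side


def extract(text, object_sets):
    """Collect constants whose surrounding text contains a context keyword."""
    record = {}
    lowered = text.lower()
    for obj in object_sets:
        for match in re.finditer(obj.value_pattern, text):
            start = max(0, match.start() - obj.window)
            end = min(len(text), match.end() + obj.window)
            nearby = lowered[start:end]
            if any(keyword in nearby for keyword in obj.context_keywords):
                record.setdefault(obj.name, []).append(match.group(0))
    return record


# Assumed ontology for obituaries: two object sets only.
ONTOLOGY = [
    ObjectSet("Age", r"\b\d{1,3}\b", ("age",)),
    ObjectSet("DeathDate", rf"\b(?:{MONTHS}) \d{{1,2}}, \d{{4}}\b",
              ("died", "passed away")),
]

# Made-up sample text, not taken from the paper.
SAMPLE = ("Jane Doe, age 83, died on March 3, 1998. "
          "Funeral services will be held March 7, 1998.")

record = extract(SAMPLE, ONTOLOGY)
print(record)
```

The context keywords are what keep the funeral date from being reported as a death date: both strings match the date pattern, but only one has "died" nearby. A fixed character window is a crude stand-in for context and would need tuning on real documents.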