Scalable discovery of informative structural concepts using domain knowledge


Author(s) : Surnjani Djoko, Lawrence B. Holder, Diane J. Cook
Publisher : N/A
Publication Date : 1996
ISSN : N/A
Abstract : Discovering repetitive and functional substructures in large structural databases improves the ability to interpret and compress the data. However, scientists working with a database in their area of expertise often search for predetermined types of structures, or for structures exhibiting characteristics specific to the domain. This paper presents a method for guiding the discovery process with domain-specific knowledge, and uses the Subdue discovery system to evaluate the benefits of that guidance. Results show that domain-specific knowledge improves the search for substructures that are useful to the domain and leads to greater compression of the data. Empirical and theoretical results also indicate that the algorithm scales to increasingly large structural databases.
Keywords : data mining, minimum description length principle, data compression, inexact graph match, domain knowledge, scalability