Automatic discovery of language models for text databases
Document filtering with inference networks
Query-based sampling of text databases