|
Abstract : |
Stochastic n-gram language models have been successfully applied in continuous speech recognition for several years. Such language models provide many computational advantages but also require huge text corpora for parameter estimation. Moreover, the texts must exactly re?ect, in a statistical sense, the user's language. Estimating a language model on a sample that is not representative severely a?ects speech recognition performance. A solution to this problem is provided by the Bayesian learning framework. Beyond the classical estimates, a Bayes derived interpolation model is proposed. Empirical comparisons have been carried out on a 10,000-word radiological reporting domain. Results are provided in terms of perplexity and recognition accuracy. 1., |