|
Abstract : |
Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy. |