How Much do Grammars Leak?

Title: How Much do Grammars Leak?
Publication Type: Conference Paper
Year of Publication: Submitted
Authors: Angelov, K
Conference Name: COLING 2012
Publication Language: English
Keywords: GF, Grammars, hybrid, MOLTO dissemination, Treebanks
Abstract

We present a large-scale evaluation of the coverage of the English Resource Grammar, developed in Grammatical Framework (GF), on the Penn Treebank. The English Resource Grammar is a wide-coverage linguistic grammar that was developed independently of the treebank, and this is the first quantitative analysis of its coverage. We measured a coverage of 94.47%, and we identified the main syntactic structures where the grammar leaks. As a side effect of the evaluation, we built a treebank for the grammar by translating the Penn Treebank to abstract syntax trees. The treebank is used in an ongoing project to train stochastic disambiguation models for the grammar.

Notes

Feedback is welcome.

Refereed Designation: Refereed
Attachment: gf-penn.pdf (229.75 KB)