Creating Linguistic Resources with GF

17 May 2010

LREC 2010 Tutorial, Malta, 17 May 2010

Aarne Ranta

Background

The tutorial gives an introduction to GF, Grammatical Framework (grammaticalframework.org), which is a special-purpose programming language for implementing grammars. GF has partly the same scope as grammar formalisms like HPSG and LFG, but differs from them by having a concept of multilingual grammars, i.e. grammars with shared semantic representations. GF is moreover a functional programming language with static typing and a powerful module system, which makes it a modern and efficient engineering tool.
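The idea of a multilingual grammar with a shared semantic representation can be illustrated with a small sketch. The following is Python, not GF code, and every name in it (Recipient, Hello, LEXICON, linearize, parse, translate) is a hypothetical helper introduced only for this illustration. It assumes a toy fragment with one greeting and two recipients, and shows how a single language-independent tree can be rendered in, and recovered from, several languages.

```python
from dataclasses import dataclass


# Abstract syntax: language-independent trees that play the role of the
# shared semantic representation.
@dataclass(frozen=True)
class Recipient:
    name: str  # "World" or "Friends" in this toy fragment


@dataclass(frozen=True)
class Hello:
    recipient: Recipient


# Concrete syntaxes: one table of word forms per language (illustrative data).
LEXICON = {
    "Eng": {"Hello": "hello", "World": "world", "Friends": "friends"},
    "Ita": {"Hello": "ciao", "World": "mondo", "Friends": "amici"},
}


def linearize(tree: Hello, lang: str) -> str:
    """Map an abstract tree to a string of the given language."""
    words = LEXICON[lang]
    return f'{words["Hello"]} {words[tree.recipient.name]}'


def parse(sentence: str, lang: str) -> Hello:
    """Map a string of the given language back to an abstract tree."""
    words = LEXICON[lang]
    greeting, noun = sentence.split()
    if greeting != words["Hello"]:
        raise ValueError(f"unknown greeting: {greeting}")
    for name in ("World", "Friends"):
        if words[name] == noun:
            return Hello(Recipient(name))
    raise ValueError(f"unknown recipient: {noun}")


def translate(sentence: str, source: str, target: str) -> str:
    """Translate by parsing in one language and linearizing in another."""
    return linearize(parse(sentence, source), target)
```

In this sketch, translation is just parsing followed by linearization through the shared tree, so adding a language only requires adding one more table. In GF itself, the corresponding roles are played by an abstract syntax and its concrete syntaxes.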

GF has dedicated constructs and rich libraries for implementing morphology and syntax. The GF Resource Grammar Library currently covers 15 languages. Partly as a result of the GF Resource Grammar Summer School in 2009 (grammaticalframework.org/summerschool.html), which gathered 30 participants from 20 countries, some 15 more languages are under construction.

The main uses of GF and the library have been in multilingual generation, spoken dialogue systems, and domain-specific translation. GF-based translation will be developed further in the European FP7 project MOLTO (Multilingual On-Line Translation, www.molto-project.eu).

We believe that GF provides an excellent platform for creating computational linguistic resources for new languages. This has been demonstrated by applications covering a wide range of languages (e.g. English, French, Finnish, Arabic, Japanese, Tswana) and further confirmed by the summer school of 2009. GF has been shown to attract talented students, who can become productive in a few hours and then create comprehensive resources in a few months. The existing language base moreover makes it possible to inherit code and experience when starting projects for new languages.

The GF software runs on all major operating systems (Linux, MacOS, Windows). GF has conversion tools that enable the reuse of grammars in several other formats, including context-free grammars for speech recognition (e.g. Nuance) and finite automata for morphology (XFST). Both the GF compiler and the grammars are available as open-source software.

The tutorial

The goal of the tutorial is to give the knowledge needed for building GF applications or starting a new language implementation. The material is divided into three one-hour lectures:

  1. The main concepts of GF and multilingual grammars
  2. Building morphology implementations and lexica
  3. Implementing syntactic rules for generation, parsing, and translation

The material covered will be an abridged version of the summer school introduction slides, which can be found at grammaticalframework.org/doc/resource-tutorial.pdf.

Prerequisites

The main prerequisites are:

  • some familiarity with programming
  • some familiarity with linguistic concepts

Experience has shown that 2-3 years of study in computer science, linguistics, or a related subject give sufficient background for learning GF. No previous knowledge of GF is presupposed.

Teacher

Aarne Ranta, Professor of Computer Science

Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, 41296 Gothenburg, Sweden

Tel. +46 31 772 10 82, Email aarne at chalmers dot se

Homepage www.cs.chalmers.se/~aarne