3.1 The MOLTO Translation Tools (TT) Editor

This section describes the GF translation editor originally developed by Bringert and Angelov at UGOT and reworked at UHEL.

To guide the development of a translation editor API that supports MOLTO translation needs, UGOT created a prototype web-based translation editor. It is implemented with the Google Web Toolkit and is usable for authoring with small multilingual grammars. Using it from the web requires only a reasonably modern web browser. A local installation additionally requires a web server, a MySQL database, and the GF services.

The editor runs entirely in the web browser, so once you have opened the web page and have documents and grammars loaded, you can continue translation editing while you are offline.

3.1.1 Software requirements

In order to install the editor, you need to have the following components:

  1. The editor code itself (in the Eclipse package)
  2. For the developer version only:
    • Eclipse Helios JEE (3.6)
    • Google Web Toolkit plugin (tested with version 2.3.1)
  3. Web server
    • Apache (tested with 2.2.14 on Ubuntu)
    • FastCGI (libapache2-mod-fastcgi)
  4. Database
    • HSQL (tested with version 1.8.1)
    • HSQL-MySQL (1.8.1) -- a slightly modified version: hsql-mysql-1.8.1-molto.zip
    • MySQL server (tested with 5.1.54 and 5.1.62)
  5. GF server
    • GF (tested with 3.3.3)
    • Haskell (tested with ghc 7.0.4, cabal-install 0.10.2)

In this section we assume that the user has already configured Apache, MySQL, and the GF server. See the Appendix for instructions on these background settings.
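Before starting the installation, it can help to check which of the required tools are already available. The following is a minimal sketch in Python; the executable names in REQUIRED_TOOLS are assumptions (for example, apache2 may live in /usr/sbin and not be on an ordinary user's PATH), and find_missing_tools is a hypothetical helper, not part of the MOLTO distribution.

```python
import shutil

# Executable names are assumptions; adjust them to the local installation.
REQUIRED_TOOLS = ["apache2", "mysql", "gf", "ghc", "cabal"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on the PATH.")
```

The check only confirms that an executable exists; it does not verify the versions listed above.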

3.1.2 Installation

3.1.2.1 Developer version

The prototype TT editor code is packaged as an Eclipse project archive (http://tfs.cc/molto/molto-tt-0.9-linux-eclipse-20120529.zip), ready for import into Eclipse (Helios).

Import the project into Eclipse; this requires the Google Web Toolkit plugin (tested with version 2.3.1). The runtime editor files are in TT-0.9/www/editor/. To install the runtime, place the following files under the Apache2 server root (here /var/www) as shown.

/var/www/editor$ ls 
grammars  index.html  org.grammaticalframework.ui.gwt.EditorApp  WEB-INF

Once the files are in place under /var/www, launch the project in Eclipse: choose Run -> Run configurations -> Web Application -> (new configuration) from the menu. In the Server tab, untick Run built-in server. If the files are in the directory /var/www/editor, the launch address is 127.0.0.1:8888/editor/index.html?gwt.codesvr=127.0.0.1:9997.
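The launch address combines the web server's host and port, the path of the editor page under the server root, and the gwt.codesvr query parameter, which names the GWT development-mode code server. A small sketch of how the parts fit together is given below; launch_url is a hypothetical helper, and the http:// scheme is added here as an assumption, since the address above is given without one.

```python
from urllib.parse import urlencode


def launch_url(host="127.0.0.1", port=8888, path="/editor/index.html",
               code_server="127.0.0.1:9997"):
    """Build the editor address; pass code_server=None to omit gwt.codesvr."""
    base = f"http://{host}:{port}{path}"
    if code_server is None:
        return base
    # Keep ':' unescaped so the host:port pair stays readable.
    return base + "?" + urlencode({"gwt.codesvr": code_server}, safe=":")


if __name__ == "__main__":
    print(launch_url())
```

If the files are placed in a directory other than /var/www/editor, only the path argument changes.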

Web server: The Apache2 fastcgi and actions modules must be enabled for the services. The installation notes at the end include a sample Apache2 virtual host that handles the services on port 8888 (the default).

GF server: The editor also requires an installation of the GF server. The server binaries are content-service (for authentication and simple MySQL database management) and pgf-service (for GF grammars). When compiling, use the cabal option --global; the GF service binaries are then installed in /usr/local/bin. They can be copied or linked into the fcgi-bin directory of the web server (by default Apache2) as follows.

/var/www/fcgi-bin$ ls -l 
content-service -> /usr/local/bin/content-service
pgf-service -> /usr/local/bin/pgf-service
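The linking step shown above can be scripted. The following is a minimal sketch, assuming the binaries are in /usr/local/bin and the target directory is /var/www/fcgi-bin as in the listing; link_services is a hypothetical helper name, and writing under /var/www normally requires appropriate permissions.

```python
from pathlib import Path

SERVICES = ("content-service", "pgf-service")


def link_services(bin_dir, fcgi_dir, names=SERVICES):
    """Create symlinks in fcgi_dir that point at the binaries in bin_dir.

    Existing entries in fcgi_dir are left untouched. Returns the list of
    links that were actually created.
    """
    fcgi = Path(fcgi_dir)
    fcgi.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        link = fcgi / name
        if link.is_symlink() or link.exists():
            continue  # do not overwrite a copy or an earlier link
        link.symlink_to(Path(bin_dir) / name)
        created.append(link)
    return created


if __name__ == "__main__":
    import sys
    # Usage: python link_services.py /usr/local/bin /var/www/fcgi-bin
    if len(sys.argv) == 3:
        for link in link_services(sys.argv[1], sys.argv[2]):
            print("created", link)
```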

Database: The TT editor back end requires an installation of MySQL, HSQL, and the Haskell library hsql-mysql by Krasimir Angelov. Further instructions on how to create a database for the MOLTO TT tools are in the installation notes.

The content service reads the MySQL database connection parameters from the file /usr/local/bin/fpath. The file should be in the same directory as content-service and contain four whitespace-separated tokens: the MySQL host name, the database name, and the database owner's user name and password.

/usr/local/bin$ cat fpath
localhost moltodb moltouser moltopass
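The four-token layout of fpath can be checked before running content-service. The sketch below reads the tokens in the order described above; DbParams and parse_fpath are hypothetical names used for illustration, and the real content-service (a Haskell program) may parse the file differently.

```python
from typing import NamedTuple


class DbParams(NamedTuple):
    host: str
    database: str
    user: str
    password: str


def parse_fpath(text: str) -> DbParams:
    """Split the contents of fpath into its four whitespace-separated tokens."""
    tokens = text.split()
    if len(tokens) != 4:
        raise ValueError(f"expected 4 tokens, found {len(tokens)}")
    return DbParams(*tokens)


if __name__ == "__main__":
    params = parse_fpath("localhost moltodb moltouser moltopass\n")
    print(params.host, params.database, params.user)
```

Because the tokens are split on whitespace, this layout cannot represent a password that contains spaces.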

Then, the database is created by typing the following:

/usr/local/bin$ ./content-service fpath

The content service handles the following requests:

  • login
  • update_grammar (grammar cache)
  • delete_grammar
  • grammars (listing)
  • save (document to MySQL database)
  • load (document)
  • search (documents)
  • delete (document)

Sign in: The prototype editor currently uses the Google authentication API for sign-in. Authentication and authorization for Google APIs allow third-party applications to get limited access to a user's Google account for certain types of activities. A user needs a Google account to sign in to the application.

3.1.2.2 User version

The user version has the same back-end requirements as the developer version. Instead of opening the package in Eclipse, it is enough to place the following files under the Apache2 server root (here /var/www) as shown.

/var/www/editor$ ls 
grammars  index.html  org.grammaticalframework.ui.gwt.EditorApp  WEB-INF

Then, to run the editor, type the address 127.0.0.1:8888/editor/index.html?gwt.codesvr=127.0.0.1:9997 into the browser.

3.1.2.3 Limitations

Ideally, the same login should work across all parts of the distributed toolkit, with a group scheme for setting group-level access restrictions. Eventually, we may want to provide MOLTO single sign-on as a replacement for Google authentication.

3.1.3 Grammar manager

The prototype editor has a simple grammar manager that is intended to let a user upload her grammars to the editor's grammar cache under her name. The cache is kept on the editor server for reasons of speed and because of cross-site scripting (XSS) restrictions. The user chooses the current grammar from among the cached grammars using a drop-down list.

3.1.3.1 Limitations

The grammar manager is not yet complete.

3.1.4 Document manager

The prototype editor has a simple document manager that saves translated documents to, and retrieves them from, the MySQL database using ContentService. The current document is saved to the database by clicking the diskette icon on the editor page. The Documents tab lists the saved documents and lets the user load a selected document to continue translating it.

3.1.4.1 Limitations

Naming of documents is not yet supported. Both the grammar manager and the document manager remain to be linked to the TMS.

3.1.5 Term manager

The TT editor includes a simple tabular equivalents editor for searching and editing translation correspondences from the web of data, including TermFactory services. The equivalents editor is an independent web application that can also be used standalone or as a plugin to other applications. When complete, the equivalents editor will let users extend their GF grammars with terms entered in the term editor and/or upload them as term proposals to TermFactory.

3.1.5.1 Installation

The equivalents editor was built with the ExtJS JavaScript library. It can be downloaded from http://tfs.cc/molto/molto-term-editor.tgz. Unpack it and put the whole molto_term_editor directory under /var/www/ (or wherever your web server expects it; on Windows, for example, the path is probably C:\Program Files\Apache\htdocs). Open the file editor_sparql.html in a browser.

Note that the equivalents editor is also included in the complete editor as one of the tabs. The two versions are functionally identical. The screenshot below is from the standalone version.

3.1.5.2 Use

The term editor consists of two tabular grids. In the first (left-side) grid, enter a term in the text input and opt for either wider or narrower concepts. With narrower concepts (the default), the editor shows on the right a second grid of the concepts that the data source (by default, Ontotext FactForge) classes as narrower than the search term, together with their designations in a predefined selection of languages. With wider concepts, the editor fills the left-side grid with the concepts that the data source classes as wider than the search term. Clicking on one of them searches for its subconcepts and terms, which are shown in the right-side grid.
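The wider and narrower searches described above can be pictured as two SPARQL queries that differ only in the direction of one triple pattern. The sketch below builds such a query string. It assumes the data source models the concept hierarchy with the SKOS vocabulary (skos:broader, skos:prefLabel), which the text does not state; concept_query and the default language list are hypothetical, and the actual queries issued by editor_sparql.html may look quite different.

```python
def concept_query(term, wider=False, languages=("en", "fi", "sv"), limit=50):
    """Build a SPARQL query for concepts narrower (default) or wider than term.

    Assumption: the data source uses SKOS for hierarchy and labels.
    """
    # Escape backslashes and double quotes so the term is a valid literal.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    lang_filter = ", ".join(f'"{lang}"' for lang in languages)
    if wider:
        hierarchy = "?match skos:broader ?concept ."
    else:
        hierarchy = "?concept skos:broader ?match ."
    return f"""PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {{
  ?match skos:prefLabel ?matchLabel .
  FILTER (lcase(str(?matchLabel)) = lcase("{escaped}"))
  {hierarchy}
  ?concept skos:prefLabel ?label .
  FILTER (lang(?label) IN ({lang_filter}))
}} LIMIT {limit}"""


if __name__ == "__main__":
    print(concept_query("wine"))
```

Each result row pairs a concept with one label, so a grid with one column per language would be assembled by grouping the rows by concept.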

The term grid is editable, and the editor remembers the user's edits to its cells.

3.1.5.3 Limitations

The data source and the choice of languages are not yet user-definable. The editor is not yet connected to the TermFactory or GF grammar back ends.

3.1.6 Editor

In the current version, there is a sign-in box and tabs for grammars, documents, editor, and terms, plus two tabs for querying and browsing the loaded grammar. The latter services are familiar from other GF front ends and are based on the GF grammar Web API.

3.1.6.1 Use

After sign-in, the editor calls content-service to show the logged-in user's grammars from the grammarusers MySQL table in the grammar list. The user chooses a domain grammar. This brings into view the initial vocabulary known to the grammar, presented as fridge magnets to choose from. Alternatively, the user can type or paste text in the editor window. At every new input, the active translation unit is sent to the back end for translation, and the set of fridge magnets is updated. When a translation unit is complete and translatable, it is simultaneously translated into all the available languages and the translations are shown on the screen (in blue). If an input is not parsable, the editor underlines the unparsable part. The user can back off to the point of deviation using backspace. In addition, there is a button for clearing the input.
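The round trip to the back end at each input can be pictured as an HTTP request to pgf-service. The sketch below only builds the request URLs and does not contact a server. The parameter names (command, input, from) follow the documented GF web service API, but the base URL and the grammar name Foods.pgf are hypothetical, and the FastCGI pgf-service shipped with GF 3.3.3 may expect a different URL layout.

```python
from urllib.parse import urlencode


def pgf_request_url(base_url, grammar, command, params):
    """Build a pgf-service request URL for one grammar and one command."""
    query = urlencode({"command": command, **params})
    return f"{base_url.rstrip('/')}/{grammar}?{query}"


def translate_url(base_url, grammar, text, source_lang):
    """Request translations of text from source_lang.

    No target language is given, on the assumption that the service then
    answers for every language in the grammar.
    """
    return pgf_request_url(base_url, grammar, "translate",
                           {"input": text, "from": source_lang})


def complete_url(base_url, grammar, text, source_lang):
    """Request possible continuations of text (candidate fridge magnets)."""
    return pgf_request_url(base_url, grammar, "complete",
                           {"input": text, "from": source_lang})


if __name__ == "__main__":
    base = "http://127.0.0.1:8888/grammars"  # hypothetical location
    print(translate_url(base, "Foods.pgf", "this fish is fresh", "FoodsEng"))
    print(complete_url(base, "Foods.pgf", "this fish is ", "FoodsEng"))
```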

The editor guides the text author by showing a set of fridge magnets and by offering autocompletion, both of which hint at how a text can be continued within the limits of the current grammar.
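The interplay of left-to-right parsing, fridge magnets and the point of deviation can be illustrated with a toy model. In the sketch below, a short list of sentences stands in for the grammar; this is an assumption made purely for illustration, since the real editor obtains its completions from a GF grammar through the back end. The names SENTENCES and analyse are hypothetical.

```python
# A toy "grammar": the language is just this finite list of sentences.
SENTENCES = [
    "this wine is good",
    "this wine is very good",
    "that cheese is fresh",
    "that cheese is expensive",
]


def analyse(words, sentences=SENTENCES):
    """Return (accepted, magnets) for a list of input words.

    accepted is the length of the longest input prefix that some sentence
    starts with (the point of deviation); magnets are the words that can
    follow that prefix.
    """
    corpus = [s.split() for s in sentences]
    accepted = 0
    for i in range(1, len(words) + 1):
        if any(s[:i] == words[:i] for s in corpus):
            accepted = i
        else:
            break  # strictly left to right: stop at the first deviation
    prefix = words[:accepted]
    magnets = sorted({s[accepted] for s in corpus
                      if s[:accepted] == prefix and len(s) > accepted})
    return accepted, magnets


if __name__ == "__main__":
    print(analyse("this wine is".split()))
    print(analyse("this cheese is".split()))
```

In the second call the input deviates after the first word, so everything from "cheese" onwards corresponds to the part the editor would underline, and the magnets returned are those available at the point of deviation.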

3.1.6.2 Limitations

The prototype gives a first rough idea of how a web-based GF translation editor could work. At present, however, it remains oriented to a very small vocabulary (fridge magnets are not apt to work well with thousands of words). It is also doubtful whether the setup is fast enough for the level of interactivity required at the speeds of professional translation. How the editor and the back end can best work together needs to be reconsidered. A related limitation is the strict left-to-right orientation of the parsing. UGOT seems to be working on a robust parser that allows other ways of combining parsing and editing. The proper disposition of the translation result has not been worked out yet.