Knowledge representation infrastructure

"Living" deliverable -

ENTER SOME EXPLANATION/MOTIVATION HERE

Then start by collecting data sets and other requirements as subsections.

Creation of NLS-GFT-OPI Corpus

The attached document describes the structure of an NLS-GFT-OPI corpus, its ingredients, and the process of creating such a corpus.

Attachment                          Size
creationOfNLS-GFT-OPICorpus2.pdf    96.72 KB

Data Sets

Please add your data set description as a subpage, following the template format given below.

Which data sets do you plan to use?

Template data set

When adding your data set, copy and paste the sections below.

Description

Original format and size

Use cases for the data sets

Characteristics

  • Do you need the entire data set or just a part of it? If a part, which one?
  • Do you need to change or extend the data set during the project?
  • Inside your applications, what is the format in which you need to use the data from this set?

Software specific requirements

Do you need any special software to create and edit the data set?

Sample queries

Describe some or all of the queries that you need to execute against the data set.

Knowledge Representation Infrastructure

The modules and functionality of the MOLTO Knowledge Representation Infrastructure (KRI) are thoroughly described in D4.1 Knowledge Representation Infrastructure.

The KRI is the data modeling and manipulation backbone of the entire project. It aims to support the semi-automatic creation of abstract grammars from ontologies, the derivation of ontologies from grammars, and the extraction of instance-level knowledge from natural language (NL). For retrieval, NL queries will be transformed into semantic queries, and the resulting knowledge will be expressed back in NL.

D4.1 Knowledge Representation Infrastructure can be found at: http://www.molto-project.eu/node/1120.

Attachment    Size
D4.1.pdf      518.37 KB

MOLTO Infrastructure at Ontotext

Query Language

UGOT, with help and feedback from Ontotext, has started a GF grammar for queries. The grammar is currently for English only and translates queries like

"Bulgarian people working at Google"

to abstract syntax trees like

MQuery (QSet (SPlural (KProp (PRelation Employed (SInd (IName Google))) (KProp (PCountry Bulgaria) Person))))
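To make the shape of such a tree easier to inspect, here is a minimal Python sketch that reads the parenthesised notation into nested (constructor, arguments) pairs. It is not part of GF or of the query grammar; the function names parse_tree and constructors are hypothetical and chosen only for this illustration.

```python
# Toy reader for the parenthesised tree notation shown above.
# Assumption: a tree is "Fun arg arg ...", where each arg is either a bare
# identifier or a parenthesised tree. This is an illustration, not GF code.

def tokenize(text):
    """Split a tree string into '(', ')' and identifier tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse_application(tokens, pos):
    """Parse 'Fun arg arg ...' starting at pos; stop at ')' or end of input."""
    head = tokens[pos]
    pos += 1
    args = []
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":
            # A parenthesised argument is itself a function application.
            arg, pos = parse_application(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1  # skip the closing ')'
        else:
            # A bare identifier is a constructor without arguments.
            arg = (tokens[pos], [])
            pos += 1
        args.append(arg)
    return (head, args), pos


def parse_tree(text):
    """Return the whole tree as nested (constructor, [arguments]) pairs."""
    tokens = tokenize(text)
    tree, pos = parse_application(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return tree


def constructors(tree):
    """List constructor names in depth-first, left-to-right order."""
    head, args = tree
    names = [head]
    for arg in args:
        names.extend(constructors(arg))
    return names
```

Calling parse_tree on the example tree gives a pair whose first element is "MQuery", and constructors lists every name in the tree, which is a quick way to see which relations (Employed), countries (Bulgaria), and individuals (Google) a query mentions.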

The same syntax tree can be expressed in hundreds of variants, such as

  • give me persons that are Bulgarian that are professionally active in Google
  • give me Bulgarian people that work at Google
  • give me people that are Bulgarian that work at Google
  • Bulgarian people employed by Google
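The number of variants grows quickly because independent wording choices multiply. The following Python sketch illustrates this with a few phrase tables taken from the examples above. The tables and the function names are assumptions made for this illustration; the actual GF grammar defines its own variants and is not reproduced here.

```python
from itertools import product

# Hypothetical phrase tables, collected from the example queries above.
PREFIXES = ["", "give me "]
KIND_NOUNS = ["people", "persons"]
EMPLOYED_PHRASES = [
    "working at {org}",
    "that work at {org}",
    "employed by {org}",
    "that are professionally active in {org}",
]


def country_phrases(adjective, noun):
    """Two ways to attach a nationality: before or after the noun."""
    return [f"{adjective} {noun}", f"{noun} that are {adjective}"]


def variants(adjective, org):
    """Enumerate every combination of the wording choices above."""
    results = set()
    for prefix, noun, employed in product(PREFIXES, KIND_NOUNS, EMPLOYED_PHRASES):
        for subject in country_phrases(adjective, noun):
            results.add(f"{prefix}{subject} {employed.format(org=org)}")
    return sorted(results)
```

With only four small choice points, variants("Bulgarian", "Google") already yields 2 × 2 × 4 × 2 = 32 strings, including all of the queries listed above. A grammar with more choice points per construction reaches hundreds of variants in the same way.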

The grammar is still incomplete and very much a work in progress, so contributions are welcome.

The code is in the GF darcs repository and can be browsed at http://code.haskell.org/gf/examples/query/ (see the README in particular).

Requirements

The requirements for the knowledge representation infrastructure have been collected from the technological and use case partners. The technological partners with experience in semantic representations also contributed requirements based on their previous experience and on the current state of the technology.