Multilingual CNL-based Semantic Wiki


Kaarel Kaljurand
Institute of Computational Linguistics, University of Zurich

SAKT-2012, Bolzano
2012-11-19

Presenter Notes

Structure of the talk

  • current wiki systems
    • existing systems, their types, their shortcomings
  • Multilingual CNL-based Semantic Wiki
  • AceWiki-GF
    • AceWiki
    • Grammatical Framework
  • multilingual ACE

Existing wiki systems

  • wiki
    • user-friendly collaborative environment for knowledge management
    • content typically unconstrained natural language (NL)
    • powered by software, e.g. MediaWiki
    • e.g. Wikipedia
  • semantic wiki (= wiki + formal semantics)
    • provides: richer query language, consistency checking (via automatic reasoning)
    • content typically NL + typed links (i.e. RDF triples)
    • software: Semantic MediaWiki, ...
  • CNL-based semantic wiki (= semantic wiki using CNL)
    • formal languages hidden (=> can use more expressive formal languages)
    • software: AceWiki
  • multilingual wiki
    • authoring in multiple (natural) languages
    • current systems: only document-level interlinking

Multilingual CNL-based Semantic Wiki

  • multiple languages
    • natural: English, German, ACE, ...
    • formal: ACE, FOL, ...
    • languages for content, UI, meta information
  • content
    • viewable/editable/queryable in multiple languages
    • automatically kept in sync
  • CNL-based
    • backed by formal grammar(s)
    • formal languages are hidden
  • semantic
    • consistency checking, question answering, ...
    • precise translation

Use cases

  • multilingual ACE wiki
    • authoring in multiple ACE-based CNLs
  • tourist phrasebook
    • book structure (ToC, chapters, index)
    • multiple languages
    • grammar editing
  • catalog of museum objects (paintings, painters)
    • each object on a separate wiki page
    • multiple languages
    • rich queries (e.g. "which Dutch painter painted which French painter?")
  • logic/math puzzles
    • multiple user solutions
    • automatically checked

Current implementation
(AceWiki-GF)

Technologies

  • AceWiki
    • collaborative environment
    • GUI (e.g. look-ahead editor)
    • storage
    • connection to ACE parser
    • connection to OWL reasoners
  • Grammatical Framework (GF)
    • multilingual grammars
    • parser (translation, completion, ...)
    • grammar editor

AceWiki

  • goal: user-friendly yet expressive semantic wiki system
  • wiki features: collaborative editing, multiple interlinked articles
  • background reasoning language: OWL
    • expressive fragment of first-order logic
    • decidable reasoning tasks: consistency checking, question answering, ...
    • complex syntax
  • front-end language: ACE
    • subset of natural English
    • well-defined translation into first-order logic / OWL
    • end-user documentation: construction and interpretation rules
  • developed by Tobias Kuhn
  • see more: http://attempto.ifi.uzh.ch/acewiki/

AceWiki article (screenshot)

Screenshot: article

Look-ahead editor (screenshot)

Screenshot: look-ahead editor

Reasoning (screenshot)

Screenshot: reasoning

Reasoning via translation to OWL

Every country that does not border a sea is a landlocked-country.

SubClassOf(
   ObjectIntersectionOf(
      :country
      ObjectComplementOf(
         ObjectSomeValuesFrom(
            :border
            :sea
         )
      )
   )
   :landlocked-country
)

Which country is a landlocked-country?

ObjectIntersectionOf(
    :country
    :landlocked-country
)
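The axiom and the query above can be mimicked on a handful of facts. This is a minimal sketch, not how AceWiki reasons: it evaluates the class expressions under a closed-world reading over hypothetical individuals (Switzerland, Austria, Italy, Mediterranean), whereas an OWL reasoner works under the open-world assumption and derives entailments.

```python
# Toy closed-world illustration; a real OWL reasoner works under the
# open-world assumption and derives entailments instead of scanning facts.

# Hypothetical facts: class membership and "border" relations.
classes = {
    "country": {"Switzerland", "Austria", "Italy"},
    "sea": {"Mediterranean"},
    "landlocked-country": set(),
}
border = {("Italy", "Mediterranean"), ("Switzerland", "Austria")}


def borders_some(individual, target_class):
    """ObjectSomeValuesFrom(:border target_class) for one individual."""
    return any(s == individual and o in classes[target_class] for s, o in border)


def apply_axiom():
    """SubClassOf(country AND NOT (border SOME sea), landlocked-country)."""
    for c in classes["country"]:
        if not borders_some(c, "sea"):
            classes["landlocked-country"].add(c)


def query():
    """ObjectIntersectionOf(:country :landlocked-country)."""
    return sorted(classes["country"] & classes["landlocked-country"])


apply_axiom()
print(query())  # ['Austria', 'Switzerland']
```

Italy is excluded because it borders an individual of class sea; the other two countries have no such border fact.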

Shortcomings

  • single natural language
    • ACE (OWL-compatible subset)
  • single formal language
    • ACE (OWL-compatible subset)
  • grammar not modifiable
    • neither the ACE grammar nor the ACE->OWL mapping can be changed
    • only content words (of predefined classes) can be added/removed
  • ambiguous content is not supported

Grammatical Framework (GF)

  • functional programming language for grammar engineering
  • parsing and generation (linearizing)
  • focus on multilinguality
    • multiple concrete grammars
    • common single abstract grammar (language-neutral)
    • translate = parse string in concrete language A to abstract tree + linearize tree as a string in concrete language B
  • special support for natural language features
    • long-distance dependencies
    • word form generation
    • Resource Grammar Library (RGL)
  • use case: defining multilingual CNLs
  • developed by Aarne Ranta et al. at the University of Gothenburg

GF example (3 grammar modules)

abstract Unitconv = {
  flags startcat = Unitconv ;
  cat Unit ; Unitconv ;
  fun
    unitconv : Unit -> Unit -> Unitconv ;
    land_mile, nautical_mile : Unit ;
}

concrete UnitconvEng of Unitconv = {
  lincat Unit, Unitconv = {s : Str} ;
  lin
    unitconv x y = {s = "how much is" ++ x.s ++ "in" ++ y.s ++ "?"} ;
    land_mile = {s = "mile"} ;
    nautical_mile = {s = "nautical mile" | "mile"} ;
}

concrete UnitconvWolfram of Unitconv = {
  lincat Unit, Unitconv = {s : Str} ;
  lin
    unitconv x y = {s = "convert" ++ x.s ++ "to" ++ y.s} ;
    land_mile = {s = "mile"} ;
    nautical_mile = {s = "nmi"} ;
}

GF parsing and linearizing

Parsing, i.e. converting a string ("how much is nautical mile in mile ?") to tree(s):

Unitconv> parse -lang=Eng "how much is nautical mile in mile ?"

unitconv nautical_mile land_mile
unitconv nautical_mile nautical_mile

Linearization, i.e. converting a tree (unitconv nautical_mile land_mile) to string(s):

Unitconv> linearize -treebank -list (unitconv nautical_mile land_mile)

UnitconvEng: how much is nautical mile in mile ?, , how much is mile in mile ?
UnitconvWolfram: convert nmi to mile

Translation, i.e. parse + linearize:

Unitconv> parse -lang=Eng "how much is nautical mile in mile ?" | l -lang=Wolfram

convert nmi to mile
convert nmi to nmi
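The parse/linearize/translate pipeline can be imitated outside GF. The following is a minimal sketch, assuming the Unitconv grammar from the previous slide; it enumerates all trees by brute force instead of using a real parser, and the names LEXICON, TEMPLATE, all_trees, linearize, parse and translate are illustrative, not part of GF.

```python
from itertools import product

UNITS = ["land_mile", "nautical_mile"]

# Each concrete syntax maps an abstract function to its string variants,
# mirroring the `lin` rules of UnitconvEng and UnitconvWolfram.
LEXICON = {
    "Eng": {"land_mile": ["mile"], "nautical_mile": ["nautical mile", "mile"]},
    "Wolfram": {"land_mile": ["mile"], "nautical_mile": ["nmi"]},
}
TEMPLATE = {
    "Eng": "how much is {x} in {y} ?",
    "Wolfram": "convert {x} to {y}",
}


def all_trees():
    """Enumerate every tree `unitconv x y` of the abstract syntax."""
    return [("unitconv", x, y) for x, y in product(UNITS, UNITS)]


def linearize(tree, lang, all_variants=False):
    """Turn a tree into its string(s); by default only the first variant."""
    _, x, y = tree
    xs, ys = LEXICON[lang][x], LEXICON[lang][y]
    if not all_variants:
        xs, ys = xs[:1], ys[:1]
    return [TEMPLATE[lang].format(x=a, y=b) for a, b in product(xs, ys)]


def parse(text, lang):
    """Return all trees that have `text` among their linearizations."""
    return [t for t in all_trees() if text in linearize(t, lang, all_variants=True)]


def translate(text, source, target):
    """Parse in the source language, then linearize each tree in the target."""
    return [s for t in parse(text, source) for s in linearize(t, target)]


print(translate("how much is nautical mile in mile ?", "Eng", "Wolfram"))
# ['convert nmi to mile', 'convert nmi to nmi']
```

Because "mile" is a variant of nautical_mile in English, the input has two trees, and the ambiguity carries over into two Wolfram strings.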

AceWiki integration with GF

  • multilingual viewing and editing of wiki content
    • grammar-based editing (show next possible tokens)
  • a wiki entry is a set of GF abstract trees
    • viewed via linearization(s)
    • can represent ambiguity
  • access to multiple online GF grammars
    • provided by GF Webservice
    • single grammar per wiki
  • grammar integrated into the wiki
    • wiki-linking of grammar and content
    • grammar can be updated while building the wiki
  • multilingual ACE grammar implemented in GF
    • other GF grammars can be used instead (no ACE-based reasoning in this case)
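Two of the points above, look-ahead editing and entries stored as tree sets, can be sketched on the toy Unitconv grammar. This is an illustration under stated assumptions, not the AceWiki-GF implementation: the class Entry and the functions sentences and next_tokens are hypothetical names, and the sentence table is built by brute-force enumeration rather than by the GF Webservice.

```python
from itertools import product

UNITS = ["land_mile", "nautical_mile"]
LEXICON = {
    "Eng": {"land_mile": ["mile"], "nautical_mile": ["nautical mile", "mile"]},
    "Wolfram": {"land_mile": ["mile"], "nautical_mile": ["nmi"]},
}
TEMPLATE = {"Eng": "how much is {x} in {y} ?", "Wolfram": "convert {x} to {y}"}


def sentences(lang):
    """Map every sentence of the toy language to the set of trees it covers."""
    table = {}
    for x, y in product(UNITS, UNITS):
        for a, b in product(LEXICON[lang][x], LEXICON[lang][y]):
            table.setdefault(TEMPLATE[lang].format(x=a, y=b), set()).add((x, y))
    return table


def next_tokens(prefix, lang):
    """Look-ahead: tokens that can follow `prefix` in some sentence."""
    words = prefix.split()
    found = set()
    for sentence in sentences(lang):
        tokens = sentence.split()
        if tokens[:len(words)] == words and len(tokens) > len(words):
            found.add(tokens[len(words)])
    return sorted(found)


class Entry:
    """A wiki entry stored as a set of abstract trees, not as a string."""

    def __init__(self, text, lang):
        self.trees = frozenset(sentences(lang).get(text, set()))

    def view(self, lang):
        """Render the entry in `lang`, one string per stored tree."""
        return sorted(
            TEMPLATE[lang].format(x=LEXICON[lang][x][0], y=LEXICON[lang][y][0])
            for x, y in self.trees
        )


entry = Entry("how much is nautical mile in mile ?", "Eng")
print(next_tokens("how much is", "Eng"))  # ['mile', 'nautical']
print(entry.view("Wolfram"))  # ['convert nmi to mile', 'convert nmi to nmi']
```

Storing trees instead of strings is what lets an ambiguous English sentence stay ambiguous in the wiki and show up as two separate renderings in another language.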

Article with Editor (screenshot)

Grammar module page (screenshot)

Multilingual ACE

ACE

  • goal: user-friendly language for formal knowledge engineering
  • subset of natural English
  • translatable into Discourse Representation Structures (DRS)
    • and further into standard first-order logic, OWL, various rule languages
    • enables automatic reasoning, e.g. consistency checking, question answering, ...
  • verbalization of formal languages
    • DRS, OWL
  • end-user documentation: construction and interpretation rules
  • editing environments: AceWiki, ACE Editor, ACE View, ...

Multilingual ACE

An ACE grammar in GF/RGL adds multiple natural languages as front-ends to ACE.

Multilinguality

ACE in GF

  • implementation of the ACE syntax (i.e. no DRS generation)
    • extension of Angelov and Ranta (CNL 2009)
  • available in 15 natural languages via the RGL
    • Catalan, Danish, Dutch, English, Finnish, French, German, Italian, Latvian, Norwegian, Polish, Romanian, Russian, Spanish, Swedish
    • design makes the grammar easy to extend
  • status
    • focus on the subset of ACE that is used in AceWiki (almost 100% coverage at almost 0% ambiguity)
    • some precision problems, e.g. anaphoric references do not obey DRS accessibility constraints
    • ambiguity and coverage problems in some languages
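One way to make the coverage and ambiguity figures concrete is to count parses per test sentence. The definitions below are an assumption for illustration (the slides do not say how the numbers were computed): coverage is the share of sentences with at least one parse, and ambiguity is the share of parsed sentences with more than one. The parse counts are hypothetical.

```python
def coverage_and_ambiguity(parse_counts):
    """Return (coverage, ambiguity) for a list of per-sentence parse counts.

    Assumed definitions: coverage is the share of sentences with at least
    one parse; ambiguity is the share of parsed sentences with more than one.
    """
    if not parse_counts:
        return 0.0, 0.0
    parsed = [n for n in parse_counts if n > 0]
    coverage = len(parsed) / len(parse_counts)
    ambiguity = sum(1 for n in parsed if n > 1) / len(parsed) if parsed else 0.0
    return coverage, ambiguity


# Hypothetical parse counts for five test sentences.
print(coverage_and_ambiguity([1, 1, 2, 0, 1]))  # (0.8, 0.25)
```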

Translation example

p -lang=Ace "if a person admires no golfer then the person buys
    at least 2 aquariums that nothing but travelers inspect ." | l

si una persona no admira cap golfista llavors la persona compra
    almenys 2 aquarins que només viatgers inspeccionen .

als een persoon geen golfer bewondert , dan koopt de persoon
    ten minste 2 aquaria die slechts reizigers inspecteren .

jos henkilö ei ihaile mitään golfaajaa niin henkilö ostaa
    vähintään 2 akvaariota jonka vain matkustajat tarkastavat .

si une personne n' admire aucun golfeur alors la personne achète
    au moins 2 aquariums que seulement des voyageurs inspectent .

wenn eine Person keinen Golfer bewundert , dann kauft die Person
    wenigstens 2 Aquariume die nur Reisenden inspizieren .

si una persona non ammira nessuno giocatore di golf allora la persona compra
    almeno 2 acquari che soltanto viaggiatori ispezionano .

si una persona no admira hacia golfista entonces la persona compra
    al menos 2 acuarios que solamente viajeros inspeccionan .

om en person beundrar inget golfspelare så personen köper
    minst 2 akvariumar som bara resenärar avsynar .

Links
