Ontotext: Productize MOLTO Technology in novel areas
Publishing platforms and Digital Journalism
Cross-media analytics, as developed at Ontotext, is a typical case of business intelligence. Ontotext's technology primarily (but not exclusively) serves publishing agencies (such as Press Association, NDP, Oxford, etc.) and government data management (US government). In terms of languages, the company has worked systematically on commercial projects for Dutch and English, and has recently started to expand its multilingual coverage to Bulgarian, German, Chinese, etc. Given these facts, the GF formalism and the RDF-GF interoperability from MOLTO would be a natural extension of the information extractors, facilitating the interaction between users' queries and their machine processing. More precisely, the following extensions are envisaged: an embedded translator service, tuned to the domain (sports, finance, politics, etc.); and an embedded converter from RDF representation to GF and then to natural language, and vice versa.
Internally, the semantic annotation tool will also be augmented with language localization modules to support the annotators.
Related markets: publishing; electronic government
SWOT Analysis:
Strengths: Improvement of the existing multilingual modules; new functionality for customers, such as viewing the same result in various languages; improved annotation process and text analytics; better communication between the ontology and user queries.
Weaknesses: Domain adaptation of the MOLTO modules might be needed when addressing a new domain, or even a subdomain of a specific domain.
Opportunities: It may be possible to create a new generation of publishing platform that provides a typological core for many languages and is therefore easily adaptable to new ones. Additionally, being able to read Dutch news in English within the publishing system itself, for example, would be a considerable convenience for customers.
Threats: Online real-time applications might be unstable initially because of the complex architecture.
Social Media
In Ontotext projects, existing Linked Open Data (LOD) resources (such as Linked Life Data) are applied to different socially aware domains and across languages. For example, the entity extraction tool LUPedia and the linked data concept store FactForge will be used to enhance socially marked knowledge. These modules will be extended with the language generation tool from MOLTO in order to improve the accuracy of the extracted information. This step is manageable, since the MOLTO rule-based translation technology is extended with the help of statistical approaches.
Related markets: education, tourism
SWOT Analysis:
Strengths: MOLTO makes it possible to apply a structured approach to unstructured data in order to make sense of large amounts of data.
Weaknesses: MOLTO might support some forms of social media (publicly available) better than others (restricted).
Opportunities: Social media might be viewed as a network of subdomains and addressed by MOLTO technology step by step.
Threats: At the moment, there is no foreseeable way to use MOLTO modules directly in sentiment and opinion analysis.
Pharma
Ontotext regularly participates in projects on health care and life sciences data management (a Life Science project is currently running). Here the available domain ontologies are explored together with NLP processing. GF will be extremely useful, since both the prescription and diagnosis languages are controlled. There is an additional level of translation here, namely from the specialized prescription and anamnesis language of doctors into the common natural language of the users.
Related markets: medical product sales, health care
SWOT Analysis:
Strengths: MOLTO performs best in controlled and structured domains. Pharma is a good example of such a combination. In addition, there is already a working prototype on patents in this domain.
Weaknesses: Pharma is easier to handle from the perspective of doctors' text production than from the patient perspective, since professional language is better controlled.
Opportunities: Improvement of multilingual search and of the relevance of the search results.
Threats: Pharma is already one of the most thoroughly addressed domains from a processing point of view. Thus, the real added value of MOLTO remains to be tested.