Ontotext’s business model combines the development of products (including some open source versions) with the provision of research, consultancy and development services. Many commercial projects combine all four elements. For Ontotext, MOLTO offers a unique opportunity to strengthen its position in the semantic technologies and knowledge-driven text analytics market through the development and adoption of intelligent methods that support ontology-based multilinguality. This is possible because MOLTO adds the GF formalism to the semantic technologies; GF operates as an interlingua at the language level and can therefore localize the ontologies appropriately. More precisely, the main directions of future development will be as follows:
The business strategies will be as follows:
Ontotext AD is the strongest semantic technologies company in Europe and a world-leading supplier of core semantic technology, text mining and web mining solutions.
We have an unmatched portfolio of world-class technology and expertise in:
The main differentiator between Ontotext and other semantic technology vendors is that we deliver robust technology, proven in multiple high-profile projects that demonstrate its maturity and usability. The best example is the use of OWLIM (our RDF database engine) in the BBC FIFA World Cup 2010 website, where most of the pages were generated dynamically through queries to OWLIM: millions of requests per day and hundreds of updates per hour, handled by a cluster of a few servers. Following the success of this project, the BBC extended the use of Ontotext technology to the BBC Sport website and the London Olympics 2012 website.
Ontotext’s clients span several sectors:
Given the substantial number of Ontotext clients in the UK, we run regular open training courses in London, “Semantic Technologies with OWLIM”, scheduled roughly once per quarter.
On the one hand, the research goes into products through the traditional ways:
On the other hand, the technology developed within the project is applicable to other related areas of NLP services. It can either be used in stand-alone applications or be integrated into larger and more complex architectures. Both business opportunities have significant added value.
The first direction is exemplified by the envisaged use case domains: patents in the medical domain and artefacts in the cultural heritage domain. The second direction targets areas that rely heavily on Question Answering, Information Retrieval and MT, such as Publishing, Social Media and Pharma. The related products are highly commercial, so the precision and relevance of the retrieved information are crucial features for the clients. The GF formalism would be useful for smoothing the multilingual retrieval and translation results. It must be noted that the component shared by all targeted Ontotext products is the ontology-based knowledge that relates to LOD and multilingual settings.
All the EU research projects that Ontotext has been involved in have led to the improvement of the current technology as well as to the creation of new products, which have then been used in commercial projects. In this way, Research can be viewed as an Investigation, Preparation and Compilation phase, and the applications in Industry as an Adaptation, Harmonization and Real Setting evaluation phase. Some synergies of this kind are given below:
RENDER is an ongoing project that aims at providing a comprehensive conceptual framework and technological infrastructure for enabling, supporting, managing and exploiting information diversity in Web-based environments. It would also leverage very large amounts of content and metadata: news, blog and microblog streams, content and logs from Wikipedia, news archives, multimedia content and reader comments, discussion forums, etc. This data is managed by a highly scalable data management infrastructure and enriched with machine-understandable descriptions and links referring to the Linked Open Data Cloud. This development would enable the OWLIM and KIM technologies to handle diverse data, widening their data coverage and management capabilities.
CUBIST is an ongoing project that aims at Combining and Uniting Business Intelligence and Semantic Technologies, with a special focus on unstructured data mining. Semantic technology is central to the project goals and supports a persistent layer, a semantic Data Warehouse. The project also contributes to better Visual Analytics, whose improved characteristics would be important for providing more competitive user interfaces in industry.
MediaCampaign is innovative in ontology creation for cross-media modelling of media presence and campaigns; semantic cross-market product data interlinking; and the identification and tracking of new media campaigns in different media and countries. MediaCampaign focused mainly on advertisement campaigns and their impact on attitudes and opinions. Thus, the publishing services provided by Ontotext will be enriched with sentiment analysis in addition to the knowledge-based analysis, giving Ontotext a socially aware service.
The NoTube project concentrated on personalized semantic news, a personalized TV guide with adaptive advertising, and Internet TV in the Social Web. It relied on the key role of semantic technologies, took the community aspects into account and was built on multilinguality. The results strengthened the personalized component of the retrieved information in commercial publishing platforms.
The PHEME project (due to start in October 2013) has as its main goal the development of scalable methods for Social Semantic Intelligence across media and languages. It also aims at modeling not only facts and opinions but also the parameters of reliability of the information sources. Additionally, PHEME focuses on more concrete and socially crucial cases of recent years, such as crowdsourcing, citizen journalism and bioinformatics. PHEME goes beyond official media campaigns to social network dimensions, and beyond opinions to rumour and misinformation detection. This project will lead to large-scale, social-media-oriented OWLIM and KIM platforms. It will also add the identification of misinformation to their services, which would be a very valuable facility in the personalized component for end users.
In all the productization areas listed below, the following underlying NLP technology is assumed:
With globalization and the harmonization of large groups of documents in the EU, the requirements for particular data management systems are growing rapidly. Additionally, virtual space has become more populated, shared, explored and multilingual. Examples include virtual tours of famous museums, virtual storage of and access to EU legislation, interactive online digests, electronic government, digital preservation storage and social networks.
For these reasons, it is not surprising that some of the most active business domains at the moment are Cultural Heritage (DARIAH, CLARIN, EUROPEANA), Pharma (AstraZeneca), Media Publishing (BBC, NDP, Press Association) and Social Media (the PHEME project).
Ontotext is involved in all of the aforementioned domains through research projects and commercial projects.