NLP and Knowledge Base
The broad success of quantitative methods such as deep learning in NLP sometimes risks downplaying the importance of the explicit, symbolic knowledge required for many NLP tasks: good named entity recognition (NER), for instance, not only needs to recognize entities (where learning-based methods are important), but also to normalize, disambiguate and, where possible, link the recognized entities to background knowledge. That often requires explicit knowledge.
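The normalize/disambiguate/link steps can be sketched in a few lines. The toy knowledge base below is purely illustrative (a real system would draw on a resource such as Wikidata); the entry for the city of Paris uses its actual Wikidata identifier Q90, while the other identifier and the context keywords are hypothetical:

```python
# Toy knowledge base: surface form -> candidate entities, each with a
# Wikidata-style ID, a label, and context keywords for disambiguation.
# Only Q90 (Paris, the city) is a real Wikidata ID; the rest is illustrative.
KB = {
    "paris": [
        {"id": "Q90", "label": "Paris (city)",
         "context": {"france", "city", "capital"}},
        {"id": "Q_PARIS_MYTH", "label": "Paris (mythological figure)",
         "context": {"troy", "helen", "myth"}},
    ],
}

def normalize(mention: str) -> str:
    """Normalize a recognized mention into the KB's lookup key."""
    return mention.strip().lower()

def link(mention: str, sentence: str):
    """Pick the candidate whose context keywords best overlap the sentence."""
    candidates = KB.get(normalize(mention), [])
    if not candidates:
        return None
    words = set(sentence.lower().split())
    return max(candidates, key=lambda c: len(c["context"] & words))

best = link("Paris", "Paris is the capital of France")
print(best["id"])  # -> Q90 with this toy KB
```

Real systems replace the keyword-overlap heuristic with learned context models, but the overall shape — recognize, normalize, generate candidates, disambiguate, link to an ID — stays the same.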
The term “knowledge acquisition bottleneck” refers to the fact that it is often difficult (and costly) to ensure that the required knowledge is available and present in the right formats for NER algorithms.
This is where Wikidata comes in…
At Kairntech we have implemented NER in a way that benefits broadly from public sources like Wikipedia and Wikidata. These sources are constantly updated, so users always work with up-to-date knowledge when analysing content. Recognized entities are disambiguated (there may be another person, another location or another concept with the same name) and linked to background information, thus enriching the analysed content.
In the example above we see that this service, updated automatically on March 3, 2020, therefore has access to the latest information on Brexit, which after lengthy negotiations finally took place at the end of January 2020. The latest information on this entity, as well as on tens of millions of others in several languages, is available to the Named Entity Recognition component of our Kairntech platform.