Named Entity Recognition with Wikidata: always up to date!

Natural language processing and named entity recognition with Wikidata knowledge

The broad success of quantitative methods such as deep learning in NLP sometimes risks downplaying the importance of the explicit, symbolic knowledge required for many NLP tasks. Good named entity recognition (NER), for instance, needs not only to recognize the entities (where learning-based methods are important), but also to normalize, disambiguate and, if possible, link the recognized entities to background knowledge. That often requires explicit knowledge.

Wikidata extraction

The term “knowledge acquisition bottleneck” refers to the fact that it is often difficult (and costly) to ensure that the required knowledge is available and present in the right formats for NER algorithms.

When Wikidata comes in…

At Kairntech we have implemented NER in a way that draws broadly on public sources such as Wikipedia and Wikidata. Because these sources are constantly updated, users always benefit from up-to-date knowledge when analysing content. Recognized entities are disambiguated (there may be another person, location or concept with the same name) and linked to background information, thus enriching the analysed content.
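To make the linking step more concrete, here is a minimal Python sketch of how a recognized mention could be looked up against Wikidata to obtain candidate entities. It is not the Kairntech implementation, which is not described here. It assumes the public Wikidata `wbsearchentities` API; the helper names `build_search_url`, `extract_candidates` and `fetch_candidates` and the User-Agent string are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public Wikidata API endpoint (assumption: not necessarily what Kairntech uses).
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(mention, language="en", limit=5):
    """Build a wbsearchentities request URL for a recognized mention."""
    params = {
        "action": "wbsearchentities",
        "search": mention,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def extract_candidates(payload):
    """Reduce an API response to (id, label, description) candidates."""
    return [
        {
            "id": hit["id"],
            "label": hit.get("label", ""),
            "description": hit.get("description", ""),
        }
        for hit in payload.get("search", [])
    ]


def fetch_candidates(mention, language="en", limit=5):
    """Query Wikidata over the network and return candidate entities."""
    request = Request(
        build_search_url(mention, language, limit),
        # Hypothetical User-Agent; identify your own client when calling the API.
        headers={"User-Agent": "ner-linking-sketch/0.1"},
    )
    with urlopen(request, timeout=10) as response:
        return extract_candidates(json.load(response))


if __name__ == "__main__":
    # Several candidates may share a name; their descriptions are the
    # background knowledge a disambiguation step would compare with context.
    for candidate in fetch_candidates("Brexit"):
        print(candidate["id"], candidate["label"], candidate["description"])
```

The sketch only retrieves candidates. Choosing the right one among entities that share a name (the disambiguation step) would additionally compare each candidate's description with the surrounding text, which is left out here.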

[Figure: Wikidata extraction example]

The example above shows that this service, updated automatically on March 3, 2020, has access to the latest information on Brexit, which after lengthy negotiations finally took place at the end of January 2020. The latest information on this entity, as well as on tens of millions of others in several languages, is available to the Kairntech Named Entity Recognition, which is part of our Kairntech platform.

Kairntech maintains online instances of the platform for tests, demos and experiments. Feel free to contact us at info@kairntech.com to discuss your first steps with the Kairntech platform.