Updating structured vocabularies: a necessary but difficult task
Structured vocabularies (lexicons, dictionaries, thesauri, taxonomies…) and, more broadly, knowledge bases, play an important role in many applications when it comes to organizing information or making it accessible. But they still need to be constantly updated, which is a long, tedious and difficult task. A good example is …
Continue reading Discover and enrich knowledge by analysing documents with AI
Tag: Business vocabularies
Business vocabularies, also called lexicons, are a component of NLP pipelines. They are sets of words in a given language, together with grammatical and semantic information about those words. Typical examples are lists of people, organizations, locations or products.
With Kairntech Studio you can import business vocabularies to create a text annotator (the official term is gazetteer).
Vocabularies can then be used to automatically annotate documents and help you jump-start the creation of an annotated dataset in order to train a Machine Learning model. Few-shot learning with Large Language Models may also be used in this context.
Vocabularies can be used either to create annotations or to suppress unwanted ones. Consolidating the different annotations and normalizing them are part of the AI Pipelines within the Kairntech solution.
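To make the idea of a vocabulary-based annotator more concrete, here is a minimal sketch in plain Python. It is not Kairntech's implementation or API: the names `Annotation`, `annotate` and `stoplist` are hypothetical, and the matching strategy (case-insensitive whole-word lookup, longest term first) is an assumption chosen for illustration. The `stoplist` argument illustrates using a vocabulary to avoid annotations.

```python
import re
from typing import Iterable, List, Mapping, NamedTuple


class Annotation(NamedTuple):
    """One annotated span (hypothetical structure, for illustration only)."""
    start: int
    end: int
    text: str
    label: str


def annotate(
    text: str,
    vocabulary: Mapping[str, Iterable[str]],
    stoplist: Iterable[str] = (),
) -> List[Annotation]:
    """Annotate `text` with the terms of a {label: [terms]} vocabulary.

    Terms listed in `stoplist` are never annotated, even if they
    appear in the vocabulary.
    """
    # Map each lowercased term to its label.
    term_to_label = {
        term.lower(): label
        for label, terms in vocabulary.items()
        for term in terms
    }
    if not term_to_label:
        return []

    # Try longer terms first so that "European Commission" is preferred
    # over a shorter term that starts at the same position.
    alternatives = sorted(term_to_label, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in alternatives) + r")\b",
        re.IGNORECASE,
    )

    blocked = {term.lower() for term in stoplist}
    annotations = []
    for match in pattern.finditer(text):
        surface = match.group(0)
        if surface.lower() in blocked:
            continue  # vocabulary used to avoid an annotation
        annotations.append(
            Annotation(match.start(), match.end(), surface,
                       term_to_label[surface.lower()])
        )
    return annotations


# Example vocabulary and sentence (made up for this sketch).
vocabulary = {
    "ORG": ["Kairntech", "European Commission"],
    "LOC": ["Paris"],
}
sentence = "The European Commission met Kairntech in Paris."
for annotation in annotate(sentence, vocabulary):
    print(annotation)
```

Annotations produced this way could serve as a first pass that a human reviewer corrects before training a model. A real gazetteer would typically also handle inflected forms, synonyms and overlapping matches, which this sketch leaves out.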
For more detailed information:
The European green taxonomies
Taxonomy is a word that suddenly became familiar to financial investors and the public at large when the European Commission published its own taxonomy on sustainable activities. Let's quickly recall that taxonomy is a term that originates from biology and designates naming in an ordered system that is intended …
Continue reading Taxonomy and NLP
We have recently shown you how Kairntech can be used to automate part of the effort in vocabulary maintenance: finding new candidate terms to update and extend your vocabularies. We have turned that into a video (5 min) that explains the process and shows you how to do this using the software. https://www.youtube.com/watch?v=s3Cqzl67Fms
Maintain business vocabularies with AI
Structured business vocabularies (thesauri, taxonomies...) play an important role in many applications where complex, large and volatile information needs to be organized and made accessible. A fine example is the famous MeSH thesaurus, which facilitates search and access on medical topics. Enriching scientific content with MeSH terms makes it possible to guarantee …
Continue reading Finding new needles in content haystacks with business vocabularies
Introduction
Machine Learning approaches in NLP have been shown to be able to solve a wide range of tasks after being trained from scratch on an appropriate training corpus. While this is impressive, it often does not correspond to the demand in many real-world scenarios. Often relevant prior knowledge exists – in the case of …
Continue reading Jumpstart your Machine Learning efforts by importing structured knowledge
Introduction
Information extraction tends to target two situations: extracting entities from an existing vocabulary, or creating an extraction model from scratch when there is no existing vocabulary. However, sometimes the situation is a mixture of the two extremes: an incomplete business vocabulary exists but needs to be completed with relevant additional entities of the same type. …
Continue reading Bootstrap World Knowledge to extend Business Vocabulary and enhance Knowledge Graphs
In a few words
Due to the special situation we are all experiencing, the AI SDV conference this year took place fully online. Two days of intensive exchange about NLP / AI approaches, requirements and use cases. Participants from three continents attended the event. Organizer Christoph Haxel had declared: "The show must go on ... line" …
Continue reading 2020 AI SDV: Large-scale thesaurus-based Entity Extraction