Kairntech Software

Kairntech helps you to build, industrialize and maintain trustworthy GenAI language assistants.

See also: Kairntech Consulting

1) Hosted deployment: hosting fees are added (GPU optional).

For non-production instances (development & integration, pre-production…) or other specific requests, please contact us.

Supported languages: English, French, German, Spanish, Italian, Dutch, Portuguese, Russian, Arabic, Chinese, Hindi, Urdu, Japanese, Persian.


Key Features & NLP Tasks

Supported languages vs. NLP tasks

NLP task | Western languages | Non-Western languages
Core features (GUI, search, manual annotation…) | Yes | Yes
Language identification | Yes | Yes
Token classification (date, amount, address, phrase…) | Yes | Yes
Named Entity Recognition (person, location, organization, disease…) | Yes | Yes
Sentence classification | Yes | Yes
Text classification | Yes | Yes
Entity Linking (Wikidata/Wikipedia) | English, French, German, Spanish, Italian, Portuguese, Swedish + language on demand | Arabic, Japanese, Russian, Ukrainian, Chinese, Bengali, Hindi, Persian + language on demand
Entity Linking (lexicon, business vocabulary) | Yes | Yes
Semantic textual similarity | Yes | Partially
Question answering – RAG | Yes, but may depend on the third-party solutions used | Yes, but may depend on the third-party solutions used
Text summarization (2) | Yes, but may depend on the third-party solutions used | Yes, but may depend on the third-party solutions used
Paraphrase generation | Yes | Yes
Data augmentation | Yes | Yes
Sentiment analysis (polarity, emotion) | Yes | Yes
Intent detection & slot filling | Coming soon | Coming soon
Relationship extraction | Coming soon | Coming soon
Co-reference resolution | Coming soon | Coming soon
Automatic Speech Recognition (2) | Yes, but may depend on the third-party solutions used | Yes, but may depend on the third-party solutions used
Machine translation (2) | Yes, but may depend on the third-party solutions used | Yes, but may depend on the third-party solutions used

Core engines

Text classification engines:
- Scikit-learn: MultinomialNB, ComplementNB, SVC, LinearSVC, LogisticRegression, MLPClassifier, RandomForestClassifier, DecisionTreeClassifier, GradientBoostingClassifier, XGBClassifier, KerasMLPClassifier…
- spaCy with Transformer models
- Flair with static embeddings (fastText…), Flair embeddings…
- Transformers: almost all model types & model names from Hugging Face Hub
- fastText
- BERTopic

Text clustering engines:
- BERTopic

Token classification & NER engines:
- CRF-Suite
- spaCy with Transformer models
- DeLFT: BidLSTM-CRF, BidGRU-CRF with ELMo embeddings
- Flair: optimizers (SGD, Adam…), RNN type (LSTM, GRU) with static embeddings (fastText…), Flair embeddings…
- Transformers: almost all model types & model names from Hugging Face Hub

Lexicon-based engines:
- PhraseMatcher
- EntityRuler
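
To give a feel for what a lexicon-based engine does, here is a minimal sketch of dictionary matching using only the Python standard library. It is not the PhraseMatcher or EntityRuler implementation named above and not Kairntech's code; the function names, the sample lexicon and the labels are all hypothetical, chosen for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical helper: compile a business vocabulary into one regular expression.
def build_lexicon_pattern(lexicon: Dict[str, str]) -> "re.Pattern[str]":
    # Sort longest terms first so "New York City" is preferred over "New York".
    terms = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

# Hypothetical helper: return (start, end, matched text, label) for each hit.
def match_lexicon(text: str, lexicon: Dict[str, str]) -> List[Tuple[int, int, str, str]]:
    if not lexicon:
        return []
    pattern = build_lexicon_pattern(lexicon)
    labels = {term.lower(): label for term, label in lexicon.items()}
    return [
        (m.start(), m.end(), m.group(0), labels[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]

# Illustrative lexicon and text (made up for this sketch).
sample_lexicon = {
    "New York": "LOCATION",
    "New York City": "LOCATION",
    "Acme Corp": "ORGANIZATION",
}
sample_text = "The meeting in new york city was hosted by Acme Corp."
matches = match_lexicon(sample_text, sample_lexicon)
```

The sketch matches case-insensitively on whole words and does not handle tokenization, inflection or overlapping entries, which dedicated lexicon engines typically address.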

Core components

Document Converters:
- Tika (PDF, Office, HTML…)
- Whisper (speech to text)
- Deeptranscript (2) (speech to text)
- OCRmyPDF (scanned PDF to text)
- Grobid (scholarly documents)
- Inscriptis (HTML to text)
- PubMed XML (biomedical abstracts)
- NewsML-G2 XML (news)
- Transformer models (speech to text)
- Custom converter (on demand)

Document Segmenters (chunking):
- Microsoft BlingFire
- Regular expression segmenter
- PySBD segmenter
- spaCy Rules segmenter
- Segmentation pipelines
- Custom segmenter (on demand)

Output Formatters:
- JSON
- Tabular (CSV, Excel)
- Custom formatter (on demand)
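
As an illustration of the segmentation (chunking) step listed above, the following is a minimal sketch of a regular-expression sentence splitter followed by size-bounded grouping into chunks. It uses only the Python standard library; the boundary pattern, the function names and the `max_chars` parameter are assumptions made for this example and do not describe the platform's actual segmenters.

```python
import re
from typing import List

# Assumed boundary rule: split on whitespace that follows ., ! or ?
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

def split_sentences(text: str) -> List[str]:
    """Split text into sentences with a single regular expression."""
    return [s.strip() for s in SENTENCE_BOUNDARY.split(text) if s.strip()]

def group_into_chunks(sentences: List[str], max_chars: int = 200) -> List[str]:
    """Greedily pack consecutive sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = sentence if not current else current + " " + sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

sentences = split_sentences("First sentence. Second one! Third?")
chunks = group_into_chunks(sentences, max_chars=30)
```

A rule this simple will mis-split abbreviations such as "e.g." or "Dr.", which is one reason dedicated segmenters exist alongside the regular-expression option.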

Core models & technical components

Off-the-shelf models & technical components:
- Acronym detection
- Duckling (units & measures detection)
- SpacyNER (entity detection)
- Pattern (regex)
- spaCy Rules
- Annotation consolidation
- Pseudonymization
- Restore punctuation and true casing
- Annotation-based segmentation
- Group sentences by chunks
- DeepL (2) (machine translation)
- Transformer models (Q&A, SA, zero-shot classifier…)
- Custom model & component (on demand)

Language Models (embeddings):
- All suitable models from Hugging Face Hub (all-MiniLM-L6-v2, paraphrase-multilingual-MiniLM-L12-v2, mBERT, CamemBERT, XLM-RoBERTa…)
- OpenAI (2) embeddings
- Fine-tuned Language Models (on demand)

Large Language Models (LLMs):
- OpenAI (2): GPT-3.5, GPT-4…
- Microsoft Azure (2): GPT-3.5, GPT-4…
- DeepInfra (2): Llama 2, Mistral-7B, Mixtral-8x7B, Mixtral-8x22B, DBRX, Dolphin-2.6, Zephyr…

Wikidata/Wikipedia:
- entity-fishing (15 languages)
- New language on demand

2) Available through API integration; an API key is required.