Experiment, customize, industrialize and maintain trustworthy GenAI language assistants with Kairntech Software.
See also: Kairntech Consulting
1) Hosted deployment: hosting fees will be added (GPU optional)
2) Excluding possible LLM costs
3) Support Hours: Mon-Fri, 9am-6pm
For non-production instances (development & integration, pre-production…) or other specific requests, please contact us.
Supported languages: English, French, German, Spanish, Italian, Dutch, Portuguese, Russian, Arabic, Chinese, Hindi, Urdu, Japanese, Persian (Farsi), Ukrainian.
AI NLP Key Features
| Supported languages vs NLP tasks | Western and non-western languages |
| Core features (GUI, search, manual annotation…) | All |
| Language identification | All |
| Token classification (date, amount, address, phrase…) | All |
| Named Entity Recognition (person, location, organization, disease…) | All |
| Sentence classification | All |
| Text classification | All |
| Entity Linking leveraging Wikidata/Wikipedia | English, French, German, Spanish, Italian, Portuguese, Swedish, Arabic, Japanese, Russian, Ukrainian, Chinese, Bengali, Hindi, Persian (+ other languages on demand) |
| Entity Linking leveraging Business vocabulary | All |
| Semantic textual similarity | All for western languages, partially for non-western languages |
| Document Question answering – RAG | All, but may depend on the third-party solutions used |
| Agent-driven RAG | All, but may depend on the third-party solutions used |
| Text summarization(2) | All, but may depend on the third-party solutions used |
| Paraphrase generation | All |
| Data augmentation | All |
| Sentiment analysis (polarity, emotion) | All |
| Intent detection & slot filling | coming soon… |
| Relationship extraction | coming soon… |
| Co-reference resolution | All |
| Automatic Speech Recognition – ASR(2) | All, but may depend on the third-party solutions used |
| Machine Translation – MT(2) | All, but may depend on the third-party solutions used |
Core engines
| Text classification engines | ScikitLearn: MultinomialNB, ComplementNB, SVC, LinearSVC, LogisticRegression, MLPClassifier, RandomForestClassifier, DecisionTreeClassifier, GradientBoostingClassifier, XGBClassifier, KerasMLPClassifier…; Spacy with Transformer models; Flair with static embeddings (fasttext…), Flair embeddings…; Transformers: almost all model types & model names from Hugging Face Hub; FastText; BERTopic |
| Text clustering engines | BERTopic |
| Token classification & NER engines | CRF-Suite; Spacy with Transformer models; Flair: optimizers (SGD, Adam…), RNN type (LSTM, GRU) with static embeddings (fasttext…), Flair embeddings…; Transformers: almost all model types & model names from Hugging Face Hub |
| Lexicon-based engines | PhraseMatcher; EntityRuler |
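To illustrate what a lexicon-based engine does, here is a minimal sketch in plain Python. It is not Kairntech's implementation and does not use the PhraseMatcher or EntityRuler components named above; the function names `build_lexicon_pattern` and `annotate`, and the sample lexicon, are hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical helper: compile one case-insensitive pattern for all lexicon terms.
def build_lexicon_pattern(lexicon: Dict[str, str]) -> "re.Pattern[str]":
    # Longest terms first, so "New York City" is preferred over "New York".
    terms = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

# Hypothetical helper: return (start, end, matched text, label) for each hit.
def annotate(text: str, lexicon: Dict[str, str]) -> List[Tuple[int, int, str, str]]:
    pattern = build_lexicon_pattern(lexicon)
    labels = {term.lower(): label for term, label in lexicon.items()}
    return [
        (m.start(), m.end(), m.group(0), labels[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]

# Illustrative business vocabulary (assumed, not taken from the product).
SAMPLE_LEXICON = {"Kairntech": "ORG", "New York": "LOC", "New York City": "LOC"}
annotations = annotate("Kairntech opened an office in new york city.", SAMPLE_LEXICON)
```

Each annotation carries character offsets, so the output can be merged with annotations produced by statistical engines.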
Core components
| Document converters | Tika (PDF, Office, HTML…); LLMs, Whisper (Automatic Speech Recognition – ASR); LLMs (Image to Text); Mistral OCR (scanned PDF to Text); OCRmyPDF (scanned PDF to Text); Grobid (Scholarly documents); Inscriptis (HTML to txt); PubMed XML (Biomedical abstract); NewsML-G2 XML (news); Custom converter (on demand) |
| Document segmenters (chunking) | Microsoft Blingfire; Regular expression segmenter; PySBD segmenter; Spacy Rules segmenter; Segmentation pipelines; Custom segmenter (on demand) |
| Output formatters | JSON; Tabular (CSV, Excel); Custom formatter (on demand) |
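As a rough illustration of how a regular expression segmenter and the JSON and tabular formatters fit together, the sketch below uses only the Python standard library. It is an assumption-laden example, not the product's code: the splitting rule, the function names `segment`, `to_json` and `to_csv`, and the `index`/`text` field names are all hypothetical.

```python
import csv
import io
import json
import re
from typing import List

# Assumed rule: split on whitespace that follows '.', '!' or '?'.
# Real segmenters handle abbreviations, decimals and quotes far more carefully.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def segment(text: str) -> List[str]:
    """Split text into sentence-like chunks, dropping empty pieces."""
    return [piece.strip() for piece in SENTENCE_END.split(text) if piece.strip()]

def to_json(segments: List[str]) -> str:
    """Serialize segments as a JSON array of {index, text} objects."""
    records = [{"index": i, "text": s} for i, s in enumerate(segments)]
    return json.dumps(records, ensure_ascii=False)

def to_csv(segments: List[str]) -> str:
    """Serialize segments as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["index", "text"])
    for i, s in enumerate(segments):
        writer.writerow([i, s])
    return buffer.getvalue()

chunks = segment("Hello world. How are you? Fine!")
json_output = to_json(chunks)
csv_output = to_csv(chunks)
```

Keeping segmentation and formatting as separate functions mirrors the table above, where segmenters and formatters are listed as independent components.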
Core models & technical components
| Off-the-shelf models & technical components | Acronyms detection; Duckling (Units & Measure detection); SpacyNER (Entity detection); Pattern (regex); Spacy Rules; Annotations reconciliation; Pseudonymization; Text generation with LLMs; Data augmentation with LLMs; Wikidata Semantic Fingerprints; DeepL(2) (Machine Translation – MT); …; Custom model & component (on demand) |
| Language Models (embeddings) | All suitable models from Hugging Face Hub (all-MiniLM-L6-v2, bge-m3, mBERT…); OpenAI(2) embeddings; Fine-tuned Language Models (on demand) |
| Large Language Models (LLMs) | OpenAI(2): GPT-4…; Microsoft Azure(2): GPT-4…; DeepInfra(2): Llama-4, Qwen2.5, Nemotron, Phi4, DeepSeek…; On-premise LLMs: Llama-3, Qwen2.5, Nemotron… |
| Wikidata/Wikipedia | entity-fishing (15 languages); New language on demand |
