Supported Operating Systems
The Kairntech platform has been successfully validated on the following operating systems, for both CPU and GPU deployments:
- Ubuntu 20.04.6 LTS (x64), or higher
- RHEL / CentOS 7 (x64), or higher
Note: When deploying in a CPU-only environment, Docker-based virtualization removes most OS-level constraints. However, for GPU-based deployment, compatibility is restricted due to NVIDIA driver requirements. Only the above-listed OS versions are officially supported for GPU setups.
Installation steps
All commands listed below were run on Ubuntu 18.04 LTS x64. The installation consists of the following steps:
- Host configuration prerequisites
- Kairntech platform Docker volumes prerequisites
- Kairntech platform installation
Host configuration prerequisites:
ELASTICSEARCH recommendation
You may need to increase the vm.max_map_count kernel parameter to avoid running out of map areas.
This prevents the following error message:
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
To do so, edit the file /etc/sysctl.conf and add the following lines:
# ES - at least 262144 for production use
vm.max_map_count=262144
Apply the modification using the following command:
sudo sysctl -p
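Editing /etc/sysctl.conf by hand can leave duplicate entries when the procedure is repeated. The following is a minimal sketch of an idempotent update, not part of the Kairntech delivery: the helper name `set_conf_key` is hypothetical, and it is assumed that the edit is made on a copy of the file before the result is installed with sudo.

```shell
# set_conf_key FILE KEY VALUE
# Hypothetical helper: rewrite FILE so that it contains exactly one
# "KEY=VALUE" assignment, keeping every other line (comments included).
set_conf_key() (
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || exit 1
    # Copy every line whose left-hand side is not KEY.
    awk -F= -v k="$key" '
        { name = $1; gsub(/[[:space:]]/, "", name) }
        name != k { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; exit 1; }
    # Append the new assignment at the end of the file.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    exit "$status"
)

# Example (commented out so that sourcing this file changes nothing):
#   cp /etc/sysctl.conf /tmp/sysctl.conf.new
#   set_conf_key /tmp/sysctl.conf.new vm.max_map_count 262144
#   sudo cp /tmp/sysctl.conf.new /etc/sysctl.conf && sudo sysctl -p
```

Running the helper twice with the same arguments leaves a single assignment in the file, which is the point of the sketch.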
INOTIFY recommendation
You may need to increase the fs.inotify.max_user_instances parameter to avoid reaching the per-user limit on the number of inotify instances.
This prevents the following error message:
[Errno 24] inotify instance limit reached
To do so, edit the file /etc/sysctl.conf and add the following lines:
# Prevent [Errno 24] inotify instance limit reached
fs.inotify.max_user_instances = 65530
Apply the modification using the following command:
sudo sysctl -p
HAPROXY recommendation
You may need to set net.ipv4.ip_unprivileged_port_start so that the non-root haproxy user is allowed to bind to the privileged port 443.
This prevents the following messages in the haproxy container console output:
[ALERT] (1) : Starting frontend http-in-sherpa: cannot bind socket (Permission denied) [0.0.0.0:443]
[ALERT] (1) : [haproxy.main()] Some protocols failed to start their listeners! Exiting.
To do so, edit the file /etc/sysctl.conf and add the following lines:
# Enable haproxy to listen to 443
net.ipv4.ip_unprivileged_port_start=0
Apply the modification using the following command:
sudo sysctl -p
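Once the three settings above have been applied, the live kernel values can be compared with the recommended ones. This is a sketch under assumptions: the function names and the `PROC_SYS_ROOT` variable are hypothetical, the values are read from the standard Linux location /proc/sys, and the haproxy check only requires the unprivileged port range to start at or below 443.

```shell
# Root of the kernel parameter tree. Overridable, which makes the
# functions easy to try against a scratch directory (assumption: the
# host exposes the standard /proc/sys tree).
PROC_SYS_ROOT=${PROC_SYS_ROOT:-/proc/sys}

# sysctl_check KEY OP VALUE
# Succeeds when the live value of KEY satisfies the numeric test OP
# (-ge or -le) against VALUE. Exits with 2 if KEY cannot be read.
sysctl_check() (
    key=$1 op=$2 wanted=$3
    # vm.max_map_count -> $PROC_SYS_ROOT/vm/max_map_count
    path="$PROC_SYS_ROOT/$(printf '%s' "$key" | tr . /)"
    [ -r "$path" ] || exit 2
    [ "$(cat "$path")" "$op" "$wanted" ]
)

# check_host_prerequisites
# Checks the three parameters from this section, prints one line per
# failed check, and fails if any check failed.
check_host_prerequisites() (
    status=0
    sysctl_check vm.max_map_count -ge 262144 ||
        { echo "check failed: vm.max_map_count should be >= 262144"; status=1; }
    sysctl_check fs.inotify.max_user_instances -ge 65530 ||
        { echo "check failed: fs.inotify.max_user_instances should be >= 65530"; status=1; }
    sysctl_check net.ipv4.ip_unprivileged_port_start -le 443 ||
        { echo "check failed: net.ipv4.ip_unprivileged_port_start should be <= 443"; status=1; }
    exit "$status"
)

# Example: check_host_prerequisites && echo "host parameters look fine"
```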
User/Folder creation
USER creation
It is highly advised to create a dedicated user for the deployment of the platform:
# FOR A STANDARD USER
sudo adduser kairntech
# OR FOR A HEADLESS USER
sudo adduser --disabled-password --gecos "" kairntech
FOLDER creation
It is highly advised to create a dedicated folder for the deployment of the platform:
sudo mkdir -p /opt/sherpa
sudo chown -R kairntech: /opt/sherpa
Directory /opt/sherpa/ will store all files and folders relative to the platform (delivered by Kairntech):
- File docker-compose.yml will be used to deploy/pull Docker images of the platform
- Folder init will be used to deploy the content of the prerequisite Docker volumes
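Before going further, it can be useful to confirm that the delivered items are in place. The check below is a minimal sketch; the function name `check_layout` is hypothetical and only the two items named above are tested.

```shell
# check_layout [DIR]
# Succeeds when DIR (default: /opt/sherpa) contains the file
# docker-compose.yml and the folder init. Missing items are reported
# on standard error.
check_layout() (
    dir=${1:-/opt/sherpa}
    status=0
    [ -f "$dir/docker-compose.yml" ] ||
        { echo "missing file: $dir/docker-compose.yml" >&2; status=1; }
    [ -d "$dir/init" ] ||
        { echo "missing folder: $dir/init" >&2; status=1; }
    exit "$status"
)

# Example: check_layout /opt/sherpa && echo "layout OK"
```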
Binaries installation
Docker / Docker Compose installation
The platform is deployed with Docker, so Docker and the Docker Compose plugin must be installed.
The installation commands below are taken from the official Docker installation page.
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Then add the kairntech user to the docker group:
sudo usermod -aG docker kairntech
As mentioned in the installation guide, log out and log back in so that your group membership is re-evaluated.
To test the installation, open a new terminal session and run:
sudo su - kairntech
docker run hello-world
After installing the compose plugin, you can test via:
sudo su - kairntech
docker compose version
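If `docker run hello-world` fails with a permission error, the usual cause is that the docker group membership has not been picked up yet. The sketch below tests a group list for a given group; the function name `in_group_list` is hypothetical, and the list is expected to come from `id -nG`.

```shell
# in_group_list "GROUP LIST" GROUP
# Succeeds when GROUP appears as a whole word in the space-separated
# GROUP LIST (for example the output of `id -nG kairntech`).
in_group_list() {
    # $1 is deliberately unquoted so that each group lands on its own line.
    printf '%s\n' $1 | grep -qxF -- "$2"
}

# Example:
#   in_group_list "$(id -nG kairntech)" docker || echo "kairntech is not in the docker group"
```

Matching whole lines (`-x`) avoids false positives on groups whose name merely contains "docker".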
Docker volumes to mount
The docker binary is used to populate Docker volumes with embeddings files. An .env file indicates whether specific languages, or all languages, are installed.
FLAIR embeddings
The Flair engine requires "embeddings" files: these static files are stored as Docker volumes. To get these files, run:
sudo su - kairntech
cd /opt/sherpa/init/flair-init-job
# INSTALL AR, DE, EN AND FR
cat flair.env
FLAIR_LANGUAGES=ar,de,en,fr
FLAIR_RESOURCES_VERSION=default
docker compose -f docker-compose.yml up
# OR INSTALL ALL LANGUAGES
cat flair.env
FLAIR_LANGUAGES=all
FLAIR_RESOURCES_VERSION=default
docker compose -f docker-compose.yml up
The variable FLAIR_LANGUAGES can be modified to indicate the language selection.
The Docker container can be removed, once Flair embeddings are deployed, via:
docker rm flair-resources-init-job
The table below gives disk usage corresponding to available languages:
| Language | Size |
|---|---|
| Arabic (AR) | 2.9G |
| German (DE) | 4.3G |
| English (EN) | 3.8G |
| Spanish (ES) | 4.2G |
| Farsi (FA) | 768M |
| French (FR) | 4.2G |
| Hindi (HI) | 1.0G |
| Italian (IT) | 3.9G |
| Dutch (NL) | 3.9G |
| Portuguese (PT) | 2.7G |
| Russian (RU) | 4.1G |
| Chinese (ZH) | 1.6G |
| All | 35G |
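A mistyped language code in flair.env is easy to miss. The sketch below lists any requested code that does not appear in the table above. The function and variable names are hypothetical, and it is assumed that codes are written in lowercase, as in the flair.env example.

```shell
# Language codes taken from the Flair table above (lowercase assumed).
FLAIR_SUPPORTED="ar de en es fa fr hi it nl pt ru zh"

# unsupported_languages "CSV" "SUPPORTED LIST"
# Prints, one per line, each code of the comma-separated CSV that is
# not in the space-separated SUPPORTED LIST. The value "all" is always
# accepted and prints nothing.
unsupported_languages() (
    requested=$1 supported=$2
    if [ "$requested" = "all" ]; then
        exit 0
    fi
    printf '%s\n' "$requested" | tr ',' '\n' | while read -r code; do
        case " $supported " in
            *" $code "*) ;;                  # known code: nothing to report
            *) printf '%s\n' "$code" ;;      # unknown code: report it
        esac
    done
)

# Example:
#   unsupported_languages "ar,de,en,fr" "$FLAIR_SUPPORTED"   # prints nothing
```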
FASTTEXT embeddings
The fastText engine requires "embeddings" files: these static files are stored as Docker volumes. To get these files, run:
sudo su - kairntech
cd /opt/sherpa/init/fasttext-init-job
# INSTALL AR, DE, EN AND FR
cat fasttext.env
FASTTEXT_LANGUAGES=ar,de,en,fr
FASTTEXT_RESOURCES_VERSION=default
docker compose -f docker-compose.yml up
# OR INSTALL ALL LANGUAGES
cat fasttext.env
FASTTEXT_LANGUAGES=all
FASTTEXT_RESOURCES_VERSION=default
docker compose -f docker-compose.yml up
The variable FASTTEXT_LANGUAGES can be modified to indicate the language selection.
The Docker container can be removed, once fastText embeddings are deployed, via:
docker rm flair-resources-init-job
The table below gives disk usage corresponding to available languages:
| Language | Size |
|---|---|
| Arabic (AR) | 1.5G |
| German (DE) | 5.6G |
| English (EN) | 6.2G |
| Spanish (ES) | 2.5G |
| French (FR) | 2.9G |
| Italian (IT) | 2.2G |
| Japanese (JA) | 1.3G |
| Portuguese (PT) | 1.5G |
| Russian (RU) | 4.7G |
| Chinese (ZH) | 822M |
| All | 29G |
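Given the sizes above, it is worth checking free disk space before starting an init job. The sketch below relies on `df -Pk`. The function names are hypothetical, and the example path /var/lib/docker assumes that Docker uses its default data directory on the host.

```shell
# free_gib PATH
# Prints the free space, in whole GiB, of the filesystem holding PATH.
free_gib() {
    # -P: portable one-line-per-filesystem output, -k: 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_free_gib PATH REQUIRED_GIB
# Succeeds when at least REQUIRED_GIB GiB are free on PATH's filesystem.
has_free_gib() {
    [ "$(free_gib "$1")" -ge "$2" ]
}

# Example (29G is the "All" figure from the fastText table):
#   has_free_gib /var/lib/docker 29 || echo "not enough space for all fastText languages"
```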
ENTITY-FISHING knowledge
The entity-fishing component requires "knowledge" files: these static files are generated every month and stored as Docker volumes. To get these files, run:
sudo su - kairntech
cd /opt/sherpa/init/entity-fishing-init-job
# INSTALL AR, DE, EN AND FR
cat entity-fishing.env
ENTITY_FISHING_LANGUAGES=ar,de,en,fr
ENTITY_FISHING_RESOURCES_VERSION=2025-07-10
docker compose -f docker-compose.yml up
# OR INSTALL ALL LANGUAGES
cat entity-fishing.env
ENTITY_FISHING_LANGUAGES=all
ENTITY_FISHING_RESOURCES_VERSION=2025-07-10
docker compose -f docker-compose.yml up
The variable ENTITY_FISHING_RESOURCES_VERSION can be modified to match the most recent knowledge.
The Docker container can be removed, once entity-fishing knowledge is deployed, via:
docker rm entity-fishing-init-job
The table below gives disk usage corresponding to available languages:
| Language | Size |
|---|---|
| Arabic (AR) | 36.7G (3.7G + 33G) |
| German (DE) | 40.0G (6.0G + 33G) |
| English (EN) | 49G (16G + 33G) |
| Spanish (ES) | 37.4G (4.4G + 33G) |
| Farsi (FA) | 36.5G (3.5G + 33G) |
| French (FR) | 38.6G (5.6G + 33G) |
| Italian (IT) | 36.9G (3.9G + 33G) |
| Japanese (JA) | 36.6G (3.6G + 33G) |
| Portuguese (PT) | 35.8G (2.8G + 33G) |
| Russian (RU) | 39.4G (6.4G + 33G) |
| Chinese (ZH) | 36.1G (3.1G + 33G) |
| Ukrainian (UA) | 36.6G (3.6G + 33G) |
| Hindi (HI) | 33.5G (455M + 33G) |
| Swedish (SE) | 37.2G (4.2G + 33G) |
| Bengali (BD) | 33.7G (700M + 33G) |
| All | 100G (67G + 33G) |
In these figures, the common knowledge accounts for 33G of disk usage and is mandatory, whichever languages are selected.
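Because the 33G of common knowledge is counted once, the total for a language selection is 33G plus the per-language figures, not the sum of the table rows. The sketch below does that arithmetic with figures copied from the table. The function name is hypothetical, lowercase codes are assumed, and 455M and 700M are approximated as 0.455G and 0.7G.

```shell
# ef_disk_estimate "ar,de,en,fr"
# Prints the estimated disk usage in G for a comma-separated language
# selection: 33G of common knowledge plus each language-specific part.
# Fails if a code is not in the table. "all" prints the table's total.
ef_disk_estimate() {
    if [ "$1" = "all" ]; then
        echo "100.0"
        return 0
    fi
    printf '%s\n' "$1" | tr ',' '\n' | awk '
        BEGIN {
            size["ar"] = 3.7; size["de"] = 6.0; size["en"] = 16
            size["es"] = 4.4; size["fa"] = 3.5; size["fr"] = 5.6
            size["it"] = 3.9; size["ja"] = 3.6; size["pt"] = 2.8
            size["ru"] = 6.4; size["zh"] = 3.1; size["ua"] = 3.6
            size["hi"] = 0.455; size["se"] = 4.2; size["bd"] = 0.7
            total = 33      # common knowledge, always required
        }
        ($1 in size) { total += size[$1]; next }
        NF { printf "unknown language: %s\n", $1 > "/dev/stderr"; bad = 1 }
        END { printf "%.1f\n", total; exit (bad ? 1 : 0) }
    '
}
```

For the selection used in the example above, `ef_disk_estimate "ar,de,en,fr"` gives 33 + 3.7 + 6.0 + 16 + 5.6 = 64.3.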
VECTORIZERS
To make full use of the vectorizers, language model files must be downloaded: these static files are stored as Docker volumes. To get these files, run:
sudo su - kairntech
cd /opt/sherpa/init/vectorizers-init-job/
# INSTALL allMiniLML6V2
cat vectorizer.env
VECTORIZER_RESOURCES_VERSION=sentence-transformers/all-MiniLM-L6-v2
docker compose -f docker-compose.allminilml6v2.yml up
docker rm vectorizer-resources-init-job
# INSTALL multiMiniLML12V2
cat vectorizer.env
VECTORIZER_RESOURCES_VERSION=sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
docker compose -f docker-compose.multiminilml12v2.yml up
docker rm vectorizer-resources-init-job
# INSTALL spDilaCamembert
cat vectorizer.env
VECTORIZER_RESOURCES_VERSION=oterrier/sp-dila-camembert-base-gpl
docker compose -f docker-compose.spdilacamembertgpl.yml up
docker rm vectorizer-resources-init-job
# INSTALL sentenceCamembertBase
cat vectorizer.env
VECTORIZER_RESOURCES_VERSION=dangvantuan/sentence-camembert-base
docker compose -f docker-compose.sentencecamembertbase.yml up
docker rm vectorizer-resources-init-job
# INSTALL bge-m3 (GPU only)
cat vectorizer.env
VECTORIZER_RESOURCES_VERSION=BAAI/bge-m3
docker compose -f docker-compose.bge-m3.yml up
docker rm vectorizer-resources-init-job
The variable VECTORIZER_RESOURCES_VERSION can be modified to indicate the model selection.
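Each model listed above is paired with its own compose file. The sketch below records that pairing so that a script can pick the right file from the model name. The function name is hypothetical, and the mapping only restates the commands shown above.

```shell
# vectorizer_compose_file MODEL
# Prints the compose file that the commands above use for MODEL, or
# fails with a message on standard error if MODEL is not listed.
vectorizer_compose_file() {
    case "$1" in
        sentence-transformers/all-MiniLM-L6-v2)
            echo docker-compose.allminilml6v2.yml ;;
        sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
            echo docker-compose.multiminilml12v2.yml ;;
        oterrier/sp-dila-camembert-base-gpl)
            echo docker-compose.spdilacamembertgpl.yml ;;
        dangvantuan/sentence-camembert-base)
            echo docker-compose.sentencecamembertbase.yml ;;
        BAAI/bge-m3)            # GPU only, as noted above
            echo docker-compose.bge-m3.yml ;;
        *)
            echo "unknown model: $1" >&2
            return 1 ;;
    esac
}

# Example (run from /opt/sherpa/init/vectorizers-init-job/):
#   file=$(vectorizer_compose_file BAAI/bge-m3) &&
#       docker compose -f "$file" up &&
#       docker rm vectorizer-resources-init-job
```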
Kairntech platform installation
To download the images needed to install the platform, you must first log in to Docker Hub.
(The password will be provided by Kairntech.)
sudo su - kairntech
cd /opt/sherpa
docker login
username: ktguestkt
password:
Once logged in, you can start downloading the images:
docker compose -f docker-compose.yml pull
If you have deployed with authentication (MongoDB or Elasticsearch), for compatibility purposes you will need to symlink the .env file:
sudo su - kairntech
cd /opt/sherpa
ln -s mongodb-credentials.env .env
cat elasticsearch-credentials.env >> .env
Finally, to start the platform, run:
docker compose -f docker-compose.yml up -d
Once the platform is started, you can check the status of the containers. The following console output is given as an example; some containers may not be present, depending on the kind of deployment you performed.
docker ps -a --format "{{.ID}}\t\t{{.Names}}\t\t{{.Status}}"
79e235f82787 sherpa-core Up 20 sec
e69f95855809 sherpa-crfsuite-suggester Up 20 sec
c9d95639c808 sherpa-entityfishing-suggester Up 20 sec
94e4574b95de sherpa-fasttext-suggester Up 20 sec
8f13e72aeb0d sherpa-phrasematcher-test-suggester Up 20 sec
0f49dec91340 sherpa-phrasematcher-train-suggester Up 20 sec
aa08f1008770 sherpa-sklearn-test-suggester Up 20 sec
988976ef327d sherpa-sklearn-train-suggester Up 20 sec
bed6169d9185 sherpa-spacy-test-suggester Up 20 sec
302bd98a44ab sherpa-spacy-train-suggester Up 20 sec
7754162ae44c sherpa-flair-test-suggester Up 20 sec
08d1ad415adb sherpa-flair-train-suggester Up 20 sec
4835129a77c9 sherpa-bertopic-test-suggester Up 20 sec
b999a848044c sherpa-bertopic-train-suggester Up 20 sec
0826e0dd9c85 sherpa-elasticsearch Up 20 sec
7f781bf11ddf sherpa-mongodb Up 20 sec
d3b0e0557309 sherpa-builtins-importer Up 20 sec
cf075d3b06f4 sherpa-multirole Up 20 sec
ae1b24e0ccdb sherpa-pymultirole Up 20 sec
2a737b399388 sherpa-pymultirole-trf Up 20 sec
f43121e96544 sherpa-pymultirole-ner Up 20 sec
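With this many containers, scanning the list by eye is error-prone. The sketch below filters the output of the `docker ps` command shown above and keeps only the containers whose status does not start with "Up". The function name `not_running` is hypothetical, and the input is assumed to use tab-separated columns, as produced by the `--format` string above.

```shell
# not_running
# Reads "ID<TAB>NAME<TAB>STATUS" lines on standard input (one or more
# tabs between columns) and prints the NAME of every container whose
# STATUS does not start with "Up".
not_running() {
    awk -F'\t+' '$3 !~ /^Up/ { print $2 }'
}

# Example:
#   docker ps -a --format "{{.ID}}\t\t{{.Names}}\t\t{{.Status}}" | not_running
```

An empty result means that every listed container reports an "Up" status.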