Natural language questions and answers on local and confidential information


Large Language Models (LLMs) such as GPT have sparked significant progress in the way users can interact with text: they deliver impressive results on tasks such as text generation, text summarization and many other scenarios. The predominant mode of interaction with LLMs is using a website such as OpenAI’s ChatGPT directly or embedding GPT into third-party applications via the API. This allows users to benefit from LLM results without having to worry about the considerable computing requirements of running an LLM installation themselves, let alone training one.

The downside, however, is that you cannot expect the LLM to deliver results specifically on your own content. While LLMs have been exposed to massive amounts of publicly available content during training – billions of webpages and documents – they obviously (and fortunately) do not have access to your own, private and often confidential content: your documents, your emails, your reports.

In the remainder of this document we describe an approach implemented in the Kairntech software that gives users the best of both worlds: the full power of LLMs for relevant document analysis use cases, applied to your own private documents without sharing them in their entirety with third-party vendors.

The approach we describe here is commonly referred to as retrieval augmented generation (RAG) and was the focus of a recent Kairntech webinar. The slides presented during this webinar can be found here.

The approach: Retrieval augmented generation or “RAG”

Kairntech has been a platform for the integration of different AI/NLP approaches since its beginning. Today it offers users access to libraries such as spaCy and Flair for natural language processing and model training, to DeepL for machine translation, to entity-fishing for Wikidata-based content annotation, and to many more dedicated packages.

Not surprisingly, access to LLMs has also been quickly integrated into the software: the question answering scenario outlined below can be seen as a straightforward example of this approach.

A user’s documents are imported, segmented, mapped to an appropriate vector-based semantic representation and then stored locally in a dedicated vector database. A query against these local, private documents is then processed by first applying the same semantic mapping to the query, producing an embedding vector, and then selecting the n most relevant (i.e. most similar) segments from the database. These segments are concatenated into a virtual document and submitted to the LLM in order to generate the answer to the original question.
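The steps above can be sketched in a few lines of Python. This is a deliberately minimal illustration, not the Kairntech implementation: it uses a toy bag-of-words "embedding" where a real system would use a neural sentence-embedding model and a proper vector database, and the example segments and query are invented for the sketch.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real RAG system would use a neural
    # sentence-embedding model here. Purely illustrative.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1) Import and segment documents, then index each segment
#    (the "vector database").
segments = [
    "Silicon is discussed as an alternative anode material in several patents.",
    "The separator membrane keeps the electrodes apart.",
    "Lithium titanate anodes allow faster charging of lithium ion cells.",
]
index = [(seg, embed(seg)) for seg in segments]

# 2) Apply the same semantic mapping to the query and select
#    the n most similar segments.
query = "What alternative anode materials for lithium ion batteries are discussed?"
query_vec = embed(query)
n = 2
top = sorted(index, key=lambda pair: cosine(query_vec, pair[1]), reverse=True)[:n]

# 3) Concatenate the hits into a "virtual document" and build the LLM prompt.
context = "\n".join(seg for seg, _ in top)
prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"
```

Note that the segment about the separator membrane is not retrieved: it shares no relevant vocabulary with the query, which is exactly the filtering effect the retrieval step is meant to provide.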

Talk to your documents

The result is a significant extension of the way users can interact with their own documents. The setup above allows users to ask high-level questions whose answers are hidden somewhere inside potentially large document collections and would take considerable effort to compile with purely manual, traditional text search.

As a first scenario, consider a patent expert trying to get an overview of technological trends by studying large numbers of patent documents on a given topic. In the example below we assume that documents on a given topic (here: recent patents on rechargeable batteries) have been imported into the system and that the expert wants to answer a detailed technical question:

What alternative materials for anode material in lithium ion batteries are discussed in the imported patents?

Note that patent documents are not only typically complex and long; more importantly, the question above simply cannot be answered by a keyword-based query.

The setup offered by Kairntech, as described above, allows the user to ask this question (and many others) directly and receive an instant answer:

It is worth studying this response in more detail: at the top of the page the user types the question in just the same way as they would instruct a co-worker to prepare a dedicated report on the issue.

The resulting list below then consists, first, of a condensed answer (in light green) that summarizes the matter in a few sentences, and then of a list of text segments from the original patent documents that have something to say on the issue. Note furthermore that the condensed answer at the top references the individual segments (with links represented by the blue numbers in brackets), allowing the user to validate a specific piece of the answer in detail if required. Each individual segment in turn “knows” which document it originally came from, allowing the user to jump into the full text if required.
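One common way to obtain such traceable, bracketed citations – shown here as a minimal sketch, not Kairntech’s actual implementation, and with hypothetical file names – is to number the retrieved segments in the prompt, instruct the model to cite those numbers, and keep a mapping from segment numbers back to their source documents:

```python
# Hypothetical retrieved segments; each remembers its source document.
retrieved = [
    ("patent_081.pdf", "Silicon anodes are proposed as a graphite replacement."),
    ("patent_112.pdf", "Lithium titanate is evaluated as an anode material."),
]

# Number each segment and ask the model to cite those numbers, so every
# claim in the condensed answer can be traced back to a segment.
numbered = "\n".join(f"[{i}] {text}" for i, (_, text) in enumerate(retrieved, 1))
prompt = (
    "Answer the question using only the numbered segments below and cite "
    "each claim with its segment number in brackets.\n\n"
    f"{numbered}\n\nQuestion: What alternative anode materials are discussed?"
)

# A citation [i] in the generated answer can then be resolved to the
# original document for validation.
sources = {i: doc for i, (doc, _) in enumerate(retrieved, 1)}
```

The segment numbering is what makes the blue bracketed links possible: the generated answer only ever refers to segments it was actually shown, and each of those resolves to a full source document.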

Note again that this goes beyond what a user of, say, ChatGPT – to mention just the most popular LLM application today – would receive. First, it cannot be guaranteed that the documents relevant for answering the question were part of the GPT training set. Moreover, a GPT answer typically provides no indication of the sources from which it was created.

More such use cases

The approach outlined above is of course not limited to patent analysis. Asking a collection of lease contracts (also often complex and long texts) which legal options to terminate the contract are listed in the texts is another example.

It is evident that a whole range of relevant scenarios becomes imaginable once this functionality is available: journalists can ask high-level questions of a large document collection around a current case (“Did politician X ever explicitly deny that company Y should get the contract?”). Or users can get support in unraveling details from a long thread of email exchanges on a given topic (“What was the reason again why in project X we considered Y a good component rather than Z?”).


LLMs are an exciting new approach to AI/NLP scenarios. They deliver relevant and useful results on a wide range of use cases. Kairntech has taken determined steps to embrace this new technology and make it part of what users of the Kairntech software can access. It is evident that LLMs will evolve quickly in the coming months, not only in their basic language understanding capabilities but also in the way they can be combined and integrated into larger use cases.

While the key motivation for the user is that many scenarios become significantly easier through appropriate interaction with LLMs, behind the scenes there is a lot of serious engineering going on:

  • What use cases require LLMs?
  • What use cases are equally (or better) addressed with other NLP/AI approaches?
  • How to choose the best hyperparameter settings for a given scenario?
  • How to properly and seamlessly embed an LLM into a larger application?
  • What are the benefits and shortcomings of individual LLMs?

As both a software provider and a service and consulting company, Kairntech is prepared to accompany its clients on these questions and on the broader perspectives that LLM usage opens up in the business world.