Transformers: the new trend in Artificial Intelligence and NLP
Recent progress in Natural Language Processing (NLP) highlights its strong potential for business transformation. While the domain used to be dominated by big players such as Google or Microsoft, smaller organizations such as Explosion.ai (spaCy) or Hugging Face are now going mainstream with collaborative, open-source approaches.
The driving force behind Hugging Face is the rise of transformer architectures, nicely summarized in this article. Rather than starting from scratch every time, developers can build on top of already pre-trained, general-purpose models with hundreds of millions of parameters.
Existing implementations of transformer models used to be a deployment nightmare: complex library dependencies, constrained environment requirements, different model formats, ad-hoc tokenizers, different special symbols and different expected input vectors. With such a wide variety of framework and transformer versions, a system often requires a lot of specific development to adapt the input/output pipelines.
Hugging Face Transformers provides a very handy compatibility layer for all flavors of transformer models, turning the most common NLP tasks into a hub of models with a very large development community. AutoConfig, AutoModel and AutoTokenizer can automatically detect the implementation and resource requirements from the model name alone. Although some case-by-case tuning might still be necessary, this is particularly useful with TensorFlow and Keras, where fewer “native” transformer models are available (although it adds 1.3 GB of PyTorch library dependencies…).
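A minimal sketch of how the Auto* classes work: the model name alone is enough to resolve the right architecture, vocabulary and special symbols. The checkpoint "bert-base-uncased" is just a small public model chosen for illustration; AutoModel.from_pretrained would additionally download the weights, which is omitted here.

```python
from transformers import AutoConfig, AutoTokenizer

# Any model name from the Hugging Face hub works the same way;
# "bert-base-uncased" is used here only as a small public example.
model_name = "bert-base-uncased"

# The Auto* classes detect the implementation from the model name.
config = AutoConfig.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

encoded = tokenizer("Transformers simplify NLP deployment.")
print(config.model_type)      # → "bert" (the detected architecture)
print(encoded["input_ids"])   # ids already include [CLS] and [SEP]
```

The same three lines load a RoBERTa, DistilBERT or CamemBERT checkpoint unchanged, which is exactly the compatibility layer described above.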
The community activity, with thousands of off-the-shelf NLP models, including domain-specific pre-trained models for tasks such as automatic text summarization, is clear proof of Hugging Face’s success. As Clem Delangue, the CEO of Hugging Face, puts it, this is equivalent to a multi-billion-dollar R&D effort.
However, an off-the-shelf Hugging Face model is not exactly the same thing as a running, smart, NLP-powered application.
Improve off-the-shelf NLP models with domain expertise
First of all, off-the-shelf NLP models are always a particular compromise between performance and accuracy, trading off the quality of the results, the size of the model, its runtime, the quality of the training dataset and the time needed to prototype.
Making the best possible model for your own unique business requirements implies improving a general-purpose model. Although a lot of time can be saved by kickstarting projects with off-the-shelf models, you still need to spend time manually annotating false positives and false negatives. No miracles here: to get a great model you need to roll up your sleeves and know what makes your business and know-how unique!
Curated annotation requires an easy-to-use interface to browse and segment documents, create labels, (un)validate text elements and experiment with various machine and deep learning models. This text-annotation process can be improved by innovations in:
- user experience (such as a built-in curation interface with links to the original document),
- active learning (proposing on-the-fly suggestions),
- transfer learning (leveraging better-resourced languages such as English).
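The active-learning idea above can be sketched in a few lines: rank unlabeled texts by model uncertainty so that annotators see the most informative examples first. The function and the probabilities below are illustrative; in a real system the scores would come from the model being trained.

```python
# Hedged sketch of uncertainty-based active learning: the examples
# the model is least sure about are suggested for annotation first.
def least_confident(predictions):
    """predictions: list of (text, class_probabilities) pairs.
    Returns texts sorted with the least confident first."""
    def confidence(probs):
        return max(probs)  # probability of the top class
    return [text for text, probs in
            sorted(predictions, key=lambda pair: confidence(pair[1]))]

batch = [
    ("invoice due 2024-05-01", [0.98, 0.02]),  # model is sure
    ("re: meeting notes",      [0.55, 0.45]),  # model is unsure
    ("quarterly report",       [0.70, 0.30]),
]
queue = least_confident(batch)
print(queue[0])  # "re: meeting notes" is proposed to the annotator first
```

Each annotated example then goes back into training, so the on-the-fly suggestions improve as the curation session progresses.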
Build end-to-end NLP pipelines
Building a dedicated NLP application also means combining different frameworks, components and models (e.g. Facebook’s Duckling, spaCy, Trankit…) with internally developed models into an NLP pipeline. Ideally this should be done with pre-packaged components in an easy-to-use interface accessible to business analysts.
On top of the AI models, an NLP pipeline should handle a variety of data flows:
- Converters and processing frameworks to ingest and transform sources (for instance via Apache Tika, Kafka, Flink…)
- Processors to perform specific tasks such as date normalization, rule-based interactions, the consolidation of different annotation processes, linking to a Knowledge Graph…
- Gazetteers or taxonomies to leverage or enrich business vocabularies
- Output converters to export results to a reading grid or to visualization tools, in formats such as Excel or XML
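The data flows above can be sketched as a chain of small processors: a converter that ingests a raw source, a processor that normalizes dates, a gazetteer that tags business vocabulary, and an output converter that flattens everything for export. All names and the regex-based date normalizer are illustrative, not a real Kairntech or Tika API.

```python
import re

def ingest(raw: bytes) -> dict:
    """Converter: turn a raw source into a uniform document record."""
    return {"text": raw.decode("utf-8"), "annotations": []}

def normalize_dates(doc: dict) -> dict:
    """Processor: annotate dd/mm/yyyy dates with their ISO form."""
    for m in re.finditer(r"\b(\d{2})/(\d{2})/(\d{4})\b", doc["text"]):
        day, month, year = m.groups()
        doc["annotations"].append(
            {"type": "DATE", "span": m.span(),
             "value": f"{year}-{month}-{day}"})
    return doc

def apply_gazetteer(doc: dict, gazetteer: dict) -> dict:
    """Processor: tag terms from a business gazetteer."""
    for term, label in gazetteer.items():
        if term in doc["text"]:
            doc["annotations"].append({"type": label, "value": term})
    return doc

def to_rows(doc: dict) -> list:
    """Output converter: flatten annotations for e.g. an Excel export."""
    return [(a["type"], a["value"]) for a in doc["annotations"]]

doc = ingest(b"Meeting with Acme on 03/05/2024 in Grenoble.")
doc = normalize_dates(doc)
doc = apply_gazetteer(doc, {"Acme": "CUSTOMER"})
rows = to_rows(doc)
print(rows)  # [('DATE', '2024-05-03'), ('CUSTOMER', 'Acme')]
```

Each stage reads and enriches the same document record, which is what lets pre-packaged components be rearranged freely in such a pipeline.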
Last but not least: such a pipeline should be integrated within a business environment through a REST API, hiding the complexity of all the different engines and technical components used.
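A hedged sketch of that integration point, using only the Python standard library so it stays self-contained (a real deployment would typically use a framework such as FastAPI or Flask): the caller sees a single JSON endpoint, and the pipeline function below is a stand-in for the engine stack hidden behind it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def nlp_pipeline(text: str) -> dict:
    # Placeholder for the chained engines, models and converters.
    return {"text": text, "tokens": text.split()}

class PipelineHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(nlp_pipeline(payload["text"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the example quiet
        pass

# Start the server on an ephemeral port and call it once.
server = HTTPServer(("localhost", 0), PipelineHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
request = urllib.request.Request(
    f"http://localhost:{port}/process",
    data=json.dumps({"text": "Analyze this text"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    result = json.loads(response.read())
print(result["tokens"])  # ['Analyze', 'this', 'text']
server.shutdown()
```

Whatever runs behind do_POST, such as swapping one NER engine for another, the business environment keeps calling the same endpoint.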
And surprise, surprise: that’s exactly what Kairntech offers, a low-code platform to quickly implement custom-made AI models and NLP pipelines. Ask for a free trial at info@kairntech.com.