How to build NLP pipelines?

An NLP pipeline combines a sequence of components that may include:

  • a converter to transform source formats,
  • custom-made or off-the-shelf models, annotators and processors to manipulate text or annotations,
  • a formatter to provide the desired output format.
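To make the three roles concrete, here is a minimal conceptual sketch in Python. It is not the Kairntech API: `Document`, `Pipeline`, `toy_city_annotator` and `csv_formatter` are hypothetical names invented for illustration, and the plain JSON dump only stands in for a default output; it is not the actual Kairntech JSON schema.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Document:
    """Hypothetical container: raw text plus the annotations found so far."""
    text: str
    annotations: List[dict] = field(default_factory=list)


@dataclass
class Pipeline:
    """Hypothetical pipeline: optional converter, ordered processors, optional formatter."""
    processors: List[Callable[[Document], Document]]
    converter: Optional[Callable[[bytes], Document]] = None
    formatter: Optional[Callable[[Document], str]] = None

    def run(self, source):
        # Raw text needs no converter; other source formats would go through one.
        doc = self.converter(source) if self.converter else Document(text=source)
        # Each model, annotator or processor receives the output of the previous one.
        for process in self.processors:
            doc = process(doc)
        # Without a formatter, fall back to a simple JSON dump (an assumption).
        if self.formatter:
            return self.formatter(doc)
        return json.dumps({"text": doc.text, "annotations": doc.annotations})


def toy_city_annotator(doc):
    """Toy stand-in for a model: marks the first occurrence of 'Paris'."""
    start = doc.text.find("Paris")
    if start != -1:
        doc.annotations.append({"start": start, "end": start + 5, "label": "location"})
    return doc


def csv_formatter(doc):
    """Toy formatter: one CSV line per annotation."""
    lines = ["start,end,label"]
    lines += [f'{a["start"]},{a["end"]},{a["label"]}' for a in doc.annotations]
    return "\n".join(lines)


pipeline = Pipeline(processors=[toy_city_annotator], formatter=csv_formatter)
result = pipeline.run("I live in Paris.")
```

The same `Pipeline` built without a formatter returns the JSON fallback instead of CSV, which mirrors the idea that the formatter is only needed when a different output format is wanted.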

To create a pipeline:

  • Go to the Pipelines view
  • Create a new pipeline

  • Give a name to the pipeline.
  • Select a converter if necessary (you don’t need one if you are working with raw text).

  • Add a first item to your pipeline by clicking on the +.
  • When adding a component to a pipeline, you can select either:
    • a Model or Pipeline that is part of an existing project, or
    • an off-the-shelf annotator, independent of any existing project.
  • Select a project you have access to and then select a model
  • Add further models as needed.
  • Add an off-the-shelf component.
  • You will have access to a large library of off-the-shelf models and technical components.
  • Select, for instance, the Consolidation component. This component deduplicates annotations from two similar models that might extract the same type of information.
  • You can configure the component with the parameters in the right panel.
  • You can add conditions under which a particular model, annotator or component is launched
  • Finally, you can select a formatter if you need an output format other than the Kairntech JSON format.
  • Save the pipeline with the tick button at the top right of the window.
  • You can then test your pipeline just as you would test a single model.
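The ideas behind the Consolidation step and conditional execution can be sketched as follows. This is only an illustration under stated assumptions: `consolidate` and `run_if` are hypothetical helpers, not platform functions, and treating two annotations as duplicates when they share the same span and label is an assumed rule. The actual component's matching logic and parameters are those shown in its configuration panel.

```python
from typing import Callable, Dict, List

Annotation = Dict[str, object]


def consolidate(annotations: List[Annotation]) -> List[Annotation]:
    """Keep the first annotation seen for each (start, end, label) triple.

    Assumed duplicate rule: same span and same label.
    """
    seen = set()
    result = []
    for ann in annotations:
        key = (ann["start"], ann["end"], ann["label"])
        if key not in seen:
            seen.add(key)
            result.append(ann)
    return result


def run_if(condition: Callable, step: Callable) -> Callable:
    """Wrap a step so that it only runs when the condition holds."""
    def wrapped(text, annotations):
        if condition(text, annotations):
            return step(text, annotations)
        # Condition not met: pass the annotations through unchanged.
        return annotations
    return wrapped


# Two hypothetical models that both found the same person in "Alice lives in Paris".
model_a = [{"start": 0, "end": 5, "label": "person", "source": "model_a"}]
model_b = [
    {"start": 0, "end": 5, "label": "person", "source": "model_b"},
    {"start": 15, "end": 20, "label": "location", "source": "model_b"},
]

merged = consolidate(model_a + model_b)

# Only run the consolidation when there is more than one annotation to compare.
dedup_when_needed = run_if(
    lambda text, anns: len(anns) > 1,
    lambda text, anns: consolidate(anns),
)
```

In this sketch the annotation from the model listed first wins, so the order of the models in the pipeline decides which duplicate is kept. That is a design choice of the example, not a documented behaviour of the platform.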