How to build an annotation pipeline?

An annotation pipeline is a sequence of components that may contain:

  • a converter to transform source formats,
  • custom-made or off-the-shelf models, annotators and processors to manipulate text or annotations,
  • a formatter to provide the desired output format.
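
The sequence described above can be pictured as a small data structure. The following is only a conceptual sketch in Python, not the Kairntech API: the `Pipeline` class, its field names, and the toy `tag_capitalized` annotator are all hypothetical, and documents are assumed to be plain dictionaries with `text` and `annotations` keys.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical document shape, for illustration only:
# {"text": "...", "annotations": [{"start": int, "end": int, "label": str, "text": str}]}
Document = dict


@dataclass
class Pipeline:
    """Conceptual model of a pipeline: optional converter, components, optional formatter."""
    name: str
    components: list[Callable[[Document], Document]]
    converter: Optional[Callable[[object], Document]] = None
    formatter: Optional[Callable[[Document], object]] = None

    def __post_init__(self):
        # The processing stage is the only mandatory part of the sequence.
        if not self.components:
            raise ValueError("a pipeline needs at least one processing component")

    def run(self, source):
        # Without a converter, the source is treated as raw text.
        if self.converter:
            doc = self.converter(source)
        else:
            doc = {"text": source, "annotations": []}
        # Components run in order, each receiving the previous one's output.
        for component in self.components:
            doc = component(doc)
        # Without a formatter, the document is returned unchanged.
        return self.formatter(doc) if self.formatter else doc


def tag_capitalized(doc: Document) -> Document:
    """Toy annotator (hypothetical): marks every word that starts with an uppercase letter."""
    annotations = list(doc["annotations"])
    offset = 0
    for word in doc["text"].split():
        start = doc["text"].index(word, offset)
        offset = start + len(word)
        if word[0].isupper():
            annotations.append(
                {"start": start, "end": offset, "label": "capitalized", "text": word}
            )
    return {**doc, "annotations": annotations}
```

For example, `Pipeline(name="demo", components=[tag_capitalized]).run("hello Paris")` returns a document with a single annotation covering "Paris". Passing a `formatter` changes only the shape of the final output, not the annotations themselves.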

To create a pipeline:

  • Go to the Processing view
  • Create a new pipeline
  • Give a name to the pipeline
  • A pipeline contains a converter (optional), one or more processing components, and an output formatter (optional)
  • Optionally select a converter (no converter is needed if you work with raw text)
  • Add a first item to your pipeline by clicking + Component
  • When adding a component to a pipeline, you can select either:
    • a Model or Pipeline that is part of an existing project, or
    • an off-the-shelf annotator, independent of any existing project
  • Choose “Model or plan” to pick a project you have access to, then select a model
  • Add further models as needed
  • Add an off-the-shelf component.
  • You will have access to a large library of off-the-shelf models and technical components.
  • For instance, select the Consolidation component. This component deduplicates annotations produced by two similar models that might extract the same type of information.
  • You can configure the component with the parameters in the right panel.
  • You can add conditions under which a particular model, annotator or component is launched
  • Finally, you can optionally select an output formatter to produce an output format other than the Kairntech JSON format.
  • Don’t forget to save your pipeline
  • You can later edit, duplicate or delete your pipeline
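
To make the deduplication and conditional steps above more concrete, here is a rough Python illustration of the two ideas. It is not how the Consolidation component is implemented, and its real matching rules and parameters are not described here. The sketch assumes that two annotations are duplicates when they share the same span and label, and that the one with the higher `score` is kept. The function names, the `score` and `source` fields, and the `only_if` helper are all hypothetical.

```python
def consolidate(annotations):
    """Keep one annotation per (start, end, label), preferring the higher score.

    Assumption: exact span and label equality defines a duplicate.
    """
    best = {}
    for ann in annotations:
        key = (ann["start"], ann["end"], ann["label"])
        current = best.get(key)
        if current is None or ann.get("score", 0.0) > current.get("score", 0.0):
            best[key] = ann
    # Return the surviving annotations in reading order.
    return sorted(best.values(), key=lambda a: (a["start"], a["end"]))


def only_if(condition, component):
    """Wrap a component so that it runs only when condition(doc) is true."""
    def wrapped(doc):
        return component(doc) if condition(doc) else doc
    return wrapped


# Two hypothetical models both found the same person mention.
annotations = [
    {"start": 0, "end": 5, "label": "person", "score": 0.7, "source": "model_a"},
    {"start": 0, "end": 5, "label": "person", "score": 0.9, "source": "model_b"},
    {"start": 10, "end": 15, "label": "location", "score": 0.8, "source": "model_a"},
]
merged = consolidate(annotations)
```

Here `merged` holds two annotations: the `person` mention from `model_b` (the higher score) and the `location` mention. A condition such as `only_if(lambda doc: len(doc["text"]) > 100, some_component)` would skip `some_component` on short documents.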