How to create a custom segmentation pipeline?

  • Go to the Pipelines view
  • Create a segmentation pipeline
  • Give a name to your pipeline
  • Select Advanced segmentation pipeline
  • Add a first item to the pipeline
  • Select an existing model that you have built in your project, for instance a simple CRF-Suite model that extracts document boundaries (Article XX….)
  • Add a new item as an off-the-shelf annotator
  • Select the tag2segment off-the-shelf component, which segments the document at each annotation extracted by the previous model
  • Save your pipeline
  • Click on the star to automatically recompute all segments of your documents
  • Note that existing annotations will be kept
  • Check the new segmentation in the Segments view
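
To illustrate what the second pipeline item does conceptually, here is a minimal Python sketch of cutting a document at each annotation. It is not the tag2segment implementation: the function name segment_at_annotations, the sample text, and the regular expression standing in for the boundary model are all assumptions made for illustration.

```python
import re
from typing import List, Tuple


def segment_at_annotations(text: str, annotation_starts: List[int]) -> List[Tuple[int, int, str]]:
    """Split text into segments, opening a new segment at each annotation offset.

    Hypothetical helper: it only mimics the idea of segmenting a document
    at each annotation produced by a previous model.
    """
    # Keep offsets that fall strictly inside the text, sorted and de-duplicated.
    cuts = sorted({s for s in annotation_starts if 0 < s < len(text)})
    bounds = [0] + cuts + [len(text)]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        if start < end:  # skip empty spans (e.g. empty input text)
            segments.append((start, end, text[start:end]))
    return segments


# Stand-in for the boundary model: a regex that "annotates" article headers.
text = "Preamble. Article 1 Scope. Article 2 Terms."
starts = [m.start() for m in re.finditer(r"Article \d+", text)]

segments = segment_at_annotations(text, starts)
for start, end, content in segments:
    print(start, end, repr(content))
```

In this sketch, any text before the first annotation becomes its own segment, and the segments concatenate back to the original text. Whether the actual component behaves the same way at document start is not stated here, so check the result in the Segments view.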

If you want to use an off-the-shelf segmenter: