How to automatically annotate your documents with off-the-shelf models?

Users can annotate the documents in their project with off-the-shelf models or with custom-built models and NLP pipelines. This built-in option saves a lot of time when annotating a set of documents.

This allows you to pre-annotate raw text, compare annotations to optimize your dataset, and reject annotations to create counterexamples in your dataset. Filters help you work even more efficiently.

  • Go to the main menu
  • Select a predefined annotator or a project (meaning any project you have access to)
  • For predefined annotators, a number of pre-configured off-the-shelf models are available:
    • All Wikidata concepts
    • Media-related Wikidata concepts to extract Person, Location and Organization
    • Health-related Wikidata concepts to extract Disease, Symptom and Drug
    • Trankit NER to extract Person, Location and Organization
    • spaCy NER to extract Person, Location and Organization
    • A pipeline combining spaCy NER and Wikidata concepts
  • Check the box to receive an email when the job is completed
  • Annotate either “All documents”, “Dataset” or “Search result list”
  • Once the job is completed, the Documents view is automatically refreshed with the new annotations.
  • You can also annotate your documents with an existing project you have access to.

  • In case you want to remove the automatically generated information: