How to automatically annotate a set of documents?

Users can annotate the documents in their project with off-the-shelf models or with custom-built models and pipelines. This built-in option saves a lot of time when annotating a set of documents.

This allows you to add filters in question-answering projects, pre-annotate raw text when building a dataset, compare manual and automated annotations for dataset optimization…
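As a rough illustration of what comparing manual and automated annotations can look like, the sketch below contrasts the two sets for a single document. The (start, end, label) representation and the name compare_annotations are assumptions made for this example; they are not the platform's export format or API.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical annotation: character offsets plus a label."""
    start: int
    end: int
    label: str


def compare_annotations(manual, automatic):
    """Split annotations into agreed, manual-only and automatic-only sets."""
    manual_set, automatic_set = set(manual), set(automatic)
    return {
        "agreed": manual_set & automatic_set,
        "manual_only": manual_set - automatic_set,
        "automatic_only": automatic_set - manual_set,
    }


# Made-up annotations for one document.
manual = [Annotation(0, 5, "Person"), Annotation(20, 25, "Location")]
automatic = [Annotation(0, 5, "Person"), Annotation(30, 42, "Organization")]
report = compare_annotations(manual, automatic)
```

Annotations found only on the automatic side are candidates for review, while those found only on the manual side show what the annotator misses.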

If you want to use an existing model from a project you have access to
Go to the main project menu
Click on Automatically annotate with a project

Select a project
Select the model or pipeline (Annotator) you want to use

You can annotate “all documents”, “search result list” or “dataset” (the dataset is the set of documents or segments that have annotations)

If you have provided an email address in your profile, you can check a box to receive an email when the annotation is finished (this can take a while when there are many documents).
Once the job is completed, the Documents view is automatically refreshed with the new annotations.
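To make the three annotation scopes concrete, here is a minimal sketch of how each one narrows the set of documents. The Document class, the select_scope function and the sample identifiers are hypothetical and only serve the illustration; segments are left out for brevity, and the platform's actual behavior may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record used only for this illustration."""
    identifier: str
    annotations: list = field(default_factory=list)


def select_scope(scope, documents, search_results=()):
    """Return the documents targeted by one of the three scopes."""
    if scope == "all documents":
        return list(documents)
    if scope == "search result list":
        # Keep only the documents returned by the current search.
        wanted = set(search_results)
        return [d for d in documents if d.identifier in wanted]
    if scope == "dataset":
        # The dataset is made of the documents that already carry annotations.
        return [d for d in documents if d.annotations]
    raise ValueError(f"unknown scope: {scope!r}")


docs = [
    Document("doc-1", annotations=["Person"]),
    Document("doc-2"),
    Document("doc-3", annotations=["Location"]),
]
dataset_ids = [d.identifier for d in select_scope("dataset", docs)]
search_ids = [d.identifier for d in select_scope("search result list", docs, ["doc-2"])]
```

Here “dataset” keeps doc-1 and doc-3, which already have annotations, while “search result list” keeps only doc-2.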

If you want to annotate your set of documents with a predefined annotator

Go to the main project menu
Click on Automatically annotate with a predefined annotator

There are a number of pre-configured annotators:
- All Wikidata concepts
- Media-related Wikidata concepts to extract Person, Location and Organization
- Health-related Wikidata concepts to extract Disease, Symptom and Drug
- spaCy NER to extract Person, Location and Organization
- A pipeline combining spaCy NER and Wikidata concepts

Annotate “All documents”, “Dataset” or “Search result list”
If you have provided an email address in your profile, you can check a box to receive an email when the annotation is finished (this can take a while when there are many documents).

If you want to remove the automatically generated annotations, see:

How to remove existing annotations?