Users can annotate the documents in their project with off-the-shelf models or custom-built models & pipelines. This built-in option saves a lot of time when annotating a set of documents.
It allows you to add filters in question-answering projects, pre-annotate raw text when building a dataset, compare manual & automated annotations for dataset optimization…
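To illustrate the last use case, comparing manual and automated annotations, here is a minimal Python sketch. It assumes the annotations are available outside the tool as (start, end, label) spans; that representation, the `Annotation` type and the `compare_annotations` function are hypothetical names chosen for this example and are not part of the product.

```python
from typing import Iterable, NamedTuple


class Annotation(NamedTuple):
    """Hypothetical span annotation: character offsets plus a label."""
    start: int
    end: int
    label: str


def compare_annotations(manual: Iterable[Annotation],
                        automated: Iterable[Annotation]) -> dict:
    """Compare automated annotations against manual ones using exact span matches."""
    manual_set = set(manual)
    automated_set = set(automated)
    agreed = manual_set & automated_set

    # Precision: share of automated annotations confirmed by a manual one.
    precision = len(agreed) / len(automated_set) if automated_set else 0.0
    # Recall: share of manual annotations found by the automated annotator.
    recall = len(agreed) / len(manual_set) if manual_set else 0.0

    return {
        "precision": precision,
        "recall": recall,
        # Manual annotations the annotator missed.
        "missed": sorted(manual_set - automated_set),
        # Automated annotations with no manual counterpart.
        "spurious": sorted(automated_set - manual_set),
    }


manual = [Annotation(0, 5, "Person"), Annotation(10, 15, "Location")]
automated = [Annotation(0, 5, "Person"), Annotation(20, 25, "Organization")]
report = compare_annotations(manual, automated)
```

The "missed" and "spurious" lists point to the segments worth reviewing first when optimizing a dataset. Exact span matching is a deliberately strict choice; a more lenient comparison could count overlapping spans as matches.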
If you want to use an existing model from a project you have access to
- Go to the main project menu
- Click on Automatically annotate “with a project”
- Select a project
- Select the model or pipeline (Annotator) you want to use
- You can annotate “all documents”, “search result list” or “dataset” (the dataset is all documents or segments with annotations)
- Check the box to receive an email when the job is completed, as this may take some time
- Once the job is completed, the Documents view is automatically refreshed with the new annotations.
If you want to annotate your set of documents with a predefined annotator
- Go to the main project menu
- Click on Automatically annotate “with a predefined annotator”
- Choose one of the pre-configured annotators:
  - All Wikidata concepts
  - Media-related Wikidata concepts to extract Person, Location & Organization
  - Health-related Wikidata concepts to extract Disease, Symptom and Drug
  - Trankit NER to extract Person, Location, Organization
  - spaCy NER to extract Person, Location, Organization
  - A pipeline combining spaCy NER and Wikidata concepts
- Annotate either “All documents”, “Dataset” or “Search result list”
- Check the box to receive an email when the job is completed
- In case you want to remove the automatically generated information: