How to experiment with entity detection models (advanced)?

With a sufficient number of annotations available, the goal is to create and experiment with models using pre-configured, state-of-the-art machine learning or deep learning algorithms. The models learn from the existing annotations. The resulting models then need to be compared in order to select the best one for automating this task on new documents in the future.

  • Go to Model Experiments view
  • Split dataset
    • We recommend generating train/test metadata on the dataset to create train/test splits that can easily be compared across experiments (see the split sketch after this list).
    • Alternatively, use the “Shuffle: true” parameter, which creates a new random split at each training run. Differences in training results may then come from the properties of these random splits, especially when the number of documents is small.
  • Create a new experiment
  • Select the algorithm for the experiment
  • Show advanced parameters
  • Adjust the training options
  • Adjust the algorithm training parameters (see also What are word embeddings?)
    • Select “Embeddings” when appropriate to benefit from pre-computed semantic representations of words, such as BERT transformer or flair embeddings (see the flair sketch after this list). Adding embeddings will in many cases improve quality, but it also increases the computation time needed to train the model.
  • Save your experiment
  • Launch the model experiment
  • Evaluate the overall quality of the models, and the detailed quality of each model, by ticking the quality tick-box.
  • You can select different metrics to evaluate the quality of the models: F-measure, precision, recall and number of examples (see the evaluation sketch after this list).
  • Improve the low-quality labels of the dataset by focusing your annotation efforts on these weaker labels.
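
The two dataset-split options above differ mainly in reproducibility. As a minimal sketch outside the platform, assuming scikit-learn and an 80/20 ratio (both are illustrative assumptions, not the tool's internals), a fixed split versus a per-run shuffle could look like this:

```python
# Sketch: fixed split vs. per-run shuffle, using scikit-learn (assumed 80/20 ratio).
from sklearn.model_selection import train_test_split

documents = [f"doc_{i}" for i in range(100)]  # placeholder annotated documents

# Fixed split: a constant random_state plays the role of persisted train/test
# metadata, so every experiment sees exactly the same train and test documents.
train_fixed, test_fixed = train_test_split(documents, test_size=0.2, random_state=42)

# Per-run shuffle: no fixed seed, so each training run gets a different split.
# Score differences between runs may then reflect the split rather than the
# algorithm, especially when the number of documents is small.
train_rand, test_rand = train_test_split(documents, test_size=0.2, shuffle=True)
```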
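
To illustrate what adding embeddings means in practice, here is a sketch of a comparable setup with the open-source flair library; the corpus layout, file names, embedding choices and hyperparameters are illustrative assumptions, not the platform's actual configuration:

```python
# Sketch: training a NER tagger with stacked BERT + flair embeddings (flair library).
# The data folder, file names and hyperparameters below are illustrative assumptions.
from flair.datasets import ColumnCorpus
from flair.embeddings import FlairEmbeddings, StackedEmbeddings, TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# CoNLL-style annotated corpus exported from the annotation tool (assumed layout).
corpus = ColumnCorpus("data/", {0: "text", 1: "ner"},
                      train_file="train.txt", test_file="test.txt", dev_file="dev.txt")

# Pre-computed embeddings: contextual BERT vectors plus character-level flair vectors.
# They usually improve quality but make training noticeably slower.
embeddings = StackedEmbeddings([
    TransformerWordEmbeddings("bert-base-cased"),
    FlairEmbeddings("news-forward"),
])

tagger = SequenceTagger(hidden_size=256,
                        embeddings=embeddings,
                        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
                        tag_type="ner")

ModelTrainer(tagger, corpus).train("experiments/ner-bert-flair",
                                   learning_rate=0.1, mini_batch_size=32, max_epochs=10)
```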
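
Finally, the evaluation metrics listed above (precision, recall, F-measure and number of examples per label) can be reproduced outside the tool, for instance with the seqeval library on BIO-tagged predictions; the tag sequences below are made-up examples:

```python
# Sketch: per-label precision, recall, F-measure and support for a NER test set,
# using the seqeval library on BIO-tagged sequences (the example tags are invented).
from seqeval.metrics import classification_report

y_true = [["B-PER", "I-PER", "O", "B-ORG"], ["O", "B-LOC", "O"]]
y_pred = [["B-PER", "I-PER", "O", "O"],     ["O", "B-LOC", "O"]]

# "support" corresponds to the number of gold examples per label; labels with few
# examples or a low F-measure are the ones to target with additional annotation.
print(classification_report(y_true, y_pred))
```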