How to experiment with categorization models (advanced)?

Once a sufficient number of annotations is available, the goal is to create and run model experiments based on pre-configured, state-of-the-art machine learning or deep learning algorithms. The models learn from the existing annotations. The resulting models then need to be compared in order to select the best one for automating this categorization task on new documents in the future (a minimal comparison sketch is given after the step list below).

  • Go to the Model experiments view
  • Split the dataset
    • We recommend generating train/test metadata on the dataset, so that the resulting train/test sets stay identical and can easily be compared across different experiments (see the split sketch after this list).
    • Alternatively, use the “Shuffle: true” parameter, which creates a new random split at each training run. Changes in training results may then come from the properties of these random splits rather than from the model, especially when the number of documents is small.
  • Create a new experiment
  • Select the algorithm for the experiment
  • Set the training and test parameters: reuse the train/test metadata generated when splitting the dataset, or set the “Shuffle: true” parameter described above
  • Adjust the model parameters
  • Save your model experiment at the bottom of the page
  • Launch the experiment
  • Evaluate the overall quality of the models, as well as the quality of each individual model, by ticking the quality tick-box.
  • You can select different metrics to evaluate model quality: F-measure, precision, recall, and the number of examples (an evaluation sketch follows this list).
  • Improve the low-quality labels of the dataset by focusing your annotation efforts on these weaker labels.
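
The sketch below illustrates the split recommendation from the “Split the dataset” step. It does not use the platform’s API: it assumes a small pandas DataFrame with hypothetical text, label and split columns, and contrasts a persisted, reproducible split (the train/test metadata) with a fresh random split at every run (the “Shuffle: true” behaviour).

```python
# Sketch only (not the platform's API): contrasting a persisted train/test
# split stored as metadata with a new random split at every training run.
# Column and file names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

docs = pd.DataFrame({
    "text": ["invoice from supplier", "contract renewal",
             "meeting notes", "purchase order"],
    "label": ["finance", "legal", "internal", "finance"],
})

# Option A: generate the split once, store it as metadata and reuse it, so
# that every experiment is evaluated on exactly the same test documents.
train_idx, test_idx = train_test_split(docs.index, test_size=0.25, random_state=42)
docs["split"] = "train"
docs.loc[test_idx, "split"] = "test"
docs.to_csv("dataset_with_split_metadata.csv", index=False)

# Option B: "Shuffle: true" behaviour - a new random split at each run.
# With few documents, score differences between runs may reflect the split
# itself rather than the model.
train_b, test_b = train_test_split(docs, test_size=0.25, shuffle=True)
```

Persisting the split as metadata means score differences between experiments can be attributed to the algorithm or its parameters rather than to a different test set.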
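The introduction states that models need to be compared to select the best one. As a rough illustration (generic scikit-learn text classifiers and toy documents, not the platform's pre-configured algorithms), the sketch below trains two candidate algorithms on the same fixed split and compares them with a macro-averaged F-measure.

```python
# Sketch (assumed scikit-learn pipelines, not the platform's internals):
# training two candidate algorithms on the same fixed split so that their
# scores are directly comparable.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = ["invoice from supplier", "contract renewal",
               "purchase order", "meeting notes"]
train_labels = ["finance", "legal", "finance", "internal"]
test_texts = ["signed contract", "supplier invoice"]
test_labels = ["legal", "finance"]

candidates = {
    "logistic_regression": make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
    "linear_svm": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

for name, model in candidates.items():
    model.fit(train_texts, train_labels)
    predictions = model.predict(test_texts)
    score = f1_score(test_labels, predictions, average="macro")
    print(f"{name}: macro F-measure = {score:.2f}")
```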
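Finally, a sketch of the per-label quality metrics listed in the evaluation step (precision, recall, F-measure and number of examples). The predictions are invented for illustration; the point is that ranking labels by F-measure shows where additional annotation effort is likely to help most.

```python
# Sketch (hypothetical predictions, scikit-learn metrics): computing
# precision, recall, F-measure and support per label, then ranking labels
# from weakest to strongest F-measure.
from sklearn.metrics import precision_recall_fscore_support

y_true = ["finance", "legal", "finance", "internal", "legal", "finance"]
y_pred = ["finance", "legal", "legal", "internal", "finance", "finance"]

labels = sorted(set(y_true))
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, zero_division=0
)

# Labels at the top of this ranking are the ones to annotate further.
ranked = sorted(zip(labels, f1, support), key=lambda row: row[1])
for label, score, n_examples in ranked:
    print(f"{label}: F-measure={score:.2f}, examples={n_examples}")
```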