{"id":2606,"date":"2023-01-11T17:09:32","date_gmt":"2023-01-11T16:09:32","guid":{"rendered":"https:\/\/kairntech.com\/doc\/?page_id=2606"},"modified":"2025-05-06T16:34:47","modified_gmt":"2025-05-06T14:34:47","slug":"methodology-guide","status":"publish","type":"page","link":"https:\/\/kairntech.com\/doc\/methodology-guide\/","title":{"rendered":"Methodology Guide"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Introduction<\/h1>\n\n\n\n<p>The following methodology guide sections are presented:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"#Text-classification\" data-type=\"internal\" data-id=\"#Text-classification\">Text classification<\/a><\/li>\n\n\n\n<li><a href=\"#Entity-detection\" data-type=\"internal\" data-id=\"#Entity-detection\">Entity detection<\/a><\/li>\n<\/ul>\n\n\n\n<p>First of all, define the type of project:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Text classification (<a href=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-text-classification\" data-type=\"URL\" data-id=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-text-classification\">what is text classification?<\/a>)<\/li>\n\n\n\n<li>Entity detection (<a data-type=\"URL\" data-id=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-entity-detection\" href=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-entity-detection\">what is entity detection<\/a>?)<\/li>\n\n\n\n<li>Both text classification and entity detection<\/li>\n<\/ul>\n\n\n\n<p>In the latter case, create two separate projects and combine the outcomes in a pipeline: one project can use a model from another project.<\/p>\n\n\n\n<p>Although it is possible to create <strong>multilingual projects<\/strong>, it is recommended to create <strong>one project for each language<\/strong> to obtain the best possible results. 
In this case,  the language of the documents needs to be defined.<\/p>\n\n\n\n<div style=\"height:21px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<hr class=\"wp-block-separator aligncenter has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\" id=\"Text-classification\">Text classification<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 1: Initiate the project<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-create-a-project\/\">Create<\/a>&nbsp;the project (type=Text classification)<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-upload-documents\/\">Upload<\/a>&nbsp;documents<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Documents<\/em>&nbsp;view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-inspect-documents\/\">Inspect<\/a>&nbsp;documents to see what they look like, and explore differences<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"495\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-1024x495.png\" alt=\"\" class=\"wp-image-2927\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-1024x495.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-300x145.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-768x371.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-1536x743.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png 1928w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 2: Define labels<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Labels<\/em>&nbsp;view<\/li>\n\n\n\n<li><a 
href=\"https:\/\/kairntech.com\/doc\/how-to-create-labels\/\">Create<\/a>&nbsp;labels\n<ul class=\"wp-block-list\">\n<li>One&nbsp;<strong>label&nbsp;<\/strong>= one&nbsp;<strong>category<\/strong><\/li>\n\n\n\n<li>Consider whether a document could belong to&nbsp;<strong>one or several categories<\/strong><\/li>\n\n\n\n<li>A label can be created to obtain better results even if it is not used to create a classification model<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-define-labeling-guidelines\/\">Write<\/a>&nbsp;annotation guidelines for each label (recommended)<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15-1024x498.png\" alt=\"\" class=\"wp-image-2929\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15-768x373.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15-1536x747.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-15.png 1925w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 3: Pre-annotate documents (optional)<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why use automatic pre-annotation?\n<ul class=\"wp-block-list\">\n<li>Pre-annotation with an <strong>off-the-shelf model<\/strong> or <strong>NLP pipeline<\/strong> saves time because a first version of the training dataset is created automatically<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a 
href=\"https:\/\/kairntech.com\/doc\/how-to-automatically-analyse-new-documents\/\">Pre-annotate<\/a>&nbsp;documents\n<ul class=\"wp-block-list\">\n<li>At least some of the&nbsp;<strong>labels&nbsp;<\/strong>(categories) of the existing model\/NLP pipeline should exactly match the labels you want to create<\/li>\n\n\n\n<li>Pre-annotate a&nbsp;<strong>small number of documents<\/strong>&nbsp;(say 50) to start with, because all annotations need to be reviewed individually in order to create a high-quality dataset<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Select \u201c<strong>Labelled<\/strong>\u201d&nbsp;in the filter \u201c<strong>Status<\/strong>\u201d to access the&nbsp;<strong>dataset<\/strong><\/li>\n\n\n\n<li>Please note:\n<ul class=\"wp-block-list\">\n<li>The&nbsp;<strong>dataset&nbsp;<\/strong>consists of&nbsp;<strong>all labelled documents<\/strong><\/li>\n\n\n\n<li>Unneeded labels (categories) can be&nbsp;<a href=\"https:\/\/kairntech.com\/doc\/how-to-remove-existing-annotations\/\">deleted<\/a>&nbsp;together with their annotations&nbsp;in the&nbsp;<em>Labels&nbsp;<\/em>view<\/li>\n\n\n\n<li>When creating new labels, review all the annotated documents to complete any missing annotations (only for multi-category projects)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16-1024x498.png\" alt=\"\" class=\"wp-image-2931\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16-768x373.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16-1536x746.png 1536w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-16.png 1922w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 4: Annotate documents<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-annotate-categories-manually\/\">Annotate<\/a>&nbsp;at the document level\n<ul class=\"wp-block-list\">\n<li>Single or multiple categories<\/li>\n\n\n\n<li>At least&nbsp;<strong>10 to 15 annotations per label (category)<\/strong>, following the annotation guidelines<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Continue annotating even after the blue \u201cpop-up\u201d first appears announcing that suggestions are available<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"500\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-90-1024x500.png\" alt=\"\" class=\"wp-image-2645\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-90-1024x500.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-90-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-90-768x375.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-90.png 1448w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Step 5: Use the suggestion engine<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why use the suggestion engine?\n<ul class=\"wp-block-list\">\n<li>To speed up dataset creation<\/li>\n\n\n\n<li>To quickly assess the machine\u2019s ability to learn<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Suggestions<\/em> view\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-use-categories-suggestions\/\">Accept\/correct<\/a>&nbsp;the 
suggested categories, then validate the document. It will be added to the dataset with its category or categories.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Manage suggestions\n<ul class=\"wp-block-list\">\n<li>Sort them according to their&nbsp;<strong>confidence level score<\/strong><\/li>\n\n\n\n<li><strong>Filter&nbsp;<\/strong>the list on the label (category) you want to work on<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Please note:\n<ul class=\"wp-block-list\">\n<li>The suggestion engine is updated after a few validated suggestions<\/li>\n\n\n\n<li>The suggestion engine is based on a machine learning algorithm with a fast training time (which will not necessarily provide the best results)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"499\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-92-1024x499.png\" alt=\"\" class=\"wp-image-2649\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-92-1024x499.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-92-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-92-768x374.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-92.png 1452w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 6: Review the dataset<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why review the dataset?\n<ul class=\"wp-block-list\">\n<li>Dataset quality is essential to creating the best possible model<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Labels&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li>Make sure the annotations are distributed as evenly as possible over the labels<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image 
is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"684\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-62-1024x684.png\" alt=\"\" class=\"wp-image-2426\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-62-1024x684.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-62-300x200.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-62-768x513.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-62.png 1219w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Documents&nbsp;<\/em>view<\/li>\n\n\n\n<li>Select \u201c<strong>Labelled<\/strong>\u201d&nbsp;in the \u201c<strong>Status<\/strong>\u201d filter to access the&nbsp;<strong>dataset<\/strong><\/li>\n\n\n\n<li>The dataset must be as accurate as possible: no false or missing categories and no&nbsp;<a data-type=\"link\" data-id=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-an-inconsistent-annotation\" href=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-an-inconsistent-annotation\">inconsistencies<\/a>&nbsp;between categories<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93-1024x498.png\" alt=\"\" class=\"wp-image-2651\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93-768x374.png 768w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93-1536x748.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-93.png 1923w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 7: Split the dataset<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why split the dataset?\n<ul class=\"wp-block-list\">\n<li>To make sure the same&nbsp;<strong>training&nbsp;<\/strong>and&nbsp;<strong>test&nbsp;<\/strong>sets are used to compare different model experiments<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Model experiments&nbsp;<\/em>view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-generate-train-and-test-metadata\/\">Split the dataset<\/a>&nbsp;by generating&nbsp;<strong>train\/test metadata<\/strong>&nbsp;on the dataset<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>If new annotations are added to the dataset, the split will be automatically updated when launching a new experiment<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"588\" height=\"314\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-17.png\" alt=\"\" class=\"wp-image-2933\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-17.png 588w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-17-300x160.png 300w\" sizes=\"auto, (max-width: 588px) 100vw, 588px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 8: Train models<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Model experiments<\/em>&nbsp;view<\/li>\n\n\n\n<li>Edit each predefined experiment and check that the training options use the <strong>train <\/strong>&amp; <strong>test <\/strong>metadata in the &#8220;train_on&#8221; and &#8220;test_on&#8221; 
parameters<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1000\" height=\"646\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-1.png\" alt=\"\" class=\"wp-image-4887\" style=\"width:723px;height:auto\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-1.png 1000w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-1-300x194.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-1-768x496.png 768w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-experiment-categorization-with-models\/\">Launch<\/a>\u00a0the predefined experiments<\/li>\n\n\n\n<li>Check the quality (F-measure) of each experiment and identify the best model<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>If the quality (F-measure) is&nbsp;<strong>below 60%<\/strong>,&nbsp;<strong>enrich and improve the dataset<\/strong>&nbsp;iteratively (see the next steps below)<\/li>\n\n\n\n<li>Do not create new experiments to test different algorithms while the F-measure is below 60%: it is not useful at this stage<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"316\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18-1024x316.png\" alt=\"\" class=\"wp-image-2935\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18-1024x316.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18-300x93.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18-768x237.png 768w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18-1536x475.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-18.png 1926w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 9: Iterate steps 4-5-6 above to achieve 60% accuracy<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In the&nbsp;<em>Model experiments&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li>Identify labels with&nbsp;<strong>low quality<\/strong>&nbsp;by ticking the quality box<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Enrich the dataset for these labels, either:\n<ul class=\"wp-block-list\">\n<li>with new manually annotated documents (see above: 4 \u2013 Annotate documents)<\/li>\n\n\n\n<li>or by using <em>Suggestions<\/em> (see above: 5 \u2013 Use the suggestion engine)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>In the&nbsp;<em>Model experiments<\/em>&nbsp;view\n<ul class=\"wp-block-list\">\n<li>Run the experiment again and see if the accuracy of the model has improved for each label<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Iterate<\/strong>\u2026 until each label reaches at least <strong>60%<\/strong>&nbsp;<strong>accuracy<\/strong><\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4-1024x498.png\" alt=\"\" class=\"wp-image-1971\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4-768x374.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4-1536x747.png 1536w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-4.png 1924w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 10: Annotate the dataset automatically<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why annotate the dataset automatically?\n<ul class=\"wp-block-list\">\n<li>To test the&nbsp;<strong>dataset and model quality<\/strong><\/li>\n\n\n\n<li>To detect possible&nbsp;<strong>discrepancies<\/strong><\/li>\n\n\n\n<li>This is only useful if the model accuracy is&nbsp;<strong>above 60%<\/strong><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Documents&nbsp;<\/em>view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-automatically-analyse-new-documents\/\">Run<\/a>&nbsp;an automatic annotation of the dataset with the model<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"643\" height=\"580\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-6.png\" alt=\"\" class=\"wp-image-1974\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-6.png 643w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-6-300x271.png 300w\" sizes=\"auto, (max-width: 643px) 100vw, 643px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"862\" height=\"443\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-5.png\" alt=\"\" class=\"wp-image-1973\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-5.png 862w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-5-300x154.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-5-768x395.png 768w\" sizes=\"auto, (max-width: 862px) 100vw, 
862px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 11: Identify discrepancies<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Documents&nbsp;<\/em>view<\/li>\n\n\n\n<li>Open the filter \u201c<strong>Agreement: automatic-other<\/strong>\u201d<\/li>\n\n\n\n<li>Select &#8220;<strong>Disagreement<\/strong>&#8221;<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19-1024x498.png\" alt=\"\" class=\"wp-image-2937\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19-768x374.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19-1536x747.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-19.png 1924w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Check the origin of each annotation using the letter or the tooltip on the chips<\/li>\n\n\n\n<li>If the model turns out to be right, <strong>correct <\/strong>the dataset manually.<\/li>\n\n\n\n<li>Once the corrections are made, remove the automatic annotations produced by the model.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"531\" height=\"629\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-20.png\" alt=\"\" class=\"wp-image-2940\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-20.png 531w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-20-253x300.png 253w\" sizes=\"auto, (max-width: 531px) 100vw, 531px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"654\" height=\"309\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-21.png\" alt=\"\" class=\"wp-image-2941\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-21.png 654w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-21-300x142.png 300w\" sizes=\"auto, (max-width: 654px) 100vw, 654px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Re-train the model. This should improve the model&#8217;s precision.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Step 12: Train the final model<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why select a final model?\n<ul class=\"wp-block-list\">\n<li>To compare different algorithms and judge their accuracy<\/li>\n\n\n\n<li>Neither the suggestion model nor the pre-packaged experiments are likely to result in the best model. 
In this case, it is necessary to experiment to select the final model.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Model experiments<\/em>&nbsp;view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-experiment-categorization-with-models-advanced\/\">Create<\/a>&nbsp;additional model experiments, launch models and compare the quality (F-measure)<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>The goal is to achieve an accuracy between 80% and 95% (F-measure)<\/li>\n\n\n\n<li>Don\u2019t expect to achieve 100% accuracy, although it may be reached in some simple cases<\/li>\n\n\n\n<li>Performance might be as important as accuracy, in which case the final model is not necessarily the one with the highest quality<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<div style=\"height:74px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<hr class=\"wp-block-separator aligncenter has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\" id=\"Entity-detection\">Entity detection<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Step 1: Initiate the project<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-create-a-project\/\">Create<\/a>&nbsp;the project (type=Entity detection)<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-upload-documents\/\">Upload<\/a>&nbsp;documents.\n<ul class=\"wp-block-list\">\n<li>If documents are short (a few sentences), it is not necessary to create segments.<\/li>\n\n\n\n<li>Otherwise, use the default segmentation engine to start with.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-inspect-documents\/\">Inspect<\/a>&nbsp;documents and segments\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Documents<\/em> view and read several documents to see what they look like and how they differ<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Segments&nbsp;<\/em>view and check if segmentation is good and 
appropriate. A different and better segmentation may be necessary. In this case use either an&nbsp;<a href=\"https:\/\/kairntech.com\/doc\/how-to-configure-a-segmenter\/\">off-the-shelf segmenter<\/a>&nbsp;or build a&nbsp;<a href=\"https:\/\/kairntech.com\/doc\/how-to-create-a-custom-segmentation-pipeline\/\">custom segmentation pipeline<\/a>.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"616\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-22-1024x616.png\" alt=\"\" class=\"wp-image-2945\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-22-1024x616.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-22-300x181.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-22-768x462.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-22.png 1180w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">Document view<\/figcaption><\/figure>\n<\/div>\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"459\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-23-1024x459.png\" alt=\"\" class=\"wp-image-2947\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-23-1024x459.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-23-300x134.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-23-768x344.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-23.png 1277w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption 
class=\"wp-element-caption\">Segment view<\/figcaption><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 2: Define labels<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What is a label?\n<ul class=\"wp-block-list\">\n<li>A&nbsp;<strong>label&nbsp;<\/strong>describes a&nbsp;<strong>concept&nbsp;<\/strong>(or an entity type)<\/li>\n\n\n\n<li>Creating a label means that text will be annotated with that label (think about positive and negative examples of the concept to be added as annotation guidelines).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Labels&nbsp;<\/em>view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-create-labels\/\">Create<\/a>&nbsp;labels\n<ul class=\"wp-block-list\">\n<li>It is possible to create a label to obtain better results even if the label is not used to create a model<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-define-labeling-guidelines\/\">Write<\/a>&nbsp;annotation guidelines for each label (recommended)<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"412\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24-1024x412.png\" alt=\"\" class=\"wp-image-2949\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24-1024x412.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24-300x121.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24-768x309.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24-1536x617.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-24.png 1923w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 3: Pre-annotate documents (optional)<\/h2>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li>Why use automatic pre-annotation?\n<ul class=\"wp-block-list\">\n<li>Pre-annotation with an <strong>off-the-shelf model<\/strong> or <strong>NLP pipeline<\/strong> can save time when creating a dataset<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-automatically-analyse-new-documents\/\">Pre-annotate<\/a>&nbsp;documents\n<ul class=\"wp-block-list\">\n<li>At least some of the&nbsp;<strong>labels&nbsp;<\/strong>of the model\/NLP pipeline that is used to pre-annotate should exactly match the labels of the project<\/li>\n\n\n\n<li>Pre-annotate a&nbsp;<strong>small number of documents<\/strong>&nbsp;(say 20 to 50) to start with, because all annotations need to be reviewed to create and optimize the dataset<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Please note:\n<ul class=\"wp-block-list\">\n<li>The&nbsp;<strong>dataset&nbsp;<\/strong>consists of&nbsp;<strong>all annotated (labelled) segments<\/strong><\/li>\n\n\n\n<li>Unneeded labels and their associated annotations can be <a href=\"https:\/\/kairntech.com\/doc\/how-to-remove-existing-annotations\/\">deleted<\/a>&nbsp;by using the&nbsp;<em>Labels&nbsp;<\/em>view<\/li>\n\n\n\n<li>When creating new labels, all segments that are already annotated need to be reviewed<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"307\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99-1024x307.png\" alt=\"\" class=\"wp-image-2663\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99-1024x307.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99-300x90.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99-768x230.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99-1536x460.png 1536w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-99.png 1924w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Step 4: Annotate text<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Segments<\/em>&nbsp;view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-annotate-entities-manually\/\">Annotate text<\/a>\n<ul class=\"wp-block-list\">\n<li>Create at least&nbsp;<strong>10 to 15 annotations per label<\/strong>, following the annotation guidelines<\/li>\n\n\n\n<li>Continue even after the blue \u201cpop up\u201d announcing available suggestions first appears<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>If context is lacking\n<ul class=\"wp-block-list\">\n<li>Switch to the&nbsp;<em>Documents<\/em>&nbsp;view<\/li>\n\n\n\n<li>Possibly reconsider the&nbsp;<strong>segmentation of the document<\/strong>&nbsp;by using an&nbsp;<a href=\"https:\/\/kairntech.com\/doc\/how-to-configure-a-segmenter\/\">off-the-shelf segmenter<\/a>&nbsp;or by creating a&nbsp;<a href=\"https:\/\/kairntech.com\/doc\/how-to-create-a-custom-segmentation-pipeline\/\">custom segmentation pipeline<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Notes:\n<ul class=\"wp-block-list\">\n<li>Carefully review the annotated segments to&nbsp;<strong>avoid false, inconsistent or missing annotations<\/strong>&nbsp;<\/li>\n\n\n\n<li>It is <strong>better&nbsp;<\/strong>to have&nbsp;<strong>few annotations without errors and inconsistencies<\/strong>&nbsp;than many annotations with possible errors and inconsistencies<\/li>\n\n\n\n<li>Segments must be annotated&nbsp;<strong>consistently<\/strong><\/li>\n\n\n\n<li>If an entity is not annotated when it should be, it is treated as a counter example, which confuses the algorithm and consequently lowers its quality<\/li>\n\n\n\n<li>The&nbsp;<strong>dataset<\/strong>&nbsp;consists of&nbsp;<strong>all annotated (labelled) 
segments<\/strong><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"310\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-100-1024x310.png\" alt=\"\" class=\"wp-image-2665\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-100-1024x310.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-100-300x91.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-100-768x233.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-100.png 1459w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Step 5: Use the suggestion engine<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why use the suggestion engine?\n<ul class=\"wp-block-list\">\n<li>To speed up dataset creation<\/li>\n\n\n\n<li>To quickly assess the machine\u2019s ability to learn<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Suggestions&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-use-entities-suggestions\/\">Accept\/correct\/reject<\/a>&nbsp;the suggested annotations, then validate the segment<\/li>\n\n\n\n<li>Each validated segment will be added to the dataset together with its annotations<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Manage suggestions\n<ul class=\"wp-block-list\">\n<li>Sort suggestions according to their&nbsp;<strong>confidence level score<\/strong>\n<ul class=\"wp-block-list\">\n<li>Use the \u201chigh confidence\u201d score to assess the machine\u2019s ability to learn<\/li>\n\n\n\n<li>Use the \u201cmargin sampling\u201d or \u201clow confidence\u201d score to handle the segments where the machine has the most difficulty<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Filter<\/strong>&nbsp;the suggestions on the labels you want to 
work on<\/li>\n\n\n\n<li>If the context of the segment is insufficient to validate a suggestion,&nbsp;<strong>increase the context<\/strong>&nbsp;or click on the title to access the document (possibly reconsider the segmentation of documents)<\/li>\n\n\n\n<li>If you reject all the suggestions and then validate the segment, it is added to the dataset and is thus treated as a&nbsp;<strong>counter example<\/strong>. Adding counter examples to a dataset can be very effective for improving the accuracy of the final model.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>The suggestion engine is updated after a few validations<\/li>\n\n\n\n<li>The suggestion engine is based on a machine learning algorithm with a fast training time (but which will not necessarily provide the best results)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"502\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-102-1024x502.png\" alt=\"\" class=\"wp-image-2669\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-102-1024x502.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-102-300x147.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-102-768x377.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-102.png 1448w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 6: Review the Dataset<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why review the dataset?\n<ul class=\"wp-block-list\">\n<li>Dataset quality is essential to create the best possible model<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Labels&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li>Make sure the 
annotations are evenly distributed over the labels\u2026 as much as possible<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"711\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-56-1024x711.png\" alt=\"\" class=\"wp-image-2319\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-56-1024x711.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-56-300x208.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-56-768x534.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-56.png 1130w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Segments<\/em>&nbsp;view\n<ul class=\"wp-block-list\">\n<li>Filter the segments on&nbsp;<strong>Status<\/strong>=\u201c<strong>Labelled<\/strong>\u201d<\/li>\n\n\n\n<li>You now see the&nbsp;<strong>dataset<\/strong>, which consists of<strong>&nbsp;all annotated (labelled) segments<\/strong><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"434\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-55-1024x434.png\" alt=\"\" class=\"wp-image-2317\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-55-1024x434.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-55-300x127.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-55-768x325.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-55.png 1499w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" 
\/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>In the&nbsp;<em>Segments&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li>Select the&nbsp;<strong>label&nbsp;<\/strong>within the filter \u201c<strong>Label name<\/strong>\u201d<\/li>\n\n\n\n<li>Then detect possible&nbsp;<strong>false annotations<\/strong>&nbsp;on this label<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"349\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-13-1024x349.png\" alt=\"\" class=\"wp-image-2010\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-13-1024x349.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-13-300x102.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-13-768x262.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-13.png 1504w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>In the&nbsp;<em>Segments&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li>Apply the&nbsp;<strong>\u201cexclusive\u201d&nbsp;<\/strong>mode on the filter displayed at the top by selecting the red icon next to the label. 
<\/li>\n\n\n\n<li>Then detect possible&nbsp;<strong>missing annotations<\/strong>&nbsp;on this label<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"320\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-14-1024x320.png\" alt=\"\" class=\"wp-image-2012\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-14-1024x320.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-14-300x94.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-14-768x240.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-14.png 1510w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>The dataset must be&nbsp;<strong>as accurate as possible<\/strong>: without false or missing annotations, and without <a href=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-an-inconsistent-annotation\" data-type=\"link\" data-id=\"https:\/\/kairntech.com\/doc\/faq\/#what-is-an-inconsistencyclassification\">inconsistencies<\/a>!<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Step 7: Split the dataset<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why split the dataset?\n<ul class=\"wp-block-list\">\n<li>To make sure the same&nbsp;<strong>training&nbsp;<\/strong>and&nbsp;<strong>test&nbsp;<\/strong>sets are used to compare different model experiments<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Model experiments&nbsp;<\/em>view\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-generate-train-and-test-metadata\/\">Split the dataset<\/a>&nbsp;by generating&nbsp;<strong>train\/test metadata<\/strong>&nbsp;on the dataset<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Note:\n<ul 
class=\"wp-block-list\">\n<li>If you add new annotations to the dataset, the split will be automatically updated when launching a new experiment<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"438\" height=\"298\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-25.png\" alt=\"\" class=\"wp-image-2951\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-25.png 438w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-25-300x204.png 300w\" sizes=\"auto, (max-width: 438px) 100vw, 438px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 8: Train first models<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the\u00a0<em>Model experiments<\/em>\u00a0view<\/li>\n\n\n\n<li>Edit each predefined experiment and check the training options so that the &#8220;train_on&#8221; and &#8220;test_on&#8221; parameters use the <strong>train <\/strong>&amp; <strong>test <\/strong>metadata<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"940\" height=\"644\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-2.png\" alt=\"\" class=\"wp-image-4894\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-2.png 940w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-2-300x206.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2025\/05\/image-2-768x526.png 768w\" sizes=\"auto, (max-width: 940px) 100vw, 940px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-experiment-entity-detection-with-models\/\">Launch<\/a>\u00a0the predefined model experiments<\/li>\n\n\n\n<li>Check the quality (F-measure) of each experiment 
and identify the best model<\/li>\n\n\n\n<li>Notes:\n<ul class=\"wp-block-list\">\n<li>If the <strong>quality<\/strong> (F-measure) is <strong>below 60%<\/strong>,&nbsp;<strong>enrich &amp; improve the dataset<\/strong>&nbsp;by iteration (see the next steps below)<\/li>\n\n\n\n<li>Do not create new experiments to test different algorithms if the F-measure is below 60%; it is useless at this stage<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"308\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26-1024x308.png\" alt=\"\" class=\"wp-image-2952\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26-1024x308.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26-300x90.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26-768x231.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26-1536x462.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-26.png 1923w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 9: Iterate steps 4-5-6 above to achieve 60% accuracy<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In the&nbsp;<em>Model experiments<\/em>&nbsp;view\n<ul class=\"wp-block-list\">\n<li>Identify the labels having a&nbsp;<strong>low quality<\/strong>&nbsp;in the quality report<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Enrich the dataset on these labels either:\n<ul class=\"wp-block-list\">\n<li>with new manually annotated segments (see Step 4 above \u2013 Annotate text)<\/li>\n\n\n\n<li>or using the&nbsp;<em>Suggestions&nbsp;<\/em>view (see Step 5 above \u2013 Use the suggestion engine)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>In the&nbsp;<em>Model experiments<\/em>&nbsp;view\n<ul 
class=\"wp-block-list\">\n<li>Run the experiment again and check whether the accuracy of the model has improved for each label<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Iterate<\/strong>\u2026 until reaching&nbsp;<strong>60%<\/strong>&nbsp;<strong>accuracy&nbsp;<\/strong>per label<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"498\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27-1024x498.png\" alt=\"\" class=\"wp-image-2953\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27-1024x498.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27-300x146.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27-768x374.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27-1536x748.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-27.png 1913w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 10: Annotate the dataset automatically<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why annotate the dataset automatically?\n<ul class=\"wp-block-list\">\n<li>To test the&nbsp;<strong>model &amp; dataset quality<\/strong><\/li>\n\n\n\n<li>To detect possible&nbsp;<strong>discrepancies<\/strong><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Documents&nbsp;<\/em>view<\/li>\n\n\n\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-automatically-analyse-new-documents\/\">Run<\/a>&nbsp;an automatic annotation of the dataset with the model<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>This is only useful if the model accuracy is above 60%<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" 
decoding=\"async\" width=\"580\" height=\"749\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-28.png\" alt=\"\" class=\"wp-image-2955\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-28.png 580w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-28-232x300.png 232w\" sizes=\"auto, (max-width: 580px) 100vw, 580px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"825\" height=\"441\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-49.png\" alt=\"\" class=\"wp-image-2291\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-49.png 825w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-49-300x160.png 300w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/01\/image-49-768x411.png 768w\" sizes=\"auto, (max-width: 825px) 100vw, 825px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 11: Identify discrepancies<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to the&nbsp;<em>Segments&nbsp;<\/em>view<\/li>\n\n\n\n<li>Select \u201c<strong>Disagreement<\/strong>\u201d&nbsp;in the filter \u201c<strong>Agreement: automatic-other<\/strong>\u201d<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"497\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30-1024x497.png\" alt=\"\" class=\"wp-image-2959\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30-1024x497.png 1024w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30-300x145.png 300w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30-768x372.png 768w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30-1536x745.png 1536w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-30.png 1924w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<ul class=\"wp-block-list\">\n<li>Look at each segment to understand why there are discrepancies:\n<ul class=\"wp-block-list\">\n<li>If there is a mistake, delete the annotation<\/li>\n\n\n\n<li>If there is a pattern with no or very few examples in the dataset, use the similarity search on the segment and enrich the dataset.<\/li>\n\n\n\n<li>If the quality of the text is bad (especially with PDF files converted into text), a solution could be to improve the quality of the converter.<\/li>\n\n\n\n<li>In some cases, there is no real explanation<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>When you have finished the corrections, remove the automatic annotations from the model.<\/li>\n<\/ul>\n\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"746\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-31.png\" alt=\"\" class=\"wp-image-2964\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-31.png 624w, https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-31-251x300.png 251w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image is-style-default\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"671\" height=\"321\" src=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-32.png\" alt=\"\" class=\"wp-image-2965\" srcset=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-32.png 671w, 
https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-32-300x144.png 300w\" sizes=\"auto, (max-width: 671px) 100vw, 671px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Step 12: Train the final model<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Why create a final model?\n<ul class=\"wp-block-list\">\n<li>You may want to compare different algorithms in terms of accuracy<\/li>\n\n\n\n<li>Neither the suggestion model nor the pre-packaged experiments are likely to provide the best model. In this case, it is necessary to experiment to find the best final model.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Go to the&nbsp;<em>Model experiments<\/em>&nbsp;view\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/kairntech.com\/doc\/how-to-experiment-entity-detection-with-models-advanced\/\">Create<\/a>&nbsp;new experiments and launch them to test different algorithms<\/li>\n\n\n\n<li>Compare quality (F-measure) between models<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Note:\n<ul class=\"wp-block-list\">\n<li>The goal is to achieve an accuracy between 80% and 95% (F-measure)<\/li>\n\n\n\n<li>Don\u2019t expect to achieve 100% accuracy\u2026 but you might achieve this in some simple cases<\/li>\n\n\n\n<li>Performance could be as important as accuracy, in which case you might not select the best model in terms of quality<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introduction The following methodology guide sections are presented: First of all, define the type of project: In the latter case, create two separate projects and combine the outcomes in a pipeline: one project can use a model from another project. 
Although it is possible to create multilingual projects, it is recommended to create one project [&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2606","page","type-page","status-publish","hentry"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Methodology Guide - Kairntech Documentation<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/kairntech.com\/doc\/methodology-guide\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Methodology Guide - Kairntech Documentation\" \/>\n<meta property=\"og:description\" content=\"Introduction The following methodology guide sections are presented: First of all, define the type of project: In the latter case, create two separate projects and combine the outcomes in a pipeline: one project can use a model from another project. 
Although it is possible to create multilingual projects, it is recommended to create one project [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/kairntech.com\/doc\/methodology-guide\/\" \/>\n<meta property=\"og:site_name\" content=\"Kairntech Documentation\" \/>\n<meta property=\"article:modified_time\" content=\"2025-05-06T14:34:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1928\" \/>\n\t<meta property=\"og:image:height\" content=\"932\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"17 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/\",\"url\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/\",\"name\":\"Methodology Guide - Kairntech 
Documentation\",\"isPartOf\":{\"@id\":\"https:\/\/kairntech.com\/doc\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-1024x495.png\",\"datePublished\":\"2023-01-11T16:09:32+00:00\",\"dateModified\":\"2025-05-06T14:34:47+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/kairntech.com\/doc\/methodology-guide\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage\",\"url\":\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png\",\"contentUrl\":\"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png\",\"width\":1928,\"height\":932},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/kairntech.com\/doc\/methodology-guide\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/kairntech.com\/doc\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Methodology Guide\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/kairntech.com\/doc\/#website\",\"url\":\"https:\/\/kairntech.com\/doc\/\",\"name\":\"Kairntech Documentation\",\"description\":\"All the information you need to use Kairntech Software, methodology,  user and installation guides.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/kairntech.com\/doc\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Methodology Guide - Kairntech Documentation","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/kairntech.com\/doc\/methodology-guide\/","og_locale":"en_GB","og_type":"article","og_title":"Methodology Guide - Kairntech Documentation","og_description":"Introduction The following methodology guide sections are presented: First of all, define the type of project: In the latter case, create two separate projects and combine the outcomes in a pipeline: one project can use a model from another project. Although it is possible to create multilingual projects, it is recommended to create one project [&hellip;]","og_url":"https:\/\/kairntech.com\/doc\/methodology-guide\/","og_site_name":"Kairntech Documentation","article_modified_time":"2025-05-06T14:34:47+00:00","og_image":[{"width":1928,"height":932,"url":"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png","type":"image\/png"}],"twitter_card":"summary_large_image","twitter_misc":{"Estimated reading time":"17 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/","url":"https:\/\/kairntech.com\/doc\/methodology-guide\/","name":"Methodology Guide - Kairntech 
Documentation","isPartOf":{"@id":"https:\/\/kairntech.com\/doc\/#website"},"primaryImageOfPage":{"@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage"},"image":{"@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage"},"thumbnailUrl":"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14-1024x495.png","datePublished":"2023-01-11T16:09:32+00:00","dateModified":"2025-05-06T14:34:47+00:00","breadcrumb":{"@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/kairntech.com\/doc\/methodology-guide\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/#primaryimage","url":"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png","contentUrl":"https:\/\/kairntech.com\/doc\/wp-content\/uploads\/sites\/2\/2023\/03\/image-14.png","width":1928,"height":932},{"@type":"BreadcrumbList","@id":"https:\/\/kairntech.com\/doc\/methodology-guide\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/kairntech.com\/doc\/"},{"@type":"ListItem","position":2,"name":"Methodology Guide"}]},{"@type":"WebSite","@id":"https:\/\/kairntech.com\/doc\/#website","url":"https:\/\/kairntech.com\/doc\/","name":"Kairntech Documentation","description":"All the information you need to use Kairntech Software, methodology,  user and installation 
guides.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/kairntech.com\/doc\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"}]}},"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/pages\/2606","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/comments?post=2606"}],"version-history":[{"count":72,"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/pages\/2606\/revisions"}],"predecessor-version":[{"id":4898,"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/pages\/2606\/revisions\/4898"}],"wp:attachment":[{"href":"https:\/\/kairntech.com\/doc\/wp-json\/wp\/v2\/media?parent=2606"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}