In your question-answering project, if you are not fully satisfied with the search retriever that uses the default vectorizer, you can experiment with different vectorizers to see which one performs best.
- Prepare a set of question-answer pairs in a JSON file with the format below. This dataset will serve as your GOLD dataset.
- Important notes:
- All answers should be present in your corpus of segments.
- If you have a FAQ, you can import it even if the segments don’t contain the exact answer as formulated in the FAQ: the plain-text search of the “Reference” retriever (see below) should still bring it back.
[
  {
    "text": "My first question?",
    "identifier": "question1",
    "metadata": {
      "question": "true"
    },
    "altTexts": [
      {
        "name": "answer",
        "text": "My first answer"
      }
    ]
  },
  {
    "text": "My second question?",
    "identifier": "question2",
    "metadata": {
      "question": "true"
    },
    "altTexts": [
      {
        "name": "answer",
        "text": "My second answer"
      }
    ]
  },
  {
    "text": "My third question?",
    "identifier": "question3",
    "metadata": {
      "question": "true"
    },
    "altTexts": [
      {
        "name": "answer",
        "text": "My third answer"
      }
    ]
  }
]
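If your questions and answers already live in a spreadsheet or a list, the file can be generated rather than written by hand. The following is a minimal sketch, assuming the field names shown in the sample above; the helper name `build_gold_dataset` and the sequential `questionN` identifiers are illustrative choices, not part of the product.

```python
import json


def build_gold_dataset(qa_pairs):
    """Turn (question, answer) pairs into records shaped like the sample above."""
    records = []
    for index, (question, answer) in enumerate(qa_pairs, start=1):
        records.append({
            "text": question,
            # Assumption: identifiers only need to be unique; a running
            # counter mirrors the "question1", "question2" pattern.
            "identifier": f"question{index}",
            "metadata": {"question": "true"},
            "altTexts": [{"name": "answer", "text": answer}],
        })
    return records


qa_pairs = [
    ("My first question?", "My first answer"),
    ("My second question?", "My second answer"),
]
gold = build_gold_dataset(qa_pairs)

# Serialize without escaping non-ASCII characters so accented text stays readable.
gold_json = json.dumps(gold, ensure_ascii=False, indent=2)
print(gold_json)
```

Save the printed output to a `.json` file and import that file as described in the next steps.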
- Go to the Document view
- Import the JSON file in your project as you would import documents

- Go to the Processing menu
- Create a first “Retriever” component

- Use the default configuration

- Configure the “Search parameter builder” component

- Give the name “Reference”
- Keep all default values except the Size (Maximum number of hits to be returned) = 1
- Use Size = 3 instead if the segments don’t contain the exact answer as formulated in the FAQ.
- Don’t forget to Save

- Create a new “Retriever”
- Select:
- the Size (Maximum number of hits to be returned), for instance 10
- the Search type you want to test (full-text, vector or hybrid search)
- the Vectorizer if you want to test vector or hybrid search…
- Don’t forget to Save


- Go to the Search Ranking Experiment view
- Create a new “Experiment”

- The default engine is “SearchRanking”
- Select the “Reference” search parameter
- Select the search parameter to be tested
- Don’t forget to Save

- Launch the experiment

- Access the different quality metrics
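
To build intuition for what such an experiment measures, the sketch below compares the ranked hits of a tested retriever against the single expected hit of a reference, using two common ranking metrics (hit rate and mean reciprocal rank). This is an illustration only: the metrics actually reported by the Search Ranking Experiment view may differ, and the function names, segment identifiers, and data are hypothetical.

```python
def reciprocal_rank(reference_id, ranked_ids):
    """Return 1/rank of the reference hit in the ranked list, or 0.0 if absent."""
    for rank, hit_id in enumerate(ranked_ids, start=1):
        if hit_id == reference_id:
            return 1.0 / rank
    return 0.0


def evaluate(reference_hits, tested_hits, k=10):
    """Compute hit rate and mean reciprocal rank over the top-k tested hits.

    reference_hits: question identifier -> expected segment identifier
    tested_hits:    question identifier -> ranked list of segment identifiers
    """
    scores = [
        reciprocal_rank(expected, tested_hits.get(question, [])[:k])
        for question, expected in reference_hits.items()
    ]
    total = len(scores)
    return {
        "hit_rate": sum(1 for score in scores if score > 0) / total,
        "mrr": sum(scores) / total,
    }


# Hypothetical data: one expected segment per question (Size = 1 reference).
reference_hits = {"question1": "seg-a", "question2": "seg-b", "question3": "seg-c"}
tested_hits = {
    "question1": ["seg-a", "seg-x"],  # expected segment ranked first
    "question2": ["seg-y", "seg-b"],  # expected segment ranked second
    "question3": ["seg-z"],           # expected segment not retrieved
}
metrics = evaluate(reference_hits, tested_hits)
print(metrics)
```

With this toy data, two of the three questions retrieve their expected segment, and the reciprocal ranks are 1, 0.5 and 0.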

- If you want to experiment with different vectorizers, you have to create and configure them as below: