Training using 'Search' (Discover)

User permissions required: ‘View Sources’ AND ‘Review and label’

 

Please note: Users can see verbatims in Discover if they have the ‘View Sources’ permission and can see labels if they have the ‘View labels’ permission, but they need the ‘Review and label’ permission to actually apply labels in Discover.
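As a rough illustration of the permission rules above, the sketch below maps a set of permission names to what a user can do in Discover. The function name and the returned keys are hypothetical and are not part of any Re:infer API; only the permission names come from this article.

```python
def discover_capabilities(permissions):
    """Hypothetical helper: map permission names to Discover capabilities."""
    perms = set(permissions)
    return {
        # 'View Sources' is what lets a user see verbatims.
        "view_verbatims": "View Sources" in perms,
        # 'View labels' is what lets a user see labels.
        "view_labels": "View labels" in perms,
        # Applying labels needs both permissions listed as required above.
        "apply_labels": {"View Sources", "Review and label"} <= perms,
    }


viewer = discover_capabilities(["View Sources", "View labels"])
print(viewer["apply_labels"])  # False: 'Review and label' is missing
```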


Use 'Search' sparingly when training each label, so that you do not create biased labels. If all of the training examples for a label are applied using search, the model learns to look for a very specific use of language for that label concept and does not properly consider the broader set of possible examples in the dataset. This is why it is important to rely more on the different training modes in the Explore phase of training.
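To make the bias risk concrete, the sketch below computes what share of each label's training examples came from search. It assumes you keep your own record of how each example was labelled (the `source` values and the 50% threshold are hypothetical; the article does not give a number, and the platform may not expose this information).

```python
from collections import defaultdict


def search_share_by_label(examples):
    """examples: iterable of (label, source) pairs, e.g. ("Location", "search")."""
    totals = defaultdict(int)
    from_search = defaultdict(int)
    for label, source in examples:
        totals[label] += 1
        if source == "search":
            from_search[label] += 1
    # Fraction of each label's examples that were applied via search.
    return {label: from_search[label] / totals[label] for label in totals}


def labels_at_risk(examples, max_share=0.5):
    """Labels whose search share exceeds max_share (an assumed threshold)."""
    shares = search_share_by_label(examples)
    return sorted(label for label, share in shares.items() if share > max_share)


examples = [
    ("Location", "search"),
    ("Location", "search"),
    ("Location", "cluster"),
    ("Price", "cluster"),
    ("Price", "explore"),
]
print(labels_at_risk(examples))  # ['Location']
```

A label that shows up in this list is a candidate for more examples from clusters or the Explore training modes rather than from further searches.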


The 'Search' functionality in Discover is used to search for key terms and phrases. When you search for a term, exact matches (if any exist) are shown first, followed by partial matches. You can use this to search for alternative terms and ways of expressing the same intent or concept for each label. This is useful if you know a relevant common term or expression that has not appeared in any of the clusters so far and want to pin a couple of examples.
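The 'exact matches first, then partial matches' ordering can be pictured with a small sketch. This is not how Re:infer implements search; in particular, treating a 'partial match' as a verbatim that contains at least one word of the search term is an assumption made for illustration, and `search_verbatims` is a hypothetical name.

```python
def search_verbatims(verbatims, query):
    """Return verbatims containing the whole query first, then partial matches.

    Matching is a case-insensitive substring check. A 'partial match' is
    assumed to mean that at least one word of the query appears in the text.
    """
    q = query.lower().strip()
    if not q:
        return []
    words = q.split()
    exact, partial = [], []
    for text in verbatims:
        lowered = text.lower()
        if q in lowered:
            exact.append(text)
        elif any(word in lowered for word in words):
            partial.append(text)
    return exact + partial


reviews = [
    "The transport was easy",
    "Great transport links nearby",
    "Lovely breakfast",
]
for hit in search_verbatims(reviews, "transport links"):
    print(hit)
# Great transport links nearby
# The transport was easy
```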


Do not use Search to apply a large number of examples for any one search term or label; apply only a few of each.
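One way to stay within 'only a few of each' is to count how many examples you pin per label and search term. The sketch below is a hypothetical bookkeeping helper, not a platform feature, and the limit of 3 is an arbitrary assumption because the article gives no specific number.

```python
from collections import Counter

MAX_PER_TERM = 3  # assumed limit; choose whatever "a few" means for your dataset


class SearchLabelTracker:
    """Counts examples pinned via search per (label, search term) pair."""

    def __init__(self, max_per_term=MAX_PER_TERM):
        self.max_per_term = max_per_term
        self.counts = Counter()

    def record(self, label, term):
        """Record one pinned example; return False once the limit is exceeded."""
        key = (label, term.lower())
        self.counts[key] += 1
        return self.counts[key] <= self.max_per_term


tracker = SearchLabelTracker(max_per_term=2)
print(tracker.record("Location", "near"))  # True
print(tracker.record("Location", "near"))  # True
print(tracker.record("Location", "near"))  # False: move on to another term
```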


Let’s look at an example. The cluster below is clearly about the location of the hotel, and a ‘Location’ label has been added. If we only labelled examples that use this term, it could bias the model towards phrases around the word ‘location’ or similar, so we should use the Search feature to find alternative ways of expressing this:



Example cluster regarding 'location'


Possible alternative search terms for 'location':

 

  • Located
  • Convenient
  • Position
  • Proximity
  • Near
  • Hotel position
  • Location to transport
  • Transport links
  • Tourist attractions
  • Close to transport
  • Central
  • Close to airport
  • Near the airport

Searching for different terms


The examples below show how searching for alternative terms for ‘location’ highlights verbatims that are related to the location of the hotel but expressed differently. Applying the ‘Location’ label to these as well gives the model more varied examples of ‘Location’.


 

Search results in Discover


  1. ‘Attractions’ highlights where the reviewer likes how close the hotel is to local attractions
  2. ‘Hotel position’ is another term used to describe the user's like or dislike of the location
  3. ‘Transport’ shows where the reviewer likes that the hotel is close to transport links
  4. ‘Close to shops’ similarly highlights where the reviewer likes that the hotel is close to shops

Applying labels to search results

 

 

Search results in Discover that have been labelled

 

  1. Select ‘Search’ from the ‘Cluster’ drop-down menu in the Discover tab
  2. Enter your search term and press Enter or click the icon
  3. Matching search terms appear highlighted in yellow. Re:infer shows full matches first, followed by partial matches
  4. Add all labels that should apply. In the example above, the ‘Location > Shops’ label has been added, along with all other relevant labels
  5. DO NOT do this for a large number of verbatims for each label


You can use this process sparingly for each label that has variable ways of expressing the same topic. However, other methods covered in the Explore phase of training also provide these different training examples, without the same potential to bias your model.


