Training using 'Shuffle'

User permission required: 'View sources' AND 'Review and label'

 

'Shuffle' is the first step in Explore. Its purpose is to give users a random selection of verbatims to review. In shuffle mode, Re:infer shows you verbatims with predictions for any label, as well as verbatims with no predictions at all. Shuffle therefore differs from the other steps in Explore: it doesn’t focus on training one specific label but covers them all.


Why is training using shuffle mode important?

It is important to use shuffle mode to ensure that you provide your model with sufficient training examples that are representative of the dataset as a whole, and are not biased by focusing only on very specific areas of the data. 


Labelling in shuffle mode helps ensure that your taxonomy covers the data in your dataset well, and prevents you from creating a model that makes accurate predictions on only a small fraction of that data.


Looking through verbatims in shuffle mode is therefore an easy way to get a sense of how the overall model is performing, and you can return to it throughout the training process. In a well-trained taxonomy, you should be able to go through any unreviewed verbatims in shuffle and simply accept the predictions to further train the model. If you find that many of the predictions are incorrect, you can see which labels require more training.
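
One way to make "which labels require more training" concrete is to keep a simple tally while you review a shuffle page. The sketch below is only an illustration: it assumes you have noted by hand, for each prediction you saw, the label and whether you accepted it. The function name `rejection_rates` and the label names are hypothetical and are not part of Re:infer.

```python
from collections import Counter


def rejection_rates(reviews):
    """Return the share of rejected predictions per label.

    `reviews` is an iterable of (label, accepted) pairs noted by hand
    while reviewing a shuffle page. This is an assumed format, not
    something exported by the platform.
    """
    shown = Counter()
    rejected = Counter()
    for label, accepted in reviews:
        shown[label] += 1
        if not accepted:
            rejected[label] += 1
    return {label: rejected[label] / shown[label] for label in shown}


# Hypothetical notes from one shuffle page
reviews = [
    ("Billing", True),
    ("Billing", True),
    ("Cancellation", False),
    ("Cancellation", True),
    ("Complaint", False),
    ("Complaint", False),
]

rates = rejection_rates(reviews)
# Labels with the highest rejection rate come first
worst_first = sorted(rates, key=rates.get, reverse=True)
```

Labels at the front of `worst_first` are the ones whose predictions you rejected most often, so they are reasonable candidates for further training.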


Going through multiple pages in shuffle later in the training process is also a good way to check whether there are intents or concepts that your taxonomy should have captured but has not. You can then apply existing labels where they fit, or create new ones if needed.


Key steps:

  1. Select 'Unreviewed' from the filter
  2. Select 'Shuffle' from the drop-down menu to be presented with 20 random verbatims
  3. Review each verbatim and any associated predictions
    • If there are predictions, you should either confirm or reject them. Confirm by selecting the ones that apply.
    • Remember to also add all other labels that apply.
    • If you reject the prediction(s), apply the correct label, as well as all others that apply. Don’t leave the verbatim with no labels applied.
  4. You can also hit the refresh button to get a new set of verbatims
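
Conceptually, the steps above amount to drawing a random page from the unreviewed verbatims. The following is a minimal sketch of that idea, not Re:infer's actual implementation: the `shuffle_page` function, the `reviewed` field and the sample dataset are all assumptions made for illustration. Only the page size of 20 comes from the steps above.

```python
import random

PAGE_SIZE = 20  # shuffle presents 20 verbatims at a time


def shuffle_page(verbatims, page_size=PAGE_SIZE, seed=None):
    """Return a random page of unreviewed verbatims (illustrative only)."""
    # Step 1: keep only unreviewed verbatims
    unreviewed = [v for v in verbatims if not v["reviewed"]]
    # Step 2: draw a random selection without repeats
    rng = random.Random(seed)
    return rng.sample(unreviewed, min(page_size, len(unreviewed)))


# Hypothetical dataset: 200 verbatims, every fifth one already reviewed
verbatims = [
    {"id": i, "reviewed": i % 5 == 0, "text": f"message {i}"}
    for i in range(200)
]

page = shuffle_page(verbatims, seed=1)
```

Calling `shuffle_page` again without a fixed seed returns a different selection, which is analogous to hitting the refresh button in step 4. Because the draw is random across the whole unreviewed set, it is not tied to any single label.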


In this phase of the training, we recommend labelling at least 5 to 10 pages' worth of verbatims in shuffle (100 to 200 verbatims at 20 per page). In large datasets with lots of training examples, this could be much more.

