User permissions required: ‘View Sources’ AND ‘Review and label’
Introduction to using 'Teach label'
'Teach' is the third step in the Explore phase. Its purpose is to show predictions for a label where the model is most unsure whether the label applies or not. As in the previous steps, we then confirm whether each prediction is correct or incorrect, and by doing so give the model strong training signals.
This training step should be used for each label as soon as you can see high-confidence predictions for that label in 'Label' mode.
'Teach' vs 'Label'
The difference between the ‘Teach’ and ‘Label’ functions is highlighted in the images below. In the ‘Review Predictions’ step we used 'Label' mode. In this step, we select the 'Teach label' mode from the dropdown menu (it defaults to unreviewed verbatims) and then select the label we want to train.
When a label is selected (Claim > Home in this example) using the label filter, Re:infer shows you verbatims where that label is predicted, in descending order of confidence (starting from the highest). In this example the list starts at 100%, and each verbatim below the first has the same or a lower confidence.
When a label is selected (Claim > Home in this example) using the Teach filter, Re:infer shows you verbatims where that label is predicted but where the model is most confused as to whether it applies or not:
- For datasets without sentiment enabled, this will start at approximately 50% confidence, i.e. the point at which the model is most confused as to whether the label should apply or not.
- For datasets with sentiment enabled, the confidence starts at around 66%.
By using Teach you are providing the model with much stronger training signals because the model is being given new information on verbatims it is unsure about, as opposed to ones where it already has highly confident predictions (90%+).
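The difference between the two orderings can be sketched in a few lines of Python. This is an illustration only, not Re:infer's actual ranking code: the function names, the tuple layout, and the idea of sorting by distance from a fixed pivot are all assumptions made for the example. The pivot values (roughly 50% without sentiment, roughly 66% with sentiment) are the ones quoted above.

```python
from typing import List, Tuple

# Pivot confidences quoted in the text; treating them as fixed constants
# is an assumption made for this sketch.
PIVOT_NO_SENTIMENT = 0.50
PIVOT_WITH_SENTIMENT = 0.66

# (verbatim id, predicted confidence for the selected label) - hypothetical layout
Prediction = Tuple[str, float]


def label_mode_order(preds: List[Prediction]) -> List[Prediction]:
    """'Label' mode: highest-confidence predictions first."""
    return sorted(preds, key=lambda p: p[1], reverse=True)


def teach_mode_order(preds: List[Prediction],
                     sentiment_enabled: bool = False) -> List[Prediction]:
    """'Teach' mode: predictions closest to the pivot (most uncertain) first."""
    pivot = PIVOT_WITH_SENTIMENT if sentiment_enabled else PIVOT_NO_SENTIMENT
    return sorted(preds, key=lambda p: abs(p[1] - pivot))


# Made-up predictions for a single label, e.g. Claim > Home
predictions = [("a", 1.00), ("b", 0.52), ("c", 0.90), ("d", 0.30)]

label_ids = [vid for vid, _ in label_mode_order(predictions)]
teach_ids = [vid for vid, _ in teach_mode_order(predictions)]
teach_sentiment_ids = [
    vid for vid, _ in teach_mode_order(predictions, sentiment_enabled=True)
]
```

In this sketch, 'Label' mode surfaces verbatim "a" (100%) first, whereas 'Teach' mode surfaces verbatim "b" (52%) first, which is the kind of uncertain example that gives the model the most new information when reviewed.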
Explore in 'Label' mode
Explore in 'Teach Label' mode (showing unreviewed verbatims as default)
- Select Teach from the top-left dropdown menu as shown
- Select the label you wish to train - the default selection in Teach mode is to show unreviewed verbatims
- You will be presented with a selection of verbatims where the model is most confused as to whether the selected label applies or not - review the predictions and apply the label if they are correct, or apply other labels if they are incorrect
- Predictions will range outwards from ~50% for data with no sentiment and ~66% for data with sentiment enabled
- Remember to apply all other labels that apply as well as the specific label you are focusing on
You should use this training mode as required to boost the number of training examples for each label to above 25, so that Re:infer can accurately estimate the performance of the label.
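As a rough aid for planning training, the check below lists the labels that have not yet passed the 25-example mark. It is a minimal sketch that assumes you have the number of reviewed examples per label available as a simple mapping (for example, copied by hand from the dataset's label counts); the function name and the example counts are hypothetical.

```python
from typing import Dict, List

MIN_EXAMPLES = 25  # threshold stated in the text: each label needs more than 25


def labels_needing_teach(reviewed_counts: Dict[str, int],
                         minimum: int = MIN_EXAMPLES) -> List[str]:
    """Return labels with `minimum` or fewer examples, fewest examples first."""
    below = [label for label, n in reviewed_counts.items() if n <= minimum]
    return sorted(below, key=lambda label: reviewed_counts[label])


# Made-up counts of reviewed examples per label
counts = {
    "Claim > Home": 12,
    "Claim > Motor": 40,
    "Complaint": 25,
    "Renewal": 3,
}

todo = labels_needing_teach(counts)
```

Here "Complaint" is still included because it has exactly 25 examples, and the guidance asks for more than 25.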
The number of examples required for each label to perform well will depend on a number of factors. In the 'Refine' phase we cover how to understand and improve the performance of each label.
Re:infer will regularly recommend using 'Teach label' as a means of improving the performance of specific labels by providing more varied training examples that it can use to identify other instances in your dataset where the label should apply.
- If for a label there are multiple different ways of saying the same thing (e.g. A, B or C), make sure that you give Re:infer training examples for each way of saying it. If you give it 30 examples of A, and only a few of B and C, the model will struggle to pick up future examples of B or C for that label.
- Adding a new label to a mature taxonomy may mean it has not been applied to previously reviewed verbatims. This then requires going back and teaching the model on the new label, using the 'Missed label' function – see here for how