How much training should I do for this step?
This mode presents 20 verbatims at a time. You should complete a reasonable amount of training here, working through multiple pages of verbatims and applying the correct labels, to help increase the model's coverage (see here for a detailed explanation of coverage).
The total amount of training you need to complete in 'Low confidence' will depend on a few different factors:
- How much training you completed in Shuffle and Teach - the more training you do in those modes, the more representative your training set should be of the dataset as a whole, and the fewer relevant verbatims should appear in 'Low confidence'
- The purpose of the dataset - if the dataset is intended to be used for automation and requires very high coverage, then you should complete a larger proportion of training in 'Low confidence' to identify the various edge cases for each label
At a minimum, you should aim to label 5 pages of verbatims in this mode (around 100 verbatims, at 20 per page). Later, when you check your coverage in the Refine phase, you may find that you need to complete more training in 'Low confidence' to improve it further.