Before you begin training your model, read the following tips so that you can avoid the common pitfalls. They will help shorten training time and improve the performance of your model.
The three most important things to remember whenever you are training a Communications Mining model are:
Add all labels that apply: Remember to add all the labels that apply to a verbatim. It’s a common pitfall for new users to partially label a verbatim by applying only the label they are focusing on and forgetting the others that apply. Not applying a label is as powerful as applying one: you are telling the model what the verbatim isn't, as well as what it is. Leaving out a label that applies may therefore confuse the model later, potentially leading to poorer performance.
Apply labels consistently: Remember to be consistent in adding labels. For example, if you add the label ‘Room > Size’ to one verbatim and forget to add it to another where it also applies, you will confuse the model. As with the previous tip, not applying a label is as powerful as applying one.
Label what you see in front of you: Don’t make assumptions when applying your business knowledge. If nothing in the subject or body of the verbatim indicates that a label should apply, don't apply it, or the model won't be able to understand why it applies.
Don't spend ages deciding label names: Don’t spend too long thinking about the correct name for a label. You can rename a label at any point during the training process.
Be specific when naming a label: Be as specific as possible when naming a label at the outset, and keep the taxonomy as flat as possible initially. You can always change and restructure the hierarchy later.
For example, if you choose to apply a label to describe the cleanliness of a room, you could apply ‘Room cleanliness’. If you later decide to make cleanliness a sub-label, you can rename it to ‘Room > Cleanliness’. At this stage, add every specific label that applies to a verbatim, as you can always go back and merge labels later.
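The renaming example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Communications Mining API: in practice the rename is done in the platform itself. The function names `split_label` and `rename_label`, the sample verbatims, and the assumption that hierarchy levels are separated by ‘>’ (as in ‘Room > Size’) are all hypothetical.

```python
def split_label(name: str) -> list[str]:
    """Split a hierarchical label name into its levels.

    Assumes levels are separated by '>' as in 'Room > Size'.
    """
    return [part.strip() for part in name.split(">")]


def rename_label(
    labelled: dict[str, set[str]], old: str, new: str
) -> dict[str, set[str]]:
    """Return a copy of the labelled data with `old` replaced by `new`."""
    return {
        verbatim: {new if label == old else label for label in labels}
        for verbatim, labels in labelled.items()
    }


# Hypothetical labelled verbatims using a flat, specific label at first.
labelled = {
    "The room was spotless.": {"Room cleanliness"},
    "Tiny room, but very clean.": {"Room cleanliness", "Room > Size"},
}

# Later, restructure the flat label into a sub-label of 'Room'.
renamed = rename_label(labelled, "Room cleanliness", "Room > Cleanliness")

print(split_label("Room > Cleanliness"))
```

Because the rename touches every verbatim that carries the old label, starting with a flat, specific name costs nothing: the hierarchy can be introduced later without relabelling.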