User permissions required: 'View Sources' AND 'Review and label'
Predicted entities appear as colour-highlighted text, such as in the first line of the verbatim below, with a different colour for each entity type. Once an entity has been confirmed by a user, either by manually applying it or by accepting a prediction, it will appear as highlighted text with a bold, darker outline, as shown below.
If a paragraph has had entities assigned, dismissed, or applied, it will appear highlighted in grey, as shown in the body of the verbatim below.
Entity format example
Accepting and rejecting entity predictions
Once entities are enabled (see here), Re:infer will automatically start predicting them within the verbatims throughout your dataset. Users can then accept the predictions that are correct or reject them where they are incorrect. Each of these actions sends training signals to Re:infer that will be used to improve the platform’s understanding of that entity.
For the pre-trained entities that are trained offline (most excluding ‘Organisation’ and ‘Person’), it is more important from an improvement perspective for users to reject or correct wrong predictions than it is for them to accept correct predictions.
For the entities that train live in the platform (currently ‘Organisation’ and ‘Person’), it is equally important to accept correct predictions and to reject incorrect ones. You do not, however, need to keep accepting many correct examples of each unique entity for these kinds (e.g. Example Bank Ltd. is a unique organisation entity) if you aren't finding incorrectly predicted ones. The key caveat to this is that if you review any entity in a paragraph, you need to review all of the other entities in that paragraph.
To review an entity prediction, hover the mouse over the prediction and the entity review modal will appear, as shown in the example below. To accept it, click 'Confirm'; to reject it, click 'Dismiss'.
Please Note: It's very important when training entities to follow the best practices explained below - particularly regarding not partially labelling paragraphs.
To understand how well the platform is able to predict each entity enabled for a dataset (particularly the trainable ones), see here.
Example verbatim with both assigned and predicted entities
Please Note: It’s important that you reject incorrect entity predictions. If the highlighted text was in fact a different entity (this is more common for date-related entities), apply the correct one afterwards (see below on how to apply entities).
To apply an entity to text where Re:infer has not predicted it, simply highlight the section of text as you would if you were going to copy it.
A dropdown menu will appear, as shown below, containing all of the entities that you have enabled for your dataset. Simply click the correct one to apply it, or press the corresponding keyboard shortcut.
The default keyboard shortcut for each entity is the letter it starts with. If more than one entity starts with the same letter, the other will be assigned a shortcut at random (as seen below).
An example verbatim showing entity application modal
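The shortcut rule described above can be pictured with a short sketch. This is only an illustration under stated assumptions, not the platform's actual logic: the function name `assign_shortcuts` is hypothetical, it assumes the first entity in the list keeps a contested letter, and it assumes no more than 26 entities are enabled.

```python
import random
import string


def assign_shortcuts(entity_names, rng=None):
    """Map each entity name to a one-letter shortcut (illustrative only)."""
    rng = rng or random.Random()
    shortcuts = {}
    # Pass 1: the first entity to claim its initial letter keeps it.
    for name in entity_names:
        first = name[0].lower()
        if first not in shortcuts.values():
            shortcuts[name] = first
    # Pass 2: entities whose initial was already taken get a random unused letter.
    for name in entity_names:
        if name not in shortcuts:
            free = sorted(set(string.ascii_lowercase) - set(shortcuts.values()))
            shortcuts[name] = rng.choice(free)
    return shortcuts


shortcuts = assign_shortcuts(
    ["Policy Number", "Person", "Organisation"], random.Random(0)
)
print(shortcuts)
```

Here ‘Policy Number’ keeps `p`, ‘Organisation’ keeps `o`, and ‘Person’ receives some other unused letter. In practice, check the dropdown menu to see which shortcut each entity was actually given.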
Once an entity has been applied, it will be highlighted in colour with a bold outline (see below). Each entity type will have its own specific colour.
An example verbatim showing an applied ‘Policy Number’ entity
Please Note: There are two very important best practices to remember when accepting, rejecting or applying entities within verbatims:
1. Don't split words
It’s important not to split words: the highlighted entity should cover the entire word (or several words) in question, not just part of it (see the incorrect example on the left below, and the correct application on the right).
Incorrect (left) and correct (right) examples of the ‘Organisation’ entity being applied
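For anyone preparing or checking entity spans outside the interface, the "whole words only" rule can be sketched as follows. This is a minimal illustration, not part of Re:infer: the helper `snap_to_words` is hypothetical, and it assumes words are delimited by whitespace only, so attached punctuation stays inside the span.

```python
def snap_to_words(text, start, end):
    """Expand the span [start, end) so it never cuts through a word."""
    # Move the start left until it sits at the beginning of a word.
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    # Move the end right until it sits at the end of a word.
    while end < len(text) and not text[end].isspace():
        end += 1
    return start, end


text = "Please contact Example Bank Ltd today"
# A selection that wrongly covers only "ample Ban".
start, end = snap_to_words(text, 17, 26)
print(text[start:end])  # Example Bank
```

A span that already sits on word boundaries is returned unchanged.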
2. Don't partially label paragraphs
When labelling, if you assign one label to a verbatim, you should apply ALL labels that could apply to that verbatim; otherwise you teach the model that those other labels should not apply. The same is true for entities, except that entities are reviewed or applied at the paragraph level rather than across the whole verbatim.
Paragraphs in a verbatim are separated by new lines. The subject line of an email verbatim is considered its own single paragraph.
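As a rough illustration of this paragraph definition, the sketch below splits an email into paragraphs. The function `split_paragraphs` is hypothetical, and skipping blank lines is an assumption made here for readability rather than documented platform behaviour.

```python
def split_paragraphs(subject, body):
    """Return the subject as the first paragraph, then one per body line."""
    paragraphs = [subject] if subject else []
    # Assumption: blank lines do not count as paragraphs of their own.
    paragraphs.extend(line for line in body.splitlines() if line.strip())
    return paragraphs


paragraphs = split_paragraphs(
    "Policy renewal",
    "Hi team,\n\nPlease renew policy 12345.\nThanks",
)
print(paragraphs)
```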
Make sure to review or apply all of the entities within a paragraph across all entity kinds if you review or apply one of them. Applying, accepting or rejecting entities in a paragraph means that the paragraph is treated as ‘reviewed’ by the platform from an entity perspective. Therefore, it’s important to accept or reject ALL of the predictions in that paragraph.
The example below shows the different paragraphs that have been reviewed within the email verbatim.
Example email verbatim showing correctly reviewed entities across multiple paragraphs
The verbatim below shows the same example, but here the user has not accepted or rejected all of the entity predictions in a single paragraph. This is incorrect, as the model will falsely treat the monetary quantity entity as an incorrect prediction.
Example email verbatim that has not been properly reviewed
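The partial-labelling rule can also be expressed as a simple check. The sketch below is illustrative only: the `EntitySpan` record, its status values, and the `partially_reviewed` function are all hypothetical and do not reflect Re:infer's data model. It flags any paragraph in which at least one entity has been confirmed or dismissed while another prediction is still awaiting review.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EntitySpan:
    paragraph: int  # index of the paragraph containing the entity
    kind: str       # e.g. "Organisation", "Monetary Quantity"
    status: str     # "predicted", "confirmed", or "dismissed"


def partially_reviewed(spans):
    """Return indices of paragraphs that mix reviewed and unreviewed entities."""
    statuses_by_paragraph = defaultdict(set)
    for span in spans:
        statuses_by_paragraph[span.paragraph].add(span.status)
    return sorted(
        idx
        for idx, statuses in statuses_by_paragraph.items()
        # Flag only when something was reviewed AND a prediction is left over.
        if "predicted" in statuses and statuses - {"predicted"}
    )


spans = [
    EntitySpan(0, "Organisation", "confirmed"),
    EntitySpan(1, "Date", "confirmed"),
    EntitySpan(1, "Monetary Quantity", "predicted"),
    EntitySpan(2, "Person", "predicted"),
]
print(partially_reviewed(spans))  # [1]
```

Paragraph 2 is not flagged because nothing in it has been reviewed yet; only paragraph 1 mixes a confirmed entity with an unreviewed prediction.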