The core of tagtog is the text annotation editor for data augmentation. This editor is designed to make the user feel comfortable annotating text. We have created a minimalist user interface that interferes as little as possible with the reading experience, to increase the annotator's focus and efficiency during annotation tasks.
The annotation editor is used to manually annotate text and/or train a machine learning model to automatically annotate text. Enabling automatic annotations opens up uses you may not have thought of at first.
This web editor includes features such as automatic annotations, overlapping text annotations and support for full-text articles, which significantly reduce the time required to annotate text.
You can annotate at text span level or at document level. Let's take a look at the types of annotations you can create using tagtog:
|Entity||Span of text representing a named entity. It can be any span: a part of a word, a word, a sentence or a group of words. Each entity belongs to one or more entity classes (e.g. Barack Obama is
|Normalization||Id assigned to a named entity. These annotations help in disambiguation. Normalization or canonicalization is the process for assigning an id or unique name to data that has more than one possible representation. This process is supported by dictionaries. For example an
Let's say you are extracting technical issues from reports in a CRM. When annotating those reports, you can add extra information to those entities (technical issues), for example severity. You can use this metadata to build a statistical model that retrieves the severity given a particular technical issue in a specific context.
Relation between two named entities. Each relation belongs to one specific relation type (e.g. BRCA2
Currently tagtog supports bidirectional relationships (A relates to B, and B relates to A) to connect two entities. If you want to connect more than two entities you need to create more than one relation.
To set or see relations, you first need to define at least one Relation Type in Settings > Relations. Otherwise the option to See or Add relations in the menu will be disabled.
Relations are supported between entities from different paragraphs or sections.
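As a rough illustration of why connecting more than two entities takes more than one relation, the sketch below models a bidirectional relation as an unordered pair of entity ids. The `Relation` type and the `relate` and `chain` helpers are hypothetical names used only for this example; they are not part of tagtog, and chaining the entities in order is just one possible way to connect them.

```python
from typing import FrozenSet, List, Set

# Hypothetical model: a bidirectional relation is an unordered pair of entity ids.
Relation = FrozenSet[str]


def relate(a: str, b: str) -> Relation:
    """Build a relation between two entities; relate(a, b) equals relate(b, a)."""
    if a == b:
        raise ValueError("a relation needs two distinct entities")
    return frozenset((a, b))


def chain(entities: List[str]) -> Set[Relation]:
    """Connect N entities in order using N - 1 pairwise relations."""
    return {relate(a, b) for a, b in zip(entities, entities[1:])}


if __name__ == "__main__":
    # Three entities need two relations when chained.
    for relation in chain(["e1", "e2", "e3"]):
        print(sorted(relation))
```

Because the pair is unordered, there is no need to store "A relates to B" and "B relates to A" separately.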
For example, if you are classifying emails in order to dispatch them to different departments, you can create a document label (enum) and classify emails as, for example,
If you hover the mouse over the hotkeys icon, the list of hotkeys is displayed.
|[||Previous document in the pool|
|]||Next document in the pool|
|r||Start a new relation (only available when the annotation menu is visible)|
|d||Delete annotation (only available when the annotation menu is visible)|
The editor is mainly divided into: Document area, Toolbar and Sidebar.
The text is displayed in the document area. There you can read and annotate text.
Once a piece of text is annotated, it becomes an entity. In tagtog you can operate on entities, for example by normalizing them or relating them to each other.
The background color of each annotation depends on the color picked for the Entity Type. The font color changes based on the background color so the contrast is appropriate to read.
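The exact rule tagtog uses to pick the font color is not described here. One common way to get a readable result is to compare contrast against black and white text using the WCAG relative luminance formula. The sketch below assumes that approach; `relative_luminance` and `font_color_for` are illustrative names, and colors are assumed to be six-digit hex strings.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear-light value (WCAG formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb' (0.0 = black, 1.0 = white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def font_color_for(background: str) -> str:
    """Pick black or white text, whichever contrasts more with the background."""
    lum = relative_luminance(background)
    contrast_with_white = 1.05 / (lum + 0.05)
    contrast_with_black = (lum + 0.05) / 0.05
    return "#000000" if contrast_with_black >= contrast_with_white else "#ffffff"


if __name__ == "__main__":
    for color in ("#ffff00", "#000080"):
        print(color, "->", font_color_for(color))
```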
Create new text annotations
A new text annotation is created by highlighting text with the mouse. Position the cursor at the beginning of the text you want to highlight. Press and hold your primary mouse button (commonly the left button). While holding the mouse button, drag the cursor to the end of the text and release the button. Once completed, all the text from the beginning to the end is highlighted using the same Entity Type as the previous text annotation. Currently the only way to change the entity type used for new annotations is to first change the entity type of an existing annotation.
Tips & tricks:
- If you double-click, you annotate the word clicked.
- If you try to annotate a word with a leading or trailing space, the space won't be annotated.
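The whitespace behavior in the last tip can be pictured as trimming the selected character offsets before the annotation is stored. This is a minimal sketch under that assumption; `trim_span` is a hypothetical helper, and offsets are assumed to be half-open (`start` inclusive, `end` exclusive).

```python
from typing import Tuple


def trim_span(text: str, start: int, end: int) -> Tuple[int, int]:
    """Shrink a selection so it neither starts nor ends with whitespace."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end


if __name__ == "__main__":
    text = "the HER2 gene"
    start, end = trim_span(text, 3, 9)  # the selection covers " HER2 "
    print(repr(text[start:end]))
```

A selection made only of spaces collapses to an empty span, which a caller could treat as "nothing to annotate".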
Overlapping text annotations
To create overlapping annotations, just create a new annotation that is contained within the span of an existing one or that overlaps only part of it. Overlapping text annotations are recognizable at a glance and do not distract you from reading the text.
Pre-annotations are automatic annotations created upon the manual creation or removal of another equal annotation (same entity type and same text). These types of annotations increase the annotator's efficiency, as potential candidates for new or to-be-removed annotations are identified automatically.
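The two overlap cases (one annotation contained in another, or two annotations sharing only part of their text) can be told apart from the character offsets alone. The sketch below assumes half-open `(start, end)` offsets; `span_relation` is an illustrative helper, not a tagtog function.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end), end exclusive


def span_relation(a: Span, b: Span) -> str:
    """Classify two spans as 'disjoint', 'nested' or 'partial' overlap."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    a_contains_b = a_start <= b_start and b_end <= a_end
    b_contains_a = b_start <= a_start and a_end <= b_end
    if a_contains_b or b_contains_a:
        return "nested"
    return "partial"


if __name__ == "__main__":
    print(span_relation((0, 10), (2, 5)))
    print(span_relation((0, 5), (3, 8)))
    print(span_relation((0, 3), (3, 6)))
```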
|Pre-selection||Equal entities that are annotated upon manual annotation. E.g. if you annotate
|Pre-deselection||Equal entities that are removed upon manual removal, e.g. if you remove an existing annotation with the text "HER2" and Entity Type
You can choose whether pre-annotations are case sensitive or not. Like other pre-annotation properties, this setting can be changed from the editor or at project level: Settings > Annotations.
Click a text annotation with the primary mouse button (commonly the left button) to display the annotation menu.
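To make the effect of the case sensitivity setting concrete, here is a small sketch that finds every occurrence of an annotated text, which is roughly what a pre-selection needs as candidates. The matching rule is an assumption (plain substring search); the source does not say whether tagtog also checks word boundaries. `find_equal_spans` is a hypothetical name.

```python
import re
from typing import List, Tuple


def find_equal_spans(text: str, term: str, case_sensitive: bool = True) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of every occurrence of `term` in `text`."""
    if not term:
        return []
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape makes sure the term is matched literally, not as a pattern.
    return [m.span() for m in re.finditer(re.escape(term), text, flags)]


if __name__ == "__main__":
    text = "HER2 and her2 and HER2"
    print(find_equal_spans(text, "HER2", case_sensitive=True))
    print(find_equal_spans(text, "HER2", case_sensitive=False))
```

With case sensitivity on, only the two uppercase occurrences are candidates; with it off, the lowercase one is included as well.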
These are the actions you can perform:
Start a relation if a Relation Type is defined for the Entity Type of this entity. Once the relation is initialized, the annotations you can relate your entity to are highlighted. Other annotations are faded to indicate that you cannot relate the entity to them.
Click on one of the available entities to set the relation. From that moment, both entities are connected, and both display a relation icon on top.
|See relations||-||See the relations this entity is part of.|
|Change Type||-||Change the Entity Type of the entity. If you hover the mouse over this menu item, the list of possible Entity Types will show up.|
Each dictionary created for the entity type will appear as an input box. If the box is not empty, the entity is normalized to that value.
If you type at least 3 characters, a list of recommended dictionary entries will appear. To select a normalization simply choose an entry and press the ↵ key or click the ↵ icon.
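The three-character threshold can be sketched as a simple guard in front of a lookup. Everything about the lookup itself is an assumption here: the dictionary shape (entry id mapped to a list of terms), the case-insensitive substring match and the `limit` parameter are illustrative choices, and the ids are made up.

```python
from typing import Dict, List

MIN_CHARS = 3  # recommendations only appear from this many typed characters


def suggest(dictionary: Dict[str, List[str]], typed: str, limit: int = 5) -> List[str]:
    """Return ids of dictionary entries with a term containing the typed text."""
    query = typed.strip().lower()
    if len(query) < MIN_CHARS:
        return []
    hits = [
        entry_id
        for entry_id, terms in dictionary.items()
        if any(query in term.lower() for term in terms)
    ]
    return hits[:limit]


if __name__ == "__main__":
    genes = {"id1": ["HER2", "ERBB2"], "id2": ["BRCA2"]}  # hypothetical entries
    print(suggest(genes, "he"))
    print(suggest(genes, "erb"))
```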
Update dictionary from annotation editor
If you are using dictionaries, these are automatically updated upon manual normalization. If you add a new normalization, this will either add a new entry to the dictionary or update an existing entry with a new term. By design, the dictionary won't be updated when a normalization is removed.
You can always download the most up-to-date version of a dictionary at Settings > Dictionaries.
The toolbar is located at the top of the document area. From it you can perform these actions:
If the document or text comes from a known provider, click this link to access the original source.
For example, if you upload a PubMed document by PubMed ID (PMID), tagtog understands the source. Clicking this button takes you to the article in PubMed.
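The update rule described above (adding a normalization extends the dictionary, removing one leaves it untouched) can be summarized in a few lines. This is only a sketch of the behavior, assuming a dictionary is a mapping from entry id to a set of terms; the function names and ids are hypothetical.

```python
from typing import Dict, Set

Dictionary = Dict[str, Set[str]]  # entry id -> known terms (assumed shape)


def on_normalization_added(dictionary: Dictionary, entry_id: str, term: str) -> None:
    """Add a new entry, or add the term to an existing entry."""
    dictionary.setdefault(entry_id, set()).add(term)


def on_normalization_removed(dictionary: Dictionary, entry_id: str, term: str) -> None:
    """By design, removing a normalization does not change the dictionary."""
    return None


if __name__ == "__main__":
    genes: Dictionary = {}
    on_normalization_added(genes, "id1", "HER2")   # creates a new entry
    on_normalization_added(genes, "id1", "ERBB2")  # adds a term to that entry
    on_normalization_removed(genes, "id1", "HER2")  # dictionary stays the same
    print({key: sorted(terms) for key, terms in genes.items()})
```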
Annotations from other users
Click on the user list to show all the project members. Click on the member you are interested in, and that user's version of the annotations will be displayed in the document area.
Depending on your permissions, you may or may not be able to edit the different versions of the annotations. A lock icon indicates that your permissions on that version are read-only.
Import annotations from other versions
You might want to start from another user's annotations or replace the master version with your annotations. For such cases, you can use the import option in the toolbar.
If you click on that option, a list of actions shows up:
|Copy to master||Replace the master version of the annotations with the version displayed in the document area.|
|Copy to mine||Replace your annotations with the version displayed in the document area.|
The availability of these options depends on the role permissions. More information on multi-user annotation
Here you can select whether pre-selections or pre-deselections are activated or deactivated. You can also turn on/off case sensitivity.
Each time you load a new document, the default settings from Settings > Annotations apply. Changes made in this menu do not alter these default values; they only affect the current document. There are two types of pre-annotations: pre-selections and pre-deselections. You can find more information about pre-annotations here.
Save a document
Each time a change is made in the document (e.g. a new annotation or relation is added), the Save button turns green to indicate there are changes to save. Click the button to save the changes.
Confirm a document
Usually users confirm the document once the annotations have been reviewed. This indicates that the document can be used as training data for AI, or simply that all annotations have been reviewed by a human. There are different annotation flows you can use for your project.
To confirm a document click on the button with the icon
Once you have confirmed a document, many actions are disabled. The button is a toggle: click it again to undo the Confirm action.
View / output mode
Here you can select how you want to display or export the annotated document.
Annotated documents can be exported in various output formats.
tagtog Web Editor refers to the visualization of the annotated document in the annotation editor.
Click on the button with the icon to remove all the annotations in the current document. This won't remove the document.
Click on the button with the icon to remove the document from the document pool.
There are two navigation buttons, one with an arrow pointing left and one with an arrow pointing right. Clicking the left arrow loads the previous document in the pool; clicking the right arrow loads the next document in the pool.
The sidebar's appearance changes depending on how you configured your project. It only displays the actionable items for the entity types used in the project.
These are the components you can find in the sidebar:
Any document labels configured at Settings > Document Labels appear in this section of the sidebar. Here you can set the value of a document label for the current document. Once a change is made, you can save the document as usual.
Click the icon to reset the label to its default value.
The entity tally displays statistics for each entity type in the current document.
At the top of this section you find a summary with the number of entities annotated and the number of entities not normalized. Below the header, you can find the statistics for the annotations in the current document:
To digest the status of the annotated entities as fast as possible and reduce the noise of repeated annotations, you can group entities by:
Group annotations by normalization. Very useful to understand which concepts are annotated in the current document.
Entities not normalized are highlighted to spot them at a glance.
Click the icon to expand a view with the information for each individual annotation.
||Group annotations by text. It is very common that in the same text, the same entity is repeated multiple times. Sometimes it is better to understand that only two unique entities have been identified in this text, e.g. gene
||Entities are not grouped. They will appear one by one, in the same order they appear in text. This is very handy if you need to review each single annotation. Soon we will enable hotkeys so you can navigate this menu fast and easily.|
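The grouping modes and the header summary amount to counting the same list of entities in different ways. The sketch below shows the idea with a hypothetical `Entity` record; the field names and the sample ids are assumptions made for illustration.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class Entity(NamedTuple):
    text: str
    entity_type: str
    normalization: Optional[str] = None  # None means "not normalized"


def group_by_text(entities: List[Entity]) -> Counter:
    """Count how many times each annotated text appears."""
    return Counter(e.text for e in entities)


def group_by_normalization(entities: List[Entity]) -> Counter:
    """Count annotations per normalization id; the None key holds the rest."""
    return Counter(e.normalization for e in entities)


def summary(entities: List[Entity]) -> Tuple[int, int]:
    """Return (entities annotated, entities not normalized)."""
    return len(entities), sum(1 for e in entities if e.normalization is None)


if __name__ == "__main__":
    doc = [
        Entity("HER2", "gene", "id1"),
        Entity("ERBB2", "gene", "id1"),
        Entity("HER2", "gene", "id1"),
        Entity("BRCA2", "gene"),
    ]
    print(group_by_text(doc))
    print(group_by_normalization(doc))
    print(summary(doc))
```

Grouping by normalization merges different spellings of the same concept, while grouping by text only merges identical strings.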
The relation tally keeps count of the relations defined in the current document. In this section you can also remove existing relations by clicking the corresponding button.
This tally only appears if you have relation types defined at Settings > Relations.