Projects

A project is a collection of documents and rules to annotate documents manually or automatically.

Creating a project

Once you have signed up and you have a user account, you are ready to create a new project.

1. Choose a name for your project.

2. Choose one or more pre-trained machine learning models (we call them 'machines'). Machines are split into categories (e.g. Biomedical). Each model extracts specific information from text.

Project settings

If you selected some machines when you created your project, you probably want to start importing text to tagtog. Otherwise, you need to configure the project's settings to annotate either manually or automatically.

Guidelines

You can write annotation guidelines for yourself or your team. They should define what to annotate manually and how. The clearer they are, the better the annotations and the training data you can generate.

Click Edit to switch to editing mode. Click View to switch to preview mode and see the results of your changes. Once the guidelines are ready, save them.

Only users with the admin role can edit the guidelines.

Annotatable sections

Here you select which sections you want to manually or automatically annotate in scientific papers. The available sections are: Title, Abstract, Introduction & Background, Materials & Methods, Results, Conclusion & Discussion, Other. Sections that are not selected are grayed out in the editor, and manual annotation is disabled for them.

You can also select how to annotate Figures & Tables: always, never or section-dependent.

To disable manual annotation, uncheck each section. Users will still be able to read the text in the editor as usual, but manual annotation won't be possible.

Entities

Here you define the types of information you want to identify in text, manually or automatically. You do this by defining Entity Types (a.k.a. Entity Classes).

You define entity types from the user interface

You must add one or more entity types. Each entity type is defined by a name, a description (optional) and a color. For example, in the project in the picture above we want to extract vehicle information, so we have created entity types to annotate vehicle parts (vehiclePart), vehicle types (vehicleType) and vehicle models (vehicleModel). To easily identify the entities in the text, we assign a color to each entity type.

Dictionaries

As soon as you create an entity type, it appears in the Dictionaries panel. Each entity type can have one or more associated dictionaries (Dictionary format). From here you can upload, replace or download dictionaries.

Once you have created a dictionary, you can upload/replace or download the dictionary file

As an example of a dictionary, let's use the entity type vehicleModel. Volkswagen Golf 7, Golf Mk7 and Golf VII all identify the same canonical or unique object. This object can be identified with an ID, e.g. VWGOLF7. Let's create our entry in the dictionary:

VWGOLF7 Volkswagen Golf 7 Golf Mk7 Golf VII
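A dictionary entry like the one above could be generated programmatically. The sketch below is only an illustration: it assumes that the fields of an entry (the ID followed by its terms) are separated by tabs, which the extracted text here does not show, and the helper name dictionary_entry is hypothetical. Check the Dictionary format page for the exact file layout.

```python
def dictionary_entry(entry_id, terms, sep="\t"):
    """Build one dictionary line: the ID followed by every term that maps to it.

    Assumption: fields are separated by `sep` (a tab by default).
    """
    cleaned = [t.strip() for t in terms if t.strip()]
    if not cleaned:
        raise ValueError("an entry needs at least one term")
    return sep.join([entry_id.strip(), *cleaned])


# The three surface forms from the example all map to the same ID.
line = dictionary_entry("VWGOLF7", ["Volkswagen Golf 7", "Golf Mk7", "Golf VII"])
print(line)
```

Writing one such line per canonical object keeps all the synonyms of that object under a single ID.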

When you create a dictionary, you don't need to cover variations such as plurals, tenses, etc. tagtog takes the dictionary entries and applies grammar rules to identify potential entities, handling such variations for you.

To upload a dictionary, you first need to create one. Click on New Dictionary under the entity type name and Save it. Two options will show up: Download Dictionary and Upload Dictionary.

Upload/Replace: use this option to upload a dictionary file. If a file was uploaded previously, it will be replaced with the new dictionary. Once you have uploaded a dictionary, all newly imported text is annotated automatically following the dictionary rules.

Download: use this option to download the dictionary being used in tagtog to your computer. This is very useful for making large edits and then replacing the existing dictionary.

Dictionaries are automatically updated if a user adds new normalizations using the web editor. More information.

Relations

You can annotate relations in text. To do so, first create a new Relation Type by clicking the New Relation Type button. Then choose the two Entity Types you want to relate. Optionally, you can add a description for the relation. For example, you could create a relation type between vehicle parts and vehicle models.

Currently you cannot extract relations from text automatically. However, as a workaround, you can extract the entities automatically and infer a relation based on their distance in the text.
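The distance-based workaround could look roughly like the sketch below. It is not a tagtog feature: the entity dictionaries (type, text, start offset), the helper names and the max_distance threshold are all assumptions made for illustration, and you would adapt them to however you load the extracted entities.

```python
from itertools import product


def mention(text, surface, entity_type):
    """Hypothetical helper: describe the first occurrence of `surface` in `text`."""
    return {"type": entity_type, "text": surface, "start": text.index(surface)}


def infer_relations(entities, type_a, type_b, max_distance=20):
    """Pair each type_a entity with every type_b entity at most max_distance characters away."""
    firsts = [e for e in entities if e["type"] == type_a]
    seconds = [e for e in entities if e["type"] == type_b]
    relations = []
    for a, b in product(firsts, seconds):
        # Gap between the end of the earlier mention and the start of the later one.
        earlier, later = sorted((a, b), key=lambda e: e["start"])
        gap = later["start"] - (earlier["start"] + len(earlier["text"]))
        if gap <= max_distance:
            relations.append((a["text"], b["text"]))
    return relations


text = ("The brake pads on the Golf VII were replaced. "
        "Much later in the report, the Polo is mentioned.")
entities = [
    mention(text, "brake pads", "vehiclePart"),
    mention(text, "Golf VII", "vehicleModel"),
    mention(text, "Polo", "vehicleModel"),
]
relations = infer_relations(entities, "vehiclePart", "vehicleModel")
print(relations)
```

A character-distance threshold is the simplest possible heuristic. Restricting pairs to the same sentence is a common refinement.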

Document labels

Document labels are used to mark the documents or text you import. They are another type of annotation, one that applies to the whole document.

To create a new Document Label, click on the New Document Label button. Then enter a name for the label (required), a type (required) and a description (optional).

You can create different types of Document Labels:

Type Description
boolean The simplest label. You mark the document as True or False for a specific condition. e.g. should this customer request go to the technical department? Yes or No.
string One or more words describing a document. This is particularly handy when you don't have a specific list of options or, if you do, the list might change often. e.g. which disease is related to this clinical profile?
enum A list of options that can describe a document. The options should be written in the description of the label, separated by commas. e.g. the severity of an error report could be: low, medium, high or critical.
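As a rough illustration of the comma-separated convention for enum labels, the sketch below splits a description into its options and checks a value against them. The function names are hypothetical and this is not tagtog's own parser. It only shows how such a description could be read on your side.

```python
def parse_enum_options(description):
    """Split a comma-separated label description into a clean list of options."""
    return [option.strip() for option in description.split(",") if option.strip()]


def is_valid_enum_value(value, description):
    """Return True if `value` is one of the options listed in the description."""
    return value.strip() in parse_enum_options(description)


options = parse_enum_options("low, medium, high, critical")
print(options)
print(is_valid_enum_value("high", "low, medium, high, critical"))
```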

Setting a document label with the enum type

Once saved, you can start using them on the web editor.

Soon you will be able to generate document labels automatically within tagtog. For now, as a workaround, you can infer the document label from the entities extracted automatically.
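One way to apply this workaround is to derive a boolean label from the presence of a given entity type. The sketch below assumes the extracted entities are available as a list of dictionaries with a type key. That structure, the function name and the min_mentions threshold are illustrative assumptions, not part of tagtog.

```python
def infer_boolean_label(entities, trigger_type, min_mentions=1):
    """Return True when at least `min_mentions` entities of `trigger_type` were extracted."""
    count = sum(1 for entity in entities if entity["type"] == trigger_type)
    return count >= min_mentions


# Hypothetical entities extracted automatically from one document.
extracted = [
    {"type": "vehicleModel", "text": "Golf VII"},
    {"type": "vehiclePart", "text": "brake pads"},
    {"type": "vehiclePart", "text": "clutch"},
]
mentions_parts = infer_boolean_label(extracted, "vehiclePart", min_mentions=2)
print(mentions_parts)
```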

Entity labels

Entity labels are used to add attributes to existing entities. You can add one or more entity labels to a specific entity.

To create a new Entity Label, click on the New Entity Label button. Then enter a name for the label (required), a type (required) and a description (optional).

You can create different types of Entity Labels:

Type Description
boolean The simplest label. You mark an entity as True or False for a specific condition. e.g. if you are dealing with financial reports, you can annotate organization names and add an attribute Bankruptcy with value True to those organizations going bankrupt. You can later train a model that identifies companies that went bankrupt in text.
string One or more words describing an entity. This is particularly handy when you don't have a specific list of options or, if you do, the list might change often. e.g. you can add comments to entities.
enum A list of options to describe an entity. The options should be written in the description of the label, separated by commas. e.g. if you are processing CVs, you could add an entity label to the entities identifying skills. This enum entity label, skill type, has the values soft skill and hard skill.

Setting an entity label with the enum type

Once saved, you can start using them on the web editor.

Soon you will be able to generate entity labels automatically within tagtog.

Webhooks

Webhooks are useful for integrating tagtog with your system. You can define webhooks to automatically notify an external system after a specific event occurs in tagtog, whether it is triggered from the GUI or the API.

These events are:

Event Description Source
Import new document A notification is sent when the user uploads a document. GUI and API
Save document A notification is sent when the user saves a document. GUI and API (update annotations via API)

When any of these events is triggered, we send an HTTP POST payload to the webhook's configured End Point URL.

We also send information in the delivery HTTP headers for you to better process the event:

Header Description
X-tagtog-onPushSave-source Source of the event. Possible values: GUI, API
X-tagtog-onPushSave-status Type of event. Possible values: created, updated

This is the required information to configure a webhook:

Field Description
End Point URL pointing to the external system
Format Format of the payload to be sent to the End Point. Currently you can select:
  • ann.json (docs). application/json
  • tagtogID (simple json object like: {"owner": "...", "project": "...", "tagtogID": "...document id related to the event..."}). application/json
Only GUI trigger Check this if you want only GUI changes to trigger the webhook. Leave it unchecked if you want both API and GUI changes to trigger the webhook.
Authentication Currently you can select:
  • None: no authentication
  • Basic: Basic access authentication
  • NTLMv2 (Windows): NT LAN Manager v2
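On the receiving side, a webhook end point only needs to accept the HTTP POST, read the two delivery headers and parse the JSON body. The following is a minimal sketch using Python's standard library, assuming the tagtogID payload format and no authentication. The names parse_event, WebhookHandler and serve, as well as the port, are illustrative choices and not part of tagtog.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_event(headers, body):
    """Combine the delivery headers and the tagtogID payload into one dict."""
    payload = json.loads(body)
    return {
        "source": headers.get("X-tagtog-onPushSave-source"),  # GUI or API
        "status": headers.get("X-tagtog-onPushSave-status"),  # created or updated
        "owner": payload.get("owner"),
        "project": payload.get("project"),
        "tagtogID": payload.get("tagtogID"),
    }


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        event = parse_event(self.headers, self.rfile.read(length))
        print("received event:", event)
        # Acknowledge the delivery without returning a body.
        self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Start the receiver; call this from your own entry point."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver. The webhook's End Point would then point at the public URL of that machine. Returning quickly and doing any heavy processing elsewhere is usually a sensible design for webhook receivers.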

Annotations

Pre-annotations

These are the annotations created automatically while you manually annotate a document. For example, if you annotate the gene BRA2, all the mentions of BRA2 in the text will be automatically annotated (more information on pre-annotations).

In this panel you can choose the default settings for pre-annotations: pre-selections, pre-deselections and their case sensitivity. You can always change these settings on the web editor for specific documents.
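To make the effect of the case sensitivity setting concrete, the sketch below finds every mention of an annotated term in a text, with and without case sensitivity. It is a simplified illustration of the idea, not tagtog's actual pre-annotation logic, and the function name find_mentions is hypothetical.

```python
import re


def find_mentions(text, term, case_sensitive=True):
    """Return the start offset of every occurrence of `term` in `text`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape makes sure the term is matched literally, not as a pattern.
    return [match.start() for match in re.finditer(re.escape(term), text, flags)]


text = "BRA2 is expressed early. Loss of bra2 delays growth, while BRA2 overexpression does not."
strict = find_mentions(text, "BRA2")
relaxed = find_mentions(text, "BRA2", case_sensitive=False)
print(len(strict), len(relaxed))
```

With case sensitivity on, only the two uppercase mentions are found. With it off, the lowercase bra2 is matched as well.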

Machine Learning

Each time you press the Confirm button in the annotation editor, a machine learning model is trained in the background on all the confirmed documents in the project. The next time you upload a new document, this model can predict annotations for it. You can remove or add annotations to continue training the model and get more accurate results.

If activated, machine learning starts annotating automatically from the first confirmed document. No deployments or complex configurations are required: just by annotating, you can train and use a machine learning model.

If you don't want to use machine learning, deactivate this option.

More information on how Machine Learning works in tagtog.

PDF

Check this option to annotate directly over the native PDF document. This web interface only works with PDF files. If this option is unchecked, only the plain text stripped from the PDF is annotatable.

Note that if a PDF file is imported with this option unchecked, only the plain text version will be available for annotation. If you import the file with this option checked, both versions will be available: native PDF annotation and, whenever the option is unchecked, plain text annotation.

Find more information about the PDF annotation tool here.

Members

In this panel you can invite other users to your project, so they can collaborate in the annotation tasks. For more info about roles and collaborative annotation, go here.

Invite other users to your project

To add a new member, enter their tagtog username in the text box and click on Add Member. Once added, they will receive an email notification.

Admin

Remove a project

To remove a project, go to its Settings > Admin and click on the Delete Project button. Please note that removing a project also removes all the documents within it.

Export settings

Export the project's settings (entity types, relation types, entity labels, document labels, etc.) as a JSON file to reuse as a template on new projects.

Import settings

Import another project's settings. This will overwrite your current settings and remove all your project's documents, so it should only be applied to new projects.