Metrics

Metrics help you track the progress of your project and the health of your data. They are split into the following sections:

Overview

A simple overview of your project's settings and members.

Inter-Annotator Agreement

You can read all about the Inter-Annotator Agreement (IAA) here.

Documents

The following charts are available:

Confirmed VS Not Confirmed documents. It gives you a general idea of the progress in your project. Please note that this chart only shows data from the master version of the annotations.

Annotated VS Not Annotated documents. Annotated documents are those with at least one annotation of any kind (document label, entity, etc.). Not annotated documents are those with no annotations.
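
As a rough illustration of the Annotated VS Not Annotated rule above, the sketch below counts documents with at least one annotation of any kind. The document structure and field names (`labels`, `entities`) are assumptions made for this example, not the tool's actual export format.

```python
# Hypothetical in-memory documents; the field names are assumptions for illustration.
documents = [
    {"id": "doc1", "labels": {"is_relevant": True}, "entities": []},
    {"id": "doc2", "labels": {}, "entities": [{"type": "gene"}]},
    {"id": "doc3", "labels": {}, "entities": []},
]


def is_annotated(doc):
    """A document counts as annotated if it has at least one annotation of any kind."""
    return bool(doc["labels"]) or bool(doc["entities"])


annotated = sum(is_annotated(doc) for doc in documents)
not_annotated = len(documents) - annotated
print(f"Annotated: {annotated}, Not annotated: {not_annotated}")
```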

Entities

The following charts are available:

Entity type distribution. Number of entities across all your documents, by entity type.

Entity type distribution across documents. Number of documents annotated with entities of a specific entity type.

An entity type that is poorly represented, or concentrated in a small sample of documents, might lead to bias or incorrect predictions. Take action to improve the health of your data.
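
The difference between the two entity charts can be sketched as follows. This is a minimal example over made-up data: the mapping from document ids to entity types and both function names are hypothetical, chosen only to show how an entity type can be frequent overall yet concentrated in a single document.

```python
from collections import Counter

# Hypothetical data: entity types found in each document.
docs_entities = {
    "doc1": ["gene", "protein", "protein", "protein"],
    "doc2": ["gene"],
    "doc3": ["gene", "disease"],
}


def entity_counts(docs):
    """Total number of entities per entity type, across all documents."""
    return Counter(t for types in docs.values() for t in types)


def document_counts(docs):
    """Number of documents containing at least one entity of each type."""
    return Counter(t for types in docs.values() for t in set(types))


totals = entity_counts(docs_entities)
spread = document_counts(docs_entities)

for entity_type in totals:
    print(entity_type, "entities:", totals[entity_type], "documents:", spread[entity_type])
```

In this made-up sample, `gene` and `protein` have the same total count, but `protein` appears in only one document, which is the kind of concentration the second chart helps to spot.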

Normalizations

The following charts are available:

A chart per dictionary. It shows the number of documents annotated with specific normalizations (i.e. unique ids).

A normalization concentrated in a small sample of documents is poorly represented in your data, which can eventually lead to bias or incorrect predictions.
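
A chart per dictionary could be computed along these lines. This is only a sketch: the rows, the dictionary names, and the normalization ids are invented for illustration, and each unique id is counted once per document no matter how often it occurs there.

```python
from collections import defaultdict

# Hypothetical rows: (document id, dictionary name, normalization id).
normalizations = [
    ("doc1", "genes", "G:001"),
    ("doc1", "genes", "G:001"),  # repeated in the same document, counted once
    ("doc2", "genes", "G:001"),
    ("doc2", "genes", "G:002"),
    ("doc3", "diseases", "D:010"),
]


def documents_per_normalization(rows):
    """Return {dictionary: {normalization id: number of documents}}."""
    seen = defaultdict(set)
    for doc_id, dictionary, norm_id in rows:
        seen[(dictionary, norm_id)].add(doc_id)

    charts = defaultdict(dict)
    for (dictionary, norm_id), doc_ids in seen.items():
        charts[dictionary][norm_id] = len(doc_ids)
    return dict(charts)


charts = documents_per_normalization(normalizations)
print(charts)
```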

Document labels

The following charts are available:

Document labels distribution across documents. Number of documents with a specific document label set.

A chart per document label. For the `boolean` or `enum` types, this chart represents the distribution of possible values across the documents of your project. For the `string` type, due to its non-finite nature, this chart represents the top values across the documents of your project.

A misrepresentation of a document label or any of its possible values might impact the health of your data. Pay special attention to how the values of your labels are represented: a skewed distribution can lead to bias or incorrect predictions.
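
The per-label charts described above treat finite types and free text differently. The following sketch mirrors that idea under stated assumptions: the label names, the `label_types` mapping, the document structure, and the `top_n` cut-off are all hypothetical and do not reflect the tool's real configuration.

```python
from collections import Counter

# Hypothetical label definitions and documents, for illustration only.
label_types = {"reviewed": "boolean", "priority": "enum", "source": "string"}
documents = [
    {"reviewed": True, "priority": "high", "source": "pubmed"},
    {"reviewed": False, "priority": "low", "source": "pubmed"},
    {"reviewed": True, "priority": "high", "source": "arxiv"},
    {"priority": "high", "source": "internal wiki"},  # "reviewed" not set here
]


def label_chart(label, documents, label_types, top_n=2):
    """Count label values; for `string` labels keep only the top_n values."""
    values = Counter(doc[label] for doc in documents if label in doc)
    if label_types[label] == "string":
        return values.most_common(top_n)
    return values.most_common()


for label in label_types:
    print(label, label_chart(label, documents, label_types))
```

Restricting `string` labels to the top values keeps the chart readable, since free-text labels can take an unbounded number of distinct values.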