2013 ShARe/CLEF Evaluation Lab Task 1 Script

This is the evaluation script for the 2013 ShARe/CLEF eHealth Task 1 dataset. The dataset consists of de-identified clinical free-text notes authored in the ICU setting, including discharge summaries, ECG reports, echocardiogram reports, and radiology reports, drawn from the MIMIC II database, version 2.5.
It comprises 200 training notes and 100 test notes. The Task 1 dataset contains disease/disorder mention annotations generated by two medical coders. To access the dataset, please follow the instructions on the ShARe website for setting up a PhysioNet account using the link below.

2013 ShARe/CLEF Evaluation Lab Task 2 Script

This is the evaluation script for the 2013 ShARe/CLEF eHealth Task 2 dataset. The dataset consists of de-identified clinical free-text notes authored in the ICU setting, including discharge summaries, ECG reports, echocardiogram reports, and radiology reports, drawn from the MIMIC II database, version 2.5.
It comprises 200 training notes and 100 test notes. The Task 2 dataset contains acronym/abbreviation mention annotations generated by nursing professionals, NLP researchers, and biomedical informaticians. To access the dataset, please follow the instructions on the ShARe website for setting up a PhysioNet account using the link below.

Annotation Admin

Annotation Admin is an annotation project management tool developed by the Utah VA and extended for the NLP ecosystem environment. It supports annotation project management by communicating with an annotation tool client, the Extensible Human Oracle Suite of Tools (eHOST). Annotation Admin:

  1. records and manages annotator user accounts,
  2. records annotation schema properties,
  3. stores text reports and annotation data,
  4. enables task-specific annotation dependencies, e.g., provide specific pre-annotations to annotators,
  5. supports annotation task assignment to users,
  6. facilitates annotation resource and task distribution to an annotator's annotation desktop client,
  7. supports data aggregation of annotation resources once data is annotated by the annotator, and
  8. tracks annotator progress.

Annotation Registry

Annotation Registry is an archive tool developed to facilitate annotator advertisement and recruitment for annotation tasks. Annotation Registry allows annotators to anonymously advertise their availability for an annotation task using an annotation profile that captures:

  1. general contact information,
  2. annotation experience,
  3. related training including human subjects certifications,
  4. educational background, and
  5. related annotation experience.

The registry enables NLP researchers to search for skilled annotators that qualify for their particular annotation study.

Clinical NLP Evaluation Workbench

The NLP Evaluation Workbench (Workbench) was developed so that an end user can compare and adjudicate the accuracy of clinical document annotations produced by two different NLP transformation pipelines. The pipeline outputs may originate from reference-standard human annotation systems (e.g., Knowtator) or from automated NLP annotation systems (e.g., Topaz).
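The comparison described above can be illustrated with a minimal sketch. It is not the Workbench's own code: the `Annotation` type, the `compare` function, and the exact-match criterion (same span and same label) are assumptions made for illustration. One pipeline is treated as the reference and the other as the candidate.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical annotation record: character span plus a concept label."""
    start: int
    end: int
    label: str


def compare(reference, candidate):
    """Compare two annotation sets using exact span-and-label matching."""
    ref, cand = set(reference), set(candidate)
    matched = ref & cand
    precision = len(matched) / len(cand) if cand else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "matched": sorted(matched),
        # Disagreements are what a human adjudicator would review.
        "reference_only": sorted(ref - cand),
        "candidate_only": sorted(cand - ref),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


reference = [Annotation(0, 5, "fever"), Annotation(10, 15, "cough")]
candidate = [Annotation(0, 5, "fever"), Annotation(20, 24, "rash")]
report = compare(reference, candidate)
```

The `reference_only` and `candidate_only` lists are the items a reviewer would adjudicate; a relaxed (overlapping-span) match would need a different matching function.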

ConText

ConText is based on a negation algorithm called NegEx. ConText's input is a sentence with indexed clinical conditions. ConText's output for each indexed condition is the value for contextual features or modifiers. The initial version of ConText determines values for three modifiers:

  1. Negation: affirmed or negated
  2. Temporality: recent, historical, or hypothetical
  3. Experiencer: patient or other
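The three modifiers above can be pictured with a small sketch. This is not the ConText implementation: the trigger patterns are a tiny illustrative subset, the names `ContextValues` and `annotate` are hypothetical, and the scope rule (any trigger appearing before the condition in the sentence applies to it) is a deliberate simplification.

```python
import re
from dataclasses import dataclass


@dataclass
class ContextValues:
    """Modifier values for one indexed condition, with assumed defaults."""
    negation: str = "affirmed"
    temporality: str = "recent"
    experiencer: str = "patient"


# Illustrative trigger subset: (pattern, modifier name, value it assigns).
TRIGGERS = [
    (re.compile(r"\b(no|denies)\b", re.IGNORECASE), "negation", "negated"),
    (re.compile(r"\bhistory of\b", re.IGNORECASE), "temporality", "historical"),
    (re.compile(r"\bif\b", re.IGNORECASE), "temporality", "hypothetical"),
    (re.compile(r"\b(mother|father)\b", re.IGNORECASE), "experiencer", "other"),
]


def annotate(sentence, condition):
    """Assign modifier values to a condition found in the sentence."""
    index = sentence.lower().find(condition.lower())
    if index < 0:
        raise ValueError(f"condition {condition!r} not found in sentence")
    preceding = sentence[:index]  # simplification: only look left of the condition
    values = ContextValues()
    for pattern, modifier, value in TRIGGERS:
        if pattern.search(preceding):
            setattr(values, modifier, value)
    return values


family = annotate("Mother has a history of breast cancer.", "breast cancer")
negated = annotate("No fever.", "fever")
```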

Knowledge Author

Knowledge Author is a tool that lets users collaboratively develop the domain content an NLP application needs to produce valid results, and that recommends additional domain content. It consists of three distinct applications:

  1. Knowledge Builder for developing domain-specific lexicons. The tool will allow the user to input and organize domain knowledge in a format accessible by NLP tools. To supplement the user's knowledge input, the application will also provide recommendations to potentially enrich the knowledge base such as lexical variants and information from external databases.
  2. Schema Builder for assisting non-NLP experts in developing domain-specific NLP schemas. The user will select which classes of clinical elements they want to extract from the clinical reports (e.g., medications or social risk factors) along with each class's corresponding properties (i.e., attributes such as severity) and restrictions (i.e., allowable relationships such as treats).
  3. Phenotype Builder for creating rules between items within a domain-specific lexicon. Complex logical relationships between concepts are often required. For example, a diagnosis of pneumonia requires a set of criteria the patient must have, such as fever and elevated respiratory rate, as well as a set of criteria the patient must not have, such as the presence of various other diagnoses. Phenotype Builder allows the user to create these relationships in their knowledge bases.
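The kind of rule Phenotype Builder is described as capturing, with criteria that must be present and criteria that must be absent, can be sketched as follows. The `PhenotypeRule` class is hypothetical, not the tool's data model, and the excluded criterion is a named placeholder because the text above does not list the actual exclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeRule:
    """Hypothetical rule: all required concepts present, no excluded ones."""
    name: str
    required: frozenset
    excluded: frozenset = frozenset()

    def matches(self, findings):
        found = set(findings)
        has_all_required = self.required <= found
        has_any_excluded = bool(self.excluded & found)
        return has_all_required and not has_any_excluded


pneumonia = PhenotypeRule(
    name="pneumonia",
    required=frozenset({"fever", "elevated respiratory rate"}),
    # Placeholder only: stands in for the unspecified excluding diagnoses.
    excluded=frozenset({"excluding diagnosis (placeholder)"}),
)

meets_rule = pneumonia.matches({"fever", "elevated respiratory rate", "cough"})
blocked = pneumonia.matches(
    {"fever", "elevated respiratory rate", "excluding diagnosis (placeholder)"}
)
```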

NegEx

NegEx locates trigger terms indicating a clinical condition is negated or possible and determines which text falls within the scope of the trigger terms. It can produce two types of output:

  1. Value of indexed conditions: If you specify the conditions whose negation status you want to know, NegEx returns negated or possible for those conditions that fall within the scope of a trigger term (no value is returned for a condition that is considered present).
  2. Text within the scope of a trigger term: This is a more general output that does not require conditions of interest to be specified in advance.
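Both output types can be illustrated with a simplified sketch. It is not the NegEx implementation: the trigger and termination terms are a small illustrative subset, the function names are hypothetical, and scope is assumed to run from a trigger to the next trigger, termination term, or end of sentence.

```python
import re

# Illustrative subsets only; real trigger lists are much larger.
TRIGGERS = {"no": "negated", "denies": "negated", "without": "negated",
            "possible": "possible"}
TERMINATORS = ("but", "however")


def _alternation(terms):
    return r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b"


_TRIGGER_RE = re.compile(_alternation(TRIGGERS), re.IGNORECASE)
_BOUNDARY_RE = re.compile(_alternation(list(TRIGGERS) + list(TERMINATORS)),
                          re.IGNORECASE)


def trigger_scopes(sentence):
    """Output type 2: (status, text) for the scope of each trigger term."""
    scopes = []
    for match in _TRIGGER_RE.finditer(sentence):
        boundary = _BOUNDARY_RE.search(sentence, match.end())
        end = boundary.start() if boundary else len(sentence)
        text = sentence[match.end():end].strip(" ,.;")
        scopes.append((TRIGGERS[match.group(1).lower()], text))
    return scopes


def condition_status(sentence, conditions):
    """Output type 1: status for indexed conditions inside a trigger scope."""
    result = {}
    for status, text in trigger_scopes(sentence):
        for condition in conditions:
            if condition.lower() in text.lower():
                result.setdefault(condition, status)
    return result


sentence = "Patient denies chest pain but reports fever."
scopes = trigger_scopes(sentence)
statuses = condition_status(sentence, ["chest pain", "fever"])
```

In this example "fever" gets no value because it lies outside the scope of "denies", which mirrors the behaviour described for conditions considered present.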

Onyx

Onyx is the latest in a line of NLP systems developed by Peter Haug and his students at the University of Utah (SPRUS, SymText, and M+). Onyx integrates syntax and semantics to extract and encode clinical information from text. Onyx is currently being applied to automatically charting dental conditions from recorded dental exams but will be expanded to other document types and domains within clinical medicine. Onyx is written in Java and is available upon request. We plan to make Onyx available as open source when we release version 1.0.

pyConText

The pyConText algorithm is an extension of the original ConText algorithm. Specifically, the pyConText algorithm differs from the ConText algorithm in a number of ways:

1) This newer version (pyConText) is more extensible and can have user-defined modifiers. To provide new modifiers, you simply supply it with regular expressions for tagging the associated trigger terms, plus termination terms for refining scope.

For example, one project involving radiology reports added the following modifiers, among others: Uncertainty (certain or uncertain); Quality of radiologic exam (limited or not limited); Severity (critical or non-critical); and Sidedness (right or left).

2) The user can provide regular expressions to encode events and entities including their literal normalized concept and associated synonyms.

3) The user can provide rules to pyConText to support document-level assertions derived from encoded events and their modifiers, e.g., IF Finding: stenosis AND Severity: critical AND Anatomical location: internal carotid artery THEN flag Document for review of SIGNIFICANT CAROTID STENOSIS.
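The three points above can be tied together in a rough sketch written in plain Python. It does not use the pyConText API; the dictionaries, function names, and the synonyms "severe" and "ICA" are hypothetical, and modifiers are applied sentence-wide rather than within a scope bounded by termination terms.

```python
import re

# Point 2: events/entities as regular expressions keyed by normalized concept.
TARGETS = {
    "stenosis": re.compile(r"\bstenos[ie]s\b", re.IGNORECASE),
}

# Point 1: user-defined modifiers as (name, value) -> trigger pattern.
MODIFIERS = {
    ("severity", "critical"):
        re.compile(r"\b(critical|severe)\b", re.IGNORECASE),
    ("anatomical_location", "internal carotid artery"):
        re.compile(r"\b(internal carotid artery|ICA)\b", re.IGNORECASE),
}


def tag_sentence(sentence):
    """Return one dict per finding, carrying any modifiers seen in the sentence."""
    findings = []
    for concept, pattern in TARGETS.items():
        if pattern.search(sentence):
            finding = {"finding": concept}
            for (name, value), trigger in MODIFIERS.items():
                if trigger.search(sentence):
                    finding[name] = value
            findings.append(finding)
    return findings


def flag_document(sentences):
    """Point 3: document-level rule derived from findings and their modifiers."""
    for sentence in sentences:
        for finding in tag_sentence(sentence):
            if (finding["finding"] == "stenosis"
                    and finding.get("severity") == "critical"
                    and finding.get("anatomical_location")
                    == "internal carotid artery"):
                return "SIGNIFICANT CAROTID STENOSIS"
    return None


flag = flag_document(
    ["There is critical stenosis of the left internal carotid artery."]
)
```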

Schema Ontology Parser

SKOS Editor

The SKOS (Simple Knowledge Organization System) Editor is a web application that allows users with a linguistic background to create SKOS knowledge bases without needing programming skills. It provides a web interface for creating and editing SKOS ontologies, including creating schemas, concepts, object properties, and data properties. The user interface lets the user create, edit, remove, rename, and move concepts within the ontology. The user can create a SKOS ontology from scratch or load an existing ontology into the interface via file upload or URL.

TextVect

TextVect helps you convert clinical text documents into feature vectors that can be used to train classification models. In addition to generating n-grams from the text, TextVect leverages existing state-of-the-art NLP tools to generate syntactic and semantic features that may improve text classification performance. TextVect wraps the NLP tools in a UIMA pipeline so that the user does not have to install the tools. In addition to generating features, TextVect allows the user to select different types of vector representation, performs feature selection to reduce the feature space, and can train a classifier. For example, if you have a set of discharge summaries with gold standard classifications, such as whether the patient had a bleeding event, TextVect could create a dataset with features and labels that you could use to train a model. This model could then predict bleeding for unseen discharge summaries.
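The n-gram step can be illustrated with a short standard-library sketch. It is not TextVect's code: the function names, whitespace tokenization, and count-based representation are assumptions, and the syntactic and semantic features, feature selection, and classifier training mentioned above are left out.

```python
from collections import Counter


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from whitespace tokens."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def vectorize(documents, n_max=2):
    """Build a shared vocabulary and one count vector per document."""
    counts = [Counter(ngrams(doc, n_max)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    vectors = [[count[term] for term in vocabulary] for count in counts]
    return vocabulary, vectors


documents = ["no bleeding", "bleeding noted"]
vocabulary, vectors = vectorize(documents)
```

Each row of `vectors` lines up with `vocabulary`, so pairing the rows with gold standard labels gives a dataset that a classifier could be trained on.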

Topaz

Topaz is a tool that supports information extraction of clinical conditions from narrative reports. In addition to identifying words like "fever" in the text and mapping them to standard vocabularies, Topaz can (a) detect the concept temperature and its numeric value 102 F to determine that the concept fever is present in the document, (b) use section headings to help identify clinical conditions, such as the concept cervical lymphadenopathy from the word "adenopathy" in the "NECK" section of the report, and (c) use if-then rules that apply to single or multiple matched concepts in order to infer more complex concepts, for instance inferring “influenza-like illness” from the description “fever AND sore throat OR cough”. If-then rules can also be used to infer more general conditions from more specific ones, e.g., inferring “diarrhea” based on detection of the concept “bloody diarrhea”. Users define a relevant set of “core concepts”, and Topaz contains embedded ConText rules that identify whether each core concept is present, absent, or uncertain, who experienced the condition (the patient or someone else), and whether the condition occurred recently or in the past. Topaz’s last pipeline module determines the document-level status of each of the core concepts, e.g., whether the patient had a fever based on sentence-level references to the concept “fever”. Topaz returns a set of UIMA annotations representing phrase-level and document-level annotations of core concepts with ConText features (temporality, experiencer, polarity) added.
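The numeric-value detection and if-then inference described above can be sketched as follows. This is not Topaz's code: the function names are hypothetical, the 100.4 F fever threshold is an assumption (the text only says that 102 F indicates fever), and the rule "fever AND sore throat OR cough" is read here as fever AND (sore throat OR cough). Negation and other ConText features are ignored.

```python
import re

FEVER_THRESHOLD_F = 100.4  # assumed cutoff, not taken from Topaz
TERMS = ("fever", "sore throat", "cough", "bloody diarrhea")
TEMPERATURE_RE = re.compile(r"temperature\D*(\d+(?:\.\d+)?)\s*f\b")


def extract_concepts(text):
    """Find literal terms, plus fever implied by a numeric temperature."""
    lowered = text.lower()
    concepts = {term for term in TERMS if term in lowered}
    match = TEMPERATURE_RE.search(lowered)
    if match and float(match.group(1)) >= FEVER_THRESHOLD_F:
        concepts.add("fever")
    return concepts


def apply_rules(concepts):
    """Apply if-then rules to infer more general or more complex concepts."""
    inferred = set(concepts)
    # Specific-to-general inference.
    if "bloody diarrhea" in inferred:
        inferred.add("diarrhea")
    # Multi-concept inference, under the assumed operator grouping.
    if "fever" in inferred and inferred & {"sore throat", "cough"}:
        inferred.add("influenza-like illness")
    return inferred


found = extract_concepts("Temperature 102 F. Reports cough.")
inferred = apply_rules(found)
```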