Knowledge Representation of Clinical Data

The development of appropriate knowledge representations is vital for the large-scale Natural Language Processing (NLP) of clinical data, in particular narrative notes in Electronic Health Records. A key goal of this work is to provide tools, resources and environments that let clinicians develop their own NLP algorithms and address their own research problems without the need for dedicated NLP expertise. To this end, much of our work is focussed on developing easy-to-use NLP tools and environments, with the ultimate goal of taking NLP researchers “out of the loop”.

Our group has extensive experience in developing knowledge representations for clinical data (e.g. ConText and NegEx, tools that identify uncertainty, temporal characteristics and negated clinical conditions). Most of our current research on clinical knowledge representation is conducted under the auspices of the VA Knowledge Author project, with work centered on three areas: tools for creating and editing knowledge bases (e.g. a web-based editor for developing Simple Knowledge Organization System (SKOS) vocabularies); off-the-shelf linguistic knowledge bases that clinicians can apply to data without modification (e.g. NegEx); and tools that facilitate the point-and-click application of NLP algorithms and knowledge bases to real clinical text.