NLP Foundational Studies & Ontologies for Syndromic Surveillance from ED Reports (2007-2010)

Much of the clinical information required for accurate clinical research, active decision support, and broad-coverage surveillance is locked in free text within the electronic medical record (EMR). The only feasible way to leverage this information for translational science is to extract and encode it using natural language processing (NLP). Over the last two decades, several research groups have developed NLP tools for clinical notes, but a major bottleneck to progress in clinical NLP is the lack of standard, annotated data sets for training and evaluating NLP applications. Without such standards, individual NLP applications proliferate, but there is no way to train different algorithms on the same annotations, share and integrate NLP modules, or compare performance. We propose to develop standards and infrastructure that enable technology to extract scientific information from textual medical records, and we propose to conduct the research as a collaborative effort involving NLP experts across the U.S.

To accomplish this goal, we will address three specific aims:

1. Extend existing standards and develop new consensus standards for annotating clinical text in a way that is interoperable, extensible, and usable.
2. Apply existing methods and tools, and develop new ones where necessary, to manually annotate a set of publicly available clinical texts efficiently and accurately.
3. Develop a publicly available toolkit for automatically annotating clinical text, and assess the toolkit in a shared evaluation using metrics that are multidimensional and flexible.
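
As a rough illustration of what a flexible evaluation metric for the third aim might look like, the sketch below scores automatically produced annotations against a manually annotated reference set under two matching criteria: exact span match and overlapping span match. The `Span` type, the label names, and the `score` function are hypothetical and are not part of the project's toolkit; precision, recall, and F1 are the standard definitions.

```python
from typing import Callable, Dict, List, NamedTuple


class Span(NamedTuple):
    """A labeled text span (hypothetical representation; end is exclusive)."""
    start: int
    end: int
    label: str


def exact_match(gold: Span, pred: Span) -> bool:
    # Strict criterion: same offsets and same label.
    return gold == pred


def overlap_match(gold: Span, pred: Span) -> bool:
    # Lenient criterion: same label and at least one shared character.
    return (gold.label == pred.label
            and gold.start < pred.end
            and pred.start < gold.end)


def score(gold: List[Span], pred: List[Span],
          match: Callable[[Span, Span], bool] = exact_match) -> Dict[str, float]:
    """Compute precision, recall, and F1 under the given matching criterion."""
    # A predicted span is correct if it matches any reference span.
    correct_pred = sum(1 for p in pred if any(match(g, p) for g in gold))
    # A reference span is found if any predicted span matches it.
    found_gold = sum(1 for g in gold if any(match(g, p) for p in pred))
    precision = correct_pred / len(pred) if pred else 0.0
    recall = found_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Passing a different `match` function changes the strictness of the evaluation without changing the scoring code, which is one simple way to report results along more than one dimension.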

Selected Publications, Papers, and Presentations

  • Harkema H, Thornblade T, Dowling J, Chapman WW. Portability of ConText: An Algorithm for determining Negation, Experiencer, and Temporal Status from Clinical Reports. J Biomed Inform. 2009 Oct;42(5):839-51. PMCID: PMC2757457 NIHMSID: NIHMS117020
  • Chapman WW, Dowling JN, Scholer M, et al. Developing syndrome definitions based on consensus and current use. J Am Med Inform Assoc 2010;17:595-601. PMID: 20819870 [PubMed - indexed for MEDLINE] PMCID: PMC2995670
  • Wilson RA, Chapman WW, DeFries SJ, Becich MJ, Chapman BE. Identifying history of ancillary cancers in mesothelioma patients from free-text clinical reports. J Pathology Inform. 2010;1:1:24. PMID: 21031012 PMCID: PMC2956176
  • Chapman BE, Lee S, Kang HP, Chapman WW. Document-Level Classification of CT Pulmonary Angiography Reports based on an Extension of the ConText Algorithm. PMID: 21459155
  • Chapman WW, Saul M, Houston J, Irwin J, Mowery D, Harkema H, Becich M. Creation of a repository of automatically de-identified clinical reports: processes, people, and permission. American Medical Informatics Association on Clinical Research Informatics. 2010
  • Mowery DL, Harkema H, Chapman B, Hwa R, Wiebe J, Chapman WW. An automated SOAP classifier for emergency department reports. Annu AMIA Symp. San Francisco, CA. 2010.
  • Jordan PW, Mowery DL, Wiebe J, Chapman WW. Annotating conditions in clinical narratives to support temporal classification. Annu AMIA Symp. San Francisco, CA. 2010.
  • Mowery DL, Jordan PW, Wiebe JM, Liu L, Chapman WW. Does domain knowledge matter for assertion annotation in clinical text? International Conference on Healthcare Informatics, Imaging and Systems Biology. San Diego, CA. Sept 2012.
  • Mowery DL, Wiebe J, Visweswaran SH, Harkema H, Chapman WW. Building an automated SOAP classifier for emergency department reports. J Biomed Inform. 2012;45:71-81.
  • Conway M, Chapman W. Discovering Lexical Instantiations of Clinical Concepts using Web Services, WordNet, and Corpus Resources. Annu AMIA Symp. 2012. 1604
  • Henriksson A, Conway M, Duneld M, Chapman WW. Identifying Synonymy between SNOMED Clinical Terms of Varying Length Using Distributional Analysis of Electronic Health Records. AMIA Annu Symp Proc. 2013; 2013: 600–609.
  • Conway M, Dowling J, Chapman W. Using Chief Complaints for Syndromic Surveillance: A Review of Chief Complaint Classifiers in North America. J Biomed Inform. 2013 Aug;46(4):734-43
  • Conway M, Dowling J, Chapman W. Developing an Application Ontology for Mining Free Text Clinical Reports: The Extended Syndromic Surveillance Ontology. In Third International Workshop on Health Document Text Mining and Information Analysis (LOUHI 2011). 2011, pp. 75–82
  • Conway M, Dowling J, Chapman W. Developing a Biosurveillance Application Ontology for Influenza-Like-Illness. In Proceedings of the 6th Workshop on Ontologies and Lexical Resources. 2010, pp. 58–66.
  • Chen A, Chapman W, Conway M, Chapman B. A Web Based Platform to Support Text Mining of Clinical Reports. In International Society for Disease Surveillance Annual Conference. 2011, p. 27.
  • Wang L, Zhang M, Conway M, Haug P, Chapman W. Using cKASS to Facilitate Knowledge Authoring and Sharing for Syndromic Surveillance. In International Society for Disease Surveillance Annual Conference. 2011, p. 158.
  • Conway M, Dowling J, Chapman W. Evaluating Syndrome Definitions in the Extended Syndromic Surveillance Ontology. In International Society for Disease Surveillance Annual Conference. 2011, p. 32.
  • Conway M, Dowling J, Tsui R, Chapman W. Developing an Application Ontology for Mining Clinical Reports: The Extended Syndromic Surveillance Ontology. In International Society for Disease Surveillance Annual Conference. 2010, p. 15.
  • Chapman W, Christensen L, Dowling J, Lee Q, Conway M, Harkema H, Tsui R. Challenges in Adapting an NLP System for Real-time Surveillance. In International Society for Disease Surveillance Annual Conference. 2010, p. 7.
  • Conway M, Okhmatovskia A, Buckeridge D, Chapman W. Using Consensus Syndrome Definitions to Classify Chief Complaints: The Onto-Classifier Web Application. In American Medical Informatics Association Symposium. 2010, p. 1012.
  • Mowery D, Harkema H, Dowling J, Lustgarten J, Chapman WW. Distinguishing Historical from Current Problems in Clinical Reports – Which Textual Features Help? In: BioNLP Workshop of the Association for Computational Linguistics, Boulder, CO; 2009. pp. 10-18.
  • Irwin JY, Harkema H, Christensen L, Haug PJ, Chapman WW. Methodology to Develop and Evaluate a Semantic Representation for NLP. Proc AMIA Symp. J Am Med Inform Assoc Suppl 16. 2009:271-5. PMID: 20351863 [PubMed - indexed for MEDLINE] PMCID: PMC2815383
  • Harkema H, Chapman WW, Saul M, Dellon ES, Schoen RE, Mehrotra A. Developing a natural language processing application for measuring the quality of colonoscopy procedures. J Am Med Inform Assoc. 2011 Dec;18 Suppl 1:150-6.

PI: Wendy Chapman