University of Pittsburgh

Dissertation Defense: Interactive Natural Language Processing for Clinical Text

PhD Candidate
Date: Monday, April 29, 2019, 3:00pm to 5:00pm

Clinicians use free text to conveniently capture rich information about patients. Care providers are likely to continue using narratives and first-person stories in Electronic Medical Records (EMRs) because of their convenience and utility, which complicates information extraction for computation and analysis. Despite advances in Natural Language Processing (NLP) techniques, building models is often expensive and time-consuming. Current approaches require a long collaboration between clinicians and data scientists: clinicians provide annotations and training data, while data scientists build the models. Under these approaches, the domain experts (clinicians and clinical researchers) have no means to inspect the models and give feedback. This limits the power and utility of real-world applications and forms a barrier to NLP adoption in the clinical domain.

Figure 1. Interactive Natural Language Processing allows domain experts without machine learning experience to build models on their own, and also reduces or eliminates the need to collect prior annotations and training data.

Building models interactively can help narrow the gap between clinicians and data scientists (Figure 1). Interactive learning systems may allow clinicians without machine learning experience to build NLP models on their own, and may also reduce the need to collect annotations upfront. These systems make it feasible to extract understanding from unstructured text in patient records, for example by classifying documents against clinical concepts, summarizing records, and performing other sophisticated NLP tasks. Within an interactive feedback loop, end-users review model outputs and make corrections, and each round of corrections is used to build a revised model.

Interactive methods are particularly attractive for clinical text because of the diversity of tasks that need customized training data. In my dissertation, I demonstrate this approach by building and evaluating prototype systems for both clinical care and research applications. I built NLPReViz as an interactive tool that lets clinicians train and build binary NLP models on their own for retrospective review of colonoscopy procedure notes. Next, I extended this effort to design an intelligent tool that identifies incidental findings in radiology notes as clinicians review patient notes during their regular workflow. I follow a two-step evaluation with clinicians as study participants: a usability evaluation to demonstrate the feasibility and overall usefulness of the tool, followed by an empirical evaluation to assess model correctness and utility. Lessons learned from the development and evaluation of these prototypes will provide insight into the generalized design of interactive NLP systems for wider clinical applications.
