The 2006 i2b2 NLP Shared Task had two parts. The first was to de-identify discharge summaries. The second was to identify a patient’s smoking status from a discharge summary. Several successful methods for the smoking task are described in the January 2008 issue of JAMIA.
My project was to further evaluate the utility of semantic features in this task, and to determine how well they would perform with a simpler classifier. To generate the semantic features I used Columbia’s MedLEE medical language processor.
The rule-based classifier using MedLEE semantic features performed better than I expected, with an F-measure of 0.83. The BoosTexter classifier trained on MedLEE semantic features was competitive with the top-performing smoking classifier in the Shared Task, with microaveraged precision of 0.90, recall of 0.89, and F-measure of 0.89.
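For readers unfamiliar with microaveraging, here is a minimal sketch of how microaveraged precision, recall, and F-measure can be computed from per-class counts. It is not the evaluation script used in the Shared Task or in my paper. The function name `micro_average`, the class labels, and all of the counts are made up for illustration and do not reproduce the results above.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one class
Counts = Tuple[int, int, int]


def micro_average(per_class: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts across all classes, then compute P, R, and F once."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical counts, for illustration only (not data from the paper).
example_counts = {
    "current smoker": (8, 2, 3),
    "past smoker": (9, 3, 2),
    "non-smoker": (14, 2, 2),
    "unknown": (60, 1, 1),
}

precision, recall, f_measure = micro_average(example_counts)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

Because the counts are pooled before dividing, large classes dominate the microaverage. A macroaverage would instead compute the metrics per class and then average them, giving rare classes equal weight.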
Above is the slide presentation I gave this past Sunday. The full paper is available below.
McCormick PJ, Elhadad N, Stetson PD. Use of Semantic Features to Classify Patient Smoking Status. AMIA 2008 Symposium Proceedings. 2008. PMID 18998969. [PDF]