Ontology Design for Biomedical Text Mining


Text mining in biology and biomedicine requires a large amount of domain-specific knowledge. Publicly accessible resources hold much of the information needed, yet integrating them into natural language processing (NLP) systems in practice faces many hurdles, especially the lack of semantic connection across the various resources and components. Ontologies can provide the necessary framework for consistent semantic integration, while additionally delivering formal reasoning capabilities to NLP.

In this chapter, we address four important aspects of integrating ontologies and NLP: (i) an analysis of the different integration alternatives and their respective advantages; (ii) the design requirements for an ontology supporting NLP tasks; (iii) the creation and initialization of an ontology using publicly available tools and databases; and (iv) the connection of common NLP tasks with an ontology, including technical aspects of ontology deployment in a text mining framework. A concrete application example, text mining of enzyme mutations, is provided to motivate and illustrate these points.

Keywords: Text Mining, NLP, Ontology Design, Ontology Population, Ontological NLP


René Witte, Thomas Kappler, and Christopher J. O. Baker. Ontology Design for Biomedical Text Mining. In Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences, Chapter 13, pp. 281-313, Springer Verlag, 2007.

BibTeX entry (also for download):

  author =       {Ren{\'{e}} Witte and Thomas Kappler 
                  and Christopher J. O. Baker},
  title =        {{Ontology Design for Biomedical Text Mining}},
  chapter =      {13},
  crossref =     {sw:ls},
  pages =        {281--313}
  editor =       {Christopher J. O. Baker and Kei-Ho Cheung},
  title =        {{Semantic Web: Revolutionizing Knowledge Discovery 
                  in the Life Sciences}},
  publisher =    {Springer Science+Business Media, New York, NY, USA},
  year =         {2007}


To download our open-source software, please refer to the successor project, Open Mutation Miner (OMM).


This chapter can be obtained from SpringerLink.
Additionally, a preprint version is available for download here.
MD5 Checksum: d1183c24cb96313228593cab33d16364

Copyright © 2007 Springer US. This is the authors' preprint version of the work. It is posted here for your personal use. Not for redistribution. The definitive version was published in the book Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences, DOI: 10.1007/978-0-387-48438-9_14.