Mutation impact extraction is a hitherto unaccomplished task in state of the art mutation extraction systems. Protein mutations and their impacts on protein properties are hidden in scientific literature, making them poorly accessible for protein engineers and inaccessible for phenotype-prediction systems that currently depend on manually curated genomic variation databases.
We present the first rule-based approach for the extraction of mutation impacts on protein properties, categorizing their directionality as positive, negative or neutral. Furthermore protein and mutation mentions are grounded to their respective UniProtKB IDs and selected protein properties, namely protein functions to concepts found in the Gene Ontology. The extracted entities are populated to an OWL-DL Mutation Impact ontology facilitating complex querying for mutation impacts using SPARQL. We illustrate retrieval of proteins and mutant sequences for a given direction of impact on specific protein properties. Moreover we provide programmatic access to the data through semantic web services using the SADI (Semantic Automated Discovery and Integration) framework.
We address the problem of access to legacy mutation data in unstructured form through the creation of novel mutation impact extraction methods which are evaluated on a corpus of full-text articles on haloalkane dehalogenases, tagged by domain experts. Our approaches show state of the art levels of precision and recall for Mutation Grounding and respectable level of precision but lower recall for the task of Mutant-Impact relation extraction. The system is deployed using text mining and semantic web technologies with the goal of publishing to a broad spectrum of consumers.
NLP methods for extracting mutation information from the bibliome have become an important new research area within bio-NLP, as manually curated databases, like the Protein Mutant Database (PMD) (Kawabata et al., 1999), cannot keep up with the rapid pace of mutation research. However, while significant progress has been made with respect to mutation detection, the automated extraction of the impacts of these mutations has so far not been targeted. In this paper, we describe the first work to automatically summarize impact information from protein mutations. Our approach is based on populating an OWL-DL ontology with impact information, which can then be queried to provide structured information, including a summary.
We present a lightweight, user-centred approach for document navigation and analysis that is based on an ontology of text mining results. This allows us to bring the result of existing text mining pipelines directly to end users. Our approach is domain-independent and relies on existing NLP analysis tasks such as automatic multi-document summarization, clustering, question-answering, and opinion mining. Users can interactively trigger semantic processing services for tasks such as analyzing product reviews, daily news, or other document sets.
Real-time access to complex knowledge is a business driver in the contact centre environment. In this paper we outline for the domain of telecom technical product support a knowledge sharing paradigm in which a desktop client annotates named entities in technical documents with canonical names, class names or relevant class axioms, derived from an ontology by means of a web services framework. We described the system and its core components; OWL-DL telecom hardware ontology, ontological-natural language processing pipeline, an ontology axiom?extractor; and the Semantic Assistants framework.
Ontology population from text is becoming increasingly important for NLP applications. Ontologies in OWL format provide for a standardized means of modeling, querying, and reasoning over large knowledge bases. Populated from natural language texts, they offer significant advantages over traditional export formats, such as plain XML. The development of text analysis systems has been greatly facilitated by modern NLP frameworks, such as the General Architecture for Text Engineering (GATE). However, ontology population is not currently supported by a standard component. We developed a GATE resource called the OwlExporter that allows to easily map existing NLP analysis pipelines to OWL ontologies, thereby allowing language engineers to create ontology population systems without requiring extensive knowledge of ontology APIs. A particular feature of our approach is the concurrent population and linking of a domain- and NLP-ontology, including NLP-specific features such as safe reasoning over coreference chains.
Software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant to these resources is typically dispersed over a wide range of artifacts, representation formats, and abstraction levels. In order to stay competitive, organizations are often required to assess and provide evidence that their software meets the expected requirements. In our research, we focus on assessing non-functional quality requirements, specifically evolvability, through semantic modeling of relevant software artifacts. We introduce our SE-Advisor that supports the integration of knowledge resources typically found in software ecosystems by providing a unified ontological representation. We further illustrate how our SE-Advisor takes advantage of this unified representation to support the analysis and assessment of different types of quality attributes related to the evolvability of software ecosystems.
Nowadays, software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant for a particular software maintenance task is typically dispersed over a wide range of artifacts in different representational formats and at different abstraction levels, resulting in isolated 'information silos'. An increasing number of task-specific software tools aim to support developers, but this often results in additional challenges, as not every project member can be familiar with every tool and its applicability for a given problem. Furthermore, historical knowledge about successfully performed modifications is lost, since only the result is recorded in versioning systems, but not how a developer arrived at the solution. In this research, we introduce conceptual models for the software domain that go beyond existing program and tool models, by including maintenance processes and their constituents. The models are supported by a pro-active, ambient, knowledge-based environment that integrates users, tasks, tools, and resources, as well as processes and history-specific information. Given this ambient environment, we demonstrate how maintainers can be supported with contextual guidance during typical maintenance tasks through the use of ontology queries and reasoning services.
From a maintenance perspective, only software that is well understood can evolve in a controlled and high-quality manner. Software evolution itself is a knowledge-driven process that requires the use and integration of different knowledge resources. The authors present a formal representation of an existing process model to support the evolution of software systems by representing knowledge resources and the process model using a common representation based on ontologies and description logics. This formal representation supports the use of reasoning services across different knowledge resources, allowing for the inference of explicit and implicit relations among them. Furthermore, an interactive story metaphor is introduced to guide maintainers during their software evolution activities and to model the interactions between the users, knowledge resources and process model.
Traceability links provide support for software engineers in understanding relations and dependencies among software artefacts created during the software development process. The authors focus on re-establishing traceability links between existing source code and documentation to support software maintenance. They present a novel approach that addresses this issue by creating formal ontological representations for both documentation and source code artefacts. Their approach recovers traceability links at the semantic level, utilising structural and semantic information found in various software artefacts. These linked ontologies are supported by ontology reasoners to allow the inference of implicit relations among these software artefacts.