NLP

A Quality Perspective of Evolvability Using Semantic Analysis

Abstract

Software development and maintenance are highly distributed processes that involve a multitude of supporting tools and resources. Knowledge relevant to these resources is typically dispersed over a wide range of artifacts, representation formats, and abstraction levels. In order to stay competitive, organizations are often required to assess and provide evidence that their software meets the expected requirements. In our research, we focus on assessing non-functional quality requirements, specifically evolvability, through semantic modeling of relevant software artifacts. We introduce our SE-Advisor that supports the integration of knowledge resources typically found in software ecosystems by providing a unified ontological representation. We further illustrate how our SE-Advisor takes advantage of this unified representation to support the analysis and assessment of different types of quality attributes related to the evolvability of software ecosystems.
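The core idea of the unified ontological representation can be illustrated with a toy example. The sketch below is purely hypothetical (the class names, properties, and the in-memory triple store are invented stand-ins, not SE-Advisor's actual OWL model): heterogeneous artifacts such as source files, bug reports, and commits are all expressed as subject-predicate-object triples in one graph, so a single query can traverse knowledge that originated in different tools.

```python
# Hypothetical sketch: heterogeneous software artifacts (code, bug
# reports, version history) represented as subject-predicate-object
# triples in one unified graph, queryable together. All class and
# property names are illustrative, not SE-Advisor's actual ontology.

class TripleStore:
    def __init__(self):
        self.triples = set()

    def add(self, subj, pred, obj):
        self.triples.add((subj, pred, obj))

    def query(self, subj=None, pred=None, obj=None):
        """Return triples matching the given pattern (None = wildcard)."""
        return sorted(t for t in self.triples
                      if (subj is None or t[0] == subj)
                      and (pred is None or t[1] == pred)
                      and (obj is None or t[2] == obj))

store = TripleStore()
# Artifacts from different tools land in the same representation:
store.add("src/Parser.java", "rdf:type", "se:SourceFile")
store.add("BUG-42", "rdf:type", "se:BugReport")
store.add("BUG-42", "se:affects", "src/Parser.java")
store.add("commit-a1b2", "se:modifies", "src/Parser.java")

# Cross-artifact query: everything that points at one source file,
# regardless of which tool the knowledge came from.
related = store.query(obj="src/Parser.java")
```

A real implementation would use an OWL ontology and a triple store with SPARQL support instead of this in-process stand-in, but the integration principle is the same.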

Semantic Assistants – User-Centric Natural Language Processing Services for Desktop Clients

Abstract

Semantic Assistants Workflow Overview

Today's knowledge workers have to spend a large amount of time and manual effort on creating, analyzing, and modifying textual content. While more advanced semantically oriented analysis techniques have been developed in recent years, they have not yet found their way into commonly used desktop clients, be they generic (e.g., word processors, email clients) or domain-specific (e.g., software IDEs, biological tools). Instead of forcing users to leave their current context and use an external application, we propose a "Semantic Assistants" approach, where semantic analysis services relevant for the user's current task are offered directly within a desktop application. Our approach relies on an OWL ontology model for context and service information and integrates external natural language processing (NLP) pipelines through W3C Web services.

A General Architecture for Connecting NLP Frameworks and Desktop Clients using Web Services


Abstract

Despite impressive advances in the development of generic NLP frameworks, content-specific text mining algorithms, and NLP services, little progress has been made in enhancing existing end-user clients with text analysis capabilities. To overcome this software engineering gap between desktop environments and text analysis frameworks, we developed an open service-oriented architecture, based on Semantic Web ontologies and W3C Web services, which makes it possible to easily integrate any NLP service into client applications.
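The brokering idea behind such an architecture can be sketched in a few lines. The code below is a simplified, in-process stand-in with invented names: clients ask a central broker which registered NLP services fit their content type, then invoke one. The actual architecture uses OWL metadata and W3C Web services for this mediation rather than direct function calls.

```python
# Minimal sketch of the service-broker idea, under assumed names.
# Real deployments describe services with OWL metadata and expose
# them as W3C Web services; here plain callables stand in for both.

class NlpBroker:
    def __init__(self):
        self.services = {}  # name -> (applicable input types, callable)

    def register(self, name, input_types, fn):
        self.services[name] = (set(input_types), fn)

    def relevant(self, input_type):
        """Names of services applicable to the client's content type."""
        return sorted(name for name, (types, _) in self.services.items()
                      if input_type in types)

    def invoke(self, name, text):
        """Run a registered service on the client's text."""
        return self.services[name][1](text)

broker = NlpBroker()
broker.register("summarizer", ["text/plain", "text/html"],
                lambda t: t.split(".")[0] + ".")      # toy "summary"
broker.register("entity-extractor", ["text/plain"],
                lambda t: [w for w in t.split() if w.istitle()])

# A desktop client discovers applicable services, then invokes one:
names = broker.relevant("text/plain")
summary = broker.invoke("summarizer",
                        "GATE runs the pipeline. More detail follows.")
```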

Enhancing the OpenOffice.org Word Processor with Natural Language Processing Capabilities


Abstract

Today's knowledge workers are often overwhelmed by the vast amount of readily available natural language documents that are potentially relevant for a given task. Natural language processing (NLP) and text mining techniques can deliver automated analysis support, but they are often not integrated into commonly used desktop clients, such as word processors. We present a plug-in for the OpenOffice.org word processor Writer that provides access to any kind of NLP analysis service mediated through a service-oriented architecture. Semantic Assistants can now provide services such as information extraction, question-answering, index generation, or automatic summarization directly within an end user's application.

A Belief Revision Approach to Textual Entailment Recognition

Abstract

An artificial believer has to recognize textual entailment to categorize beliefs. We describe our system – the Fuzzy Believer system – and its application to the TAC/RTE three-way task.

ERSS at TAC 2008

Abstract

An Automatically Generated Summary

ERSS 2008 attempted to rectify certain issues of ERSS 2007. The improvements to readability, however, are not reflected in significant score increases, and in fact the system fell in overall ranking. While we have not concluded our analysis, we present some preliminary observations here.

Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles


Abstract

Reported speech in the form of direct and indirect reported speech is an important indicator of evidentiality in traditional newspaper texts, but also increasingly in the new media that rely heavily on citation and quotation of previous postings, as for instance in blogs or newsgroups. This paper details the basic processing steps for reported speech analysis and reports on the performance of an implementation in the form of a GATE resource.
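One of the basic processing steps can be illustrated with a pattern-based sketch for direct reported speech. The reporting-verb list and the two patterns below are simplified stand-ins for illustration only, not the rules of the actual GATE resource, which also handles indirect speech and many more constructions.

```python
import re

# Illustrative sketch of one basic step in reported-speech analysis:
# spotting direct quotes attributed via a reporting verb. The verb
# list and patterns are simplified stand-ins, not the GATE resource.

REPORTING_VERBS = r"(?:said|says|stated|told|announced|claimed)"

# Pattern 1: "<quote>" <source> <verb>   e.g.  "No," the mayor said.
# Pattern 2: <source> <verb>, "<quote>"  e.g.  The mayor said, "No."
PATTERNS = [
    re.compile(r'"(?P<quote>[^"]+)"\s+(?P<source>[\w .]+?)\s+'
               + REPORTING_VERBS + r'\b'),
    re.compile(r'(?P<source>[\w .]+?)\s+' + REPORTING_VERBS
               + r',?\s+"(?P<quote>[^"]+)"'),
]

def tag_reported_speech(sentence):
    """Return (source, quote) pairs found in a sentence, if any."""
    hits = []
    for pat in PATTERNS:
        for m in pat.finditer(sentence):
            hits.append((m.group("source").strip(),
                         m.group("quote").strip()))
    return hits

hits = tag_reported_speech('The mayor said, "We will rebuild the bridge."')
```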

Generating Update Summaries for DUC 2007

Abstract

Update summaries as defined for the new DUC 2007 task deliver focused information to a user who has already read a set of older documents covering the same topic. In this paper, we show how to generate this kind of summary from the same data structure—fuzzy coreference cluster graphs—as all other generic and focused multi-document summaries. Our system ERSS 2007 implementing this algorithm also participated in the DUC 2007 main task, without any changes from the 2006 version.
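The defining constraint of an update summary can be shown with a small novelty filter. The selection mechanism below is invented for illustration and is not the cluster-graph method ERSS uses; it only demonstrates the task's requirement that a sentence should enter the summary only if it is not already covered by the documents the user has read.

```python
# Hedged sketch of the update-summary constraint (the selection
# mechanism here is invented, not ERSS's fuzzy coreference cluster
# graphs): keep only candidate sentences whose content is mostly
# new relative to the already-read documents.

def word_overlap(candidate, old):
    """Fraction of the candidate's words already seen in an old sentence."""
    wa, wb = set(candidate.lower().split()), set(old.lower().split())
    return len(wa & wb) / len(wa) if wa else 0.0

def update_summary(candidates, old_sentences, max_overlap=0.5):
    """Keep candidates whose content is mostly new to the reader."""
    return [c for c in candidates
            if all(word_overlap(c, old) <= max_overlap
                   for old in old_sentences)]

old = ["The storm hit the coast on Monday causing heavy damage."]
new = ["The storm hit the coast on Monday.",           # already known
       "Relief efforts began on Wednesday with aid."]  # novel update
summary = update_summary(new, old)
```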

An Initial Fuzzy Coreference Cluster Graph

Creating a Fuzzy Believer to Model Human Newspaper Readers

Montreal 2007

Abstract

We present a system capable of modeling human newspaper readers. It is based on the extraction of reported speech, which is subsequently converted into a fuzzy theory-based representation of single statements. A domain analysis then assigns statements to topics. A number of fuzzy set operators, including fuzzy belief revision, are applied to model different belief strategies. At the end, our system holds certain beliefs while rejecting others.
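The flavor of the fuzzy operators and of belief revision can be conveyed with a toy sketch. Everything below is an invented simplification: each statement carries a fuzzy degree of belief, the standard fuzzy set operators combine degrees, and one naive revision strategy resolves a contradiction by keeping the better-supported side. The actual system's representation and belief strategies are richer.

```python
# Toy sketch of the fuzzy-believer idea under invented names: each
# statement carries a fuzzy degree of belief in [0, 1]; standard
# fuzzy set operators (min = AND, max = OR, 1-x = NOT) combine them,
# and a simple revision strategy keeps the better-supported of two
# contradictory statements. The real system's strategies differ.

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def fuzzy_not(a):
    return 1.0 - a

def revise(beliefs, statement, degree):
    """Add a statement to the belief base; on contradiction
    ('not <s>' vs '<s>'), keep whichever side has the higher degree."""
    negation = ("not " + statement if not statement.startswith("not ")
                else statement[4:])
    if negation in beliefs:
        if degree > beliefs[negation]:
            del beliefs[negation]          # give up the weaker belief
            beliefs[statement] = degree
        return beliefs                     # else keep the stronger old one
    beliefs[statement] = fuzzy_or(degree, beliefs.get(statement, 0.0))
    return beliefs

held = {}
revise(held, "mayor supports the project", 0.6)
revise(held, "not mayor supports the project", 0.8)  # stronger contradiction
```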

Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

Montreal 2007

Abstract

Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common and distinctive topics within a document set, together with the generation of multi-document summaries, can greatly ease the burden of information management. We show how this can be achieved with a clustering algorithm based on fuzzy set theory, which (i) is easy to implement and integrate into a personal information system, (ii) generates a highly flexible data structure for topic analysis and summarization, and (iii) also delivers excellent performance.
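The key property of fuzzy clustering, as opposed to crisp partitioning, can be illustrated with a small sketch. The membership function below (a term-overlap ratio against a seed term set) is an invented stand-in for the paper's algorithm; what it demonstrates is that each document receives a graded membership in every topic cluster and may belong to several topics at once.

```python
# Illustrative sketch (not the paper's algorithm): fuzzy clustering
# assigns each document a membership degree in [0, 1] to every topic
# cluster, so one document may belong to several topics at once.
# Membership here is a simple term-overlap ratio against seed terms.

def membership(doc_terms, cluster_terms):
    """Degree to which a document belongs to a topic cluster."""
    doc, cluster = set(doc_terms), set(cluster_terms)
    return len(doc & cluster) / len(cluster) if cluster else 0.0

def fuzzy_cluster(docs, clusters, threshold=0.3):
    """Map each doc id to the clusters it belongs to above threshold."""
    result = {}
    for doc_id, terms in docs.items():
        result[doc_id] = {name: round(membership(terms, seed), 2)
                          for name, seed in clusters.items()
                          if membership(terms, seed) >= threshold}
    return result

docs = {
    "d1": ["flood", "river", "evacuation", "rain"],
    "d2": ["election", "mayor", "vote"],
    "d3": ["flood", "mayor", "aid", "vote"],
}
clusters = {
    "disaster": ["flood", "river", "rain", "evacuation"],
    "politics": ["election", "mayor", "vote", "campaign"],
}
assignments = fuzzy_cluster(docs, clusters)
```

Downstream, the graded memberships form the flexible data structure the abstract mentions: summarization can draw sentences from all documents with high membership in a topic, rather than from one hard partition.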
