ERSS 2005: Coreference-Based Summarization Reloaded


Friendly Meetings in Vancouver
We present ERSS 2005, our entry to this year's DUC competition. With only slight modifications from last year's version to accommodate the more complex context information present in DUC 2005, we achieved performance similar to last year's entry, ranking roughly in the upper third on the ROUGE-1 and Basic Element scores.

We also participated in the additional manual evaluation based on the new Pyramid method and performed further evaluations based on the Basic Elements method and the automatic generation of Pyramids. Interestingly, the ranking of our system differs greatly between the measures; we attempt to analyse this effect by computing Spearman rank correlations between the different results.
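As an illustration of the kind of comparison described above, here is a minimal sketch of the Spearman rank correlation between two rankings of the same systems. It assumes tie-free ranks; the function name and the rank lists are hypothetical and are not DUC results.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must cover the same number of systems")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("need at least two systems")
    # Sum of squared rank differences, one term per system.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks of five systems under two measures (illustrative only).
rouge_ranks = [1, 2, 3, 4, 5]
pyramid_ranks = [2, 1, 5, 3, 4]
rho = spearman_rho(rouge_ranks, pyramid_ranks)
print(rho)
```

A value near 1 means the two measures order the systems almost identically; a value near 0 means the orderings are largely unrelated.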

Context-based Multi-Document Summarization using Fuzzy Coreference Cluster Graphs

The IPD cluster computing cluster summaries using a clustering algorithm :)


Constructing focused, context-based multi-document summaries requires an analysis of the context questions, as well as their corresponding document sets. We present a fuzzy cluster graph algorithm that finds entities and their connections between context and documents based on fuzzy coreference chains, and we describe the design and implementation of the ERSS summarizer that implements these ideas.
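The following is only a rough sketch of how chains from the context and from the documents might be connected, not the ERSS algorithm itself. It assumes each fuzzy coreference chain is a mapping from a noun phrase string to a membership degree in [0, 1], uses the standard min-based fuzzy intersection, and links two chains when their best shared membership reaches a threshold. All names, the threshold, and the sample data are hypothetical.

```python
def fuzzy_intersection(chain_a, chain_b):
    """Standard fuzzy intersection: min of the memberships of shared elements."""
    shared = chain_a.keys() & chain_b.keys()
    return {np: min(chain_a[np], chain_b[np]) for np in shared}


def overlap_degree(chain_a, chain_b):
    """Height of the intersection: the largest shared membership, or 0.0 if none."""
    common = fuzzy_intersection(chain_a, chain_b)
    return max(common.values(), default=0.0)


def link_chains(context_chains, document_chains, threshold=0.5):
    """Return (context chain, document chain, degree) edges at or above the threshold."""
    edges = []
    for c_name, c_chain in context_chains.items():
        for d_name, d_chain in document_chains.items():
            degree = overlap_degree(c_chain, d_chain)
            if degree >= threshold:
                edges.append((c_name, d_name, degree))
    return edges


# Hypothetical chains: noun phrase -> membership degree (illustrative only).
context = {"q1": {"hurricane": 0.9, "storm": 0.6}}
documents = {
    "d1": {"storm": 0.8, "damage": 0.7},
    "d2": {"election": 1.0},
}
edges = link_chains(context, documents)
print(edges)
```

Matching noun phrases by exact string is a deliberate simplification here; a real system would compare mentions with coreference heuristics rather than string identity.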

Favourite Framework (Architecture) for NLP/Text Mining?

Fuzzy Set Theory-Based Belief Processing for Natural Language Texts


The growing number of publicly available information sources makes it impossible for individuals to keep track of all the various opinions on one topic. The goal of the artificial believer system we present in this paper is to extract and analyze opinionated statements from newspaper articles.

Beliefs are modeled with a fuzzy-theoretic approach applied after NLP-based information extraction. A fuzzy believer models a human agent, deciding what statements to believe or reject based on different, configurable strategies.
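To make the idea of configurable belief strategies concrete, here is a small sketch under stated assumptions. It is not the paper's implementation: the `Statement` fields, the `degree` value in [0, 1], and both strategy functions are hypothetical, chosen only to show how a believer could swap strategies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    source: str    # who the statement is attributed to
    content: str   # the extracted statement text
    degree: float  # hypothetical fuzzy degree of support in [0, 1]


def believe_above_threshold(statements, threshold=0.5):
    """Hypothetical strategy: accept statements whose degree reaches the threshold."""
    return [s for s in statements if s.degree >= threshold]


def believe_trusted_sources(statements, trusted=frozenset()):
    """Hypothetical strategy: accept only statements from trusted sources."""
    return [s for s in statements if s.source in trusted]


class FuzzyBeliever:
    """Holds a set of beliefs chosen by a pluggable strategy function."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.beliefs = []

    def process(self, statements):
        # Everything the strategy does not return is treated as rejected.
        self.beliefs = self.strategy(statements)
        return self.beliefs


# Illustrative input only; not taken from any evaluated corpus.
statements = [
    Statement("Source A", "The plan will cut costs.", 0.8),
    Statement("Source B", "The plan will raise costs.", 0.3),
]
believer = FuzzyBeliever(believe_above_threshold)
accepted = believer.process(statements)
print([s.content for s in accepted])
```

Passing the strategy as a function keeps the believer itself unchanged when a different acceptance policy is configured.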

Durm German Lemmatizer v1.0 Released

I'm happy to announce the first public release of our free/open source Durm Lemmatization System for the German language.

The release comes with source code, binaries, documentation, resources (German lexicon, Case Tagger probabilities), and manually annotated texts from the German Wikipedia for evaluation.

Multi-lingual Noun Phrase Chunker Updated

I just posted a small update to my multi-lingual noun phrase chunker (MuNPEx) for GATE.

Changes in v0.2 are:
o preliminary Spanish support (see below)
o renamed from "NPE" to "MuNPEx" in a blatant attempt at Googlewhacking
o small cleanups
o now comes with a sample NE transducer for number markup to improve chunking
Supported languages are now English, German, French, and Spanish (beta).



We present here the outline of an ongoing research effort to recognize, represent, and interpret attributive constructions such as reported speech in newspaper articles. The role of reported speech is attribution: the statement does not assert some information as 'true' but attributes it to some source. The description of the source and the choice of the reporting verb can express the reporter's level of confidence in the attributed material.
