Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

Montreal 2007

Abstract

Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common and distinctive topics within a document set, together with the generation of multi-document summaries, can greatly ease the burden of information management. We show how this can be achieved with a clustering algorithm based on fuzzy set theory, which (i) is easy to implement and integrate into a personal information system, (ii) generates a highly flexible data structure for topic analysis and summarization, and (iii) also delivers excellent performance.

Reference

René Witte and Sabine Bergler. Fuzzy Clustering for Topic Analysis and Summarization of Document Collections. Advances in Artificial Intelligence, Proceedings of the 20th Conference of the Canadian Society for Computational Studies of Intelligence (Canadian AI 2007), May 28-30, 2007, Montréal, Québec, Canada. Springer LNAI 4509, pp. 476-488.

BibTeX entry (also available for download):

@InProceedings{WiBe_CAI2007,
  author = 	 {Ren{\'{e}} Witte and Sabine Bergler},
  title = 	 {{Fuzzy Clustering for Topic Analysis and 
                 Summarization of Document Collections}},
  booktitle =	 {Proc.\ of the 20th Canadian Conference on
                 Artificial Intelligence (Canadian A.I. 2007)},
  pages =	 {476--488},
  year =	 {2007},
  editor =	 {Z. Kobti and D. Wu},
  series =	 {LNAI 4509},
  address =	 {Montr{\'{e}}al, Qu{\'{e}}bec, Canada},
  month =	 {May 28--30},
  publisher =	 {Springer},
}

Our paper received the best paper award at Canadian AI 2007, which had an acceptance rate of 17.7%.

Download

local copy: fuzzy_clustering_CAI2007.pdf
MD5 checksum: 2d8d62b545046d6f237722c6b4db71f5
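
The checksum above can be used to confirm that a downloaded copy is intact. The following is a minimal sketch in Python, assuming the PDF has been saved in the current directory under the file name listed above; the helper names `md5_of_file` and `matches_checksum` are illustrative and not part of any tool offered on this page.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this page.
EXPECTED_MD5 = "2d8d62b545046d6f237722c6b4db71f5"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust if the download was saved elsewhere.
    pdf = Path("fuzzy_clustering_CAI2007.pdf")
    if pdf.exists():
        print("checksum OK" if matches_checksum(pdf) else "checksum mismatch")
    else:
        print(f"{pdf} not found; download it first")
```

MD5 is adequate for detecting a corrupted or truncated download, but it is not a safeguard against deliberate tampering.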

Copyright © 2007 Springer-Verlag. This is the author's version of the work. It is posted here by permission of Springer for your personal use. Not for redistribution.