PKUSUMSUM
A Java platform for multilingual document summarization

About

PKUSUMSUM (PKU’s SUMmary of SUMmarization methods) is an integrated toolkit for automatic document summarization. It supports single-document, multi-document and topic-focused multi-document summarization, and implements a variety of summarization methods.

Users can use the toolkit to produce summaries for documents or document sets, and can implement their own summarization methods on top of the platform.

Main features of PKUSUMSUM include:

  • It integrates a variety of stable summarization methods with performance that is good enough for practical use.
  • It supports three typical summarization tasks: single-document, multi-document and topic-focused multi-document summarization.
  • It supports Western languages (e.g. English) and Chinese.
  • It integrates an English tokenizer and stemmer, and Chinese word segmentation tools.
  • The Java platform can be easily deployed on different operating systems, such as Windows, Linux and macOS.
  • It is open source and modular in design, so users can conveniently add new methods and modules to the toolkit.
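
To give a feel for what adding a new method to a modular summarization platform might involve, here is a minimal sketch. It is not the PKUSUMSUM API: the `Summarizer` interface, the `LeadSummarizer` class and the `maxWords` parameter are hypothetical names chosen for illustration, and the lead-sentence baseline is simply an easy method to show. The word count assumes whitespace-separated text, so it would not apply to unsegmented Chinese.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical extension point for illustration only; NOT the actual PKUSUMSUM API.
interface Summarizer {
    /** Returns summary sentences whose total length is at most maxWords words. */
    List<String> summarize(List<String> sentences, int maxWords);
}

// Lead baseline: take sentences in document order until the word budget is exhausted.
class LeadSummarizer implements Summarizer {
    @Override
    public List<String> summarize(List<String> sentences, int maxWords) {
        List<String> summary = new ArrayList<>();
        int used = 0;
        for (String sentence : sentences) {
            String trimmed = sentence.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank input lines
            }
            // Assumption: words are whitespace-separated (not valid for unsegmented Chinese).
            int words = trimmed.split("\\s+").length;
            if (used + words > maxWords) {
                break; // the next sentence would exceed the budget
            }
            summary.add(trimmed);
            used += words;
        }
        return summary;
    }
}

class SummarizerDemo {
    public static void main(String[] args) {
        List<String> document = Arrays.asList(
            "PKUSUMSUM is a summarization toolkit.",
            "It is written in Java.",
            "It supports English and Chinese.");
        Summarizer summarizer = new LeadSummarizer();
        // Print the selected sentences, one per line.
        for (String sentence : summarizer.summarize(document, 10)) {
            System.out.println(sentence);
        }
    }
}
```

In a design like this, a new method is one more class implementing the same interface, so the code that reads documents and writes summaries does not need to change. The readme file in the package describes how the toolkit itself is actually invoked and extended.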

The PKUSUMSUM package includes the JAR file, the source code in “/code” and the referenced libraries in “/lib”.

People

  • Jianmin Zhang
  • Tianming Wang
  • Xiaojun Wan

Reference

  • Jianmin Zhang, Tianming Wang and Xiaojun Wan. PKUSUMSUM: A Java Platform for Multilingual Document Summarization. In COLING 2016. (Demo paper)

Usage

  • Detailed usage information can be found in the readme file. Please download the software package and read that file.

Licensing

  • This toolkit is released under the GNU GPL license.

Download

  • You can find and download the toolkit on GitHub.

Contact

  • Contact person: Jianmin Zhang
  • Contact email: zhangjianmin2015@pku.edu.cn