TR-LDA: A Cascaded Key-Bigram Extractor for Microblog Summarization

Home > Archive > 2015 > Volume 5 Number 3 (Jun. 2015) >

IJMLC 2015 Vol. 5(3): 172-178 ISSN: 2010-3700
DOI: 10.7763/IJMLC.2015.V5.503

Yufang Wu, Heng Zhang, Bo Xu, Hongwei Hao, and Chenglin Liu

Abstract—Microblog summarization can save users a large amount of browsing time. However, summarizing microblogs is more challenging than summarizing traditional documents because posts are noisy and severely sparse. In this paper, we propose an unsupervised method named TR-LDA that summarizes microblogs by cascading two key-bigram extractors, one based on TextRank and one based on Latent Dirichlet Allocation (LDA). The cascading strategy yields a key-bigram set with better noise immunity. Two sentence ranking strategies are proposed based on the key-bigram set, and sentences are extracted by merging the two ranking results. In experiments on a Sina Weibo dataset, the proposed method outperformed several other text-content-based summarizers.

Index Terms—Key-bigram extraction, microblog summarization, sentence extraction, TR-LDA.
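
The abstract names TextRank as one of the two key-bigram extractors but gives no details of the graph or the scoring. The following is a minimal sketch of that one stage only, under stated assumptions; it is not the authors' implementation. It assumes that posts are already tokenized, that two bigrams are linked when they occur in the same post, and that posts are ranked by the summed score of the key bigrams they contain. The function names (`textrank_bigrams`, `rank_posts`) and the parameters (`damping`, `iterations`, `top_k`) are hypothetical. The LDA stage and the merging of two rankings are omitted.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(tokens):
    """Return the adjacent-token pairs of one tokenized post."""
    return list(zip(tokens, tokens[1:]))


def textrank_bigrams(posts, damping=0.85, iterations=50):
    """Score bigrams with a PageRank-style iteration on a co-occurrence graph.

    Assumption (not from the paper): two bigrams are linked when they occur
    in the same post, and the edge weight counts the posts they share.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for tokens in posts:
        for a, b in combinations(sorted(set(bigrams(tokens))), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0

    nodes = list(weights)
    scores = {node: 1.0 for node in nodes}
    # Total edge weight leaving each node, used to normalize its votes.
    out_weight = {node: sum(weights[node].values()) for node in nodes}
    for _ in range(iterations):
        scores = {
            node: (1.0 - damping) + damping * sum(
                weights[other][node] / out_weight[other] * scores[other]
                for other in weights[node]
            )
            for node in nodes
        }
    return scores


def rank_posts(posts, scores, top_k=5):
    """Order post indices by the summed score of the key bigrams they contain."""
    key_bigrams = set(sorted(scores, key=scores.get, reverse=True)[:top_k])

    def post_score(tokens):
        return sum(scores[b] for b in set(bigrams(tokens)) if b in key_bigrams)

    return sorted(range(len(posts)), key=lambda i: post_score(posts[i]), reverse=True)
```

In a cascade of the kind the abstract describes, the bigram set produced here would then be passed through a second, topic-model-based extractor; how the two stages are combined is specified in the paper, not in this sketch.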

The authors are with the Institute of Automation, Chinese Academy of Sciences, 95 Zhongguancun East Road, Beijing, 100190, China (e-mail: yufang.wu@ia.ac.cn).


Cite: Yufang Wu, Heng Zhang, Bo Xu, Hongwei Hao, and Chenglin Liu, "TR-LDA: A Cascaded Key-Bigram Extractor for Microblog Summarization," International Journal of Machine Learning and Computing, vol. 5, no. 3, pp. 172-178, 2015.



