IJMLC 2014 Vol.4(4): 313-318 ISSN: 2010-3700
DOI: 10.7763/IJMLC.2014.V4.430

A LDA-Based Approach for Semi-Supervised Document Clustering

Ruizhang Huang, Ping Zhou, and Li Zhang
Abstract—In this paper, we develop an approach to semi-supervised document clustering based on Latent Dirichlet Allocation (LDA), named LLDA. A small number of labeled documents is used to indicate the user's document grouping preference. A generative model is investigated to jointly model the documents and this small set of document labels. A variational inference algorithm is developed to infer the structure of the document collection. We evaluate the proposed approach on both a synthetic dataset and real-world document datasets. Our experiments indicate that the proposed approach performs well at grouping documents according to different user grouping preferences. A comparison between the proposed approach and state-of-the-art semi-supervised clustering algorithms that use labeled instances shows that our approach is effective.

Index Terms—Semi-supervised clustering, document clustering, latent Dirichlet allocation, generative model.

The authors are with the College of Computer Science and Technology, Guizhou University, Guiyang, CO 550025 China (corresponding author: Li Zhang; e-mail: cse.rzhuang@gzu.edu.cn, gs.pzhou11@mail.gzu.edu.cn, lizhang_2004@126.com).


Cite: Ruizhang Huang, Ping Zhou, and Li Zhang, "A LDA-Based Approach for Semi-Supervised Document Clustering," International Journal of Machine Learning and Computing, vol. 4, no. 4, pp. 313-318, 2014.
