IJMLC 2014 Vol.4(4): 313-318 ISSN: 2010-3700
DOI: 10.7763/IJMLC.2014.V4.430

A LDA-Based Approach for Semi-Supervised Document Clustering

Ruizhang Huang, Ping Zhou, and Li Zhang
Abstract—In this paper, we develop an approach for semi-supervised document clustering based on Latent Dirichlet Allocation (LDA), namely LLDA. A small number of labeled documents is used to indicate the user's document grouping preference. A generative model is investigated to jointly model the documents and this small set of document labels. A variational inference algorithm is developed to infer the structure of the document collection. We explore the performance of the proposed approach on both a synthetic dataset and realistic document datasets. Our experiments indicate that the proposed approach performs well at grouping documents according to different user grouping preferences. A comparison between the proposed approach and state-of-the-art semi-supervised clustering algorithms that use labeled instances shows that our approach is effective.

Index Terms—Semi-supervised clustering, document clustering, latent dirichlet allocation, generative model.
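
For readers unfamiliar with the generative view that the abstract builds on, the following is a minimal sketch of the standard LDA generative process only. It is not the LLDA model itself, whose treatment of document labels is not described on this page. The function names, the hyperparameter defaults (`alpha`, `beta`), and the rule of reading a document's cluster off its dominant topic are assumptions made purely for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index according to the probability vector `probs`."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point round-off


def generate_corpus(num_docs, doc_length, vocab, num_topics,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate a toy corpus from the standard LDA generative process.

    Illustrative assumption: each document's cluster is taken to be its
    dominant topic. The paper's LLDA model may define clusters differently.
    """
    rng = random.Random(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet(beta).
    topics = [sample_dirichlet([beta] * len(vocab), rng)
              for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):
        # Per-document topic proportions, drawn from a symmetric Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_length):
            z = sample_categorical(theta, rng)      # topic for this word slot
            w = sample_categorical(topics[z], rng)  # word drawn from that topic
            words.append(vocab[w])
        dominant = max(range(num_topics), key=lambda k: theta[k])
        corpus.append({"words": words, "theta": theta, "cluster": dominant})
    return corpus


if __name__ == "__main__":
    toy_vocab = ["model", "topic", "label", "cluster", "word", "document"]
    for doc in generate_corpus(3, 8, toy_vocab, num_topics=2):
        print(doc["cluster"], " ".join(doc["words"]))
```

In a semi-supervised setting such as the one the abstract describes, a few of these documents would additionally carry an observed label, and inference would run in the opposite direction, recovering the hidden proportions and assignments from the words and the available labels.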

The authors are with the College of Computer Science and Technology, Guizhou University, Guiyang, CO 550025 China (corresponding author: Li Zhang; e-mail: cse.rzhuang@gzu.edu.cn, gs.pzhou11@mail.gzu.edu.cn, lizhang_2004@126.com).


Cite: Ruizhang Huang, Ping Zhou, and Li Zhang, "A LDA-Based Approach for Semi-Supervised Document Clustering," International Journal of Machine Learning and Computing, vol. 4, no. 4, pp. 313-318, 2014.
