
IJMLC 2015 Vol.5(6): 454-457 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2015.5.6.551

A Feature-Partition and Under-Sampling Based Ensemble Classifier for Web Spam Detection

Xiaoyong Lu, Musheng Chen, Jhenglong Wu, and Peichan Chan
Abstract—Web spam detection has become one of the most important tasks for web search engines. It is a class imbalance problem because normal pages far outnumber spam pages, yet most traditional learning methods are not effective on imbalanced classification problems. To tackle this problem and make full use of the various features extracted from web pages’ content and links, this paper presents an ensemble classifier based on under-sampling and feature-partition techniques, with the decision tree algorithm C4.5 integrated as the sub-classifier for detecting web spam. The experimental results show that the ensemble classifier outperforms other approaches on several evaluation metrics, such as F1-measure and AUC, on the WEBSPAM-UK2006 dataset.

Index Terms—Web spam detection, under-sampling, features partition, ensemble classifier, C4.5.
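
The abstract does not give implementation details, so the following is only a minimal sketch of how under-sampling and feature partitioning might be combined in one ensemble. Everything specific in it is an assumption: the features are split into a hypothetical "content" group and "link" group, each balanced sample keeps all spam rows plus an equal number of randomly drawn normal rows, a one-level decision stump stands in for C4.5, and the members are combined by majority vote. All function names and the toy data are illustrative and do not come from the paper.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Fit a one-level decision tree on a feature subset (a simple stand-in for C4.5)."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab


def under_sample(rows, labels, rng):
    """Keep every spam row (label 1, assumed minority) and as many random normal rows."""
    spam = [i for i, y in enumerate(labels) if y == 1]
    normal = [i for i, y in enumerate(labels) if y == 0]
    keep = spam + rng.sample(normal, min(len(spam), len(normal)))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def train_ensemble(rows, labels, feature_groups, n_rounds=5, seed=0):
    """Train n_rounds sub-classifiers per feature group, each on its own balanced sample."""
    rng = random.Random(seed)
    members = []
    for group in feature_groups:
        for _ in range(n_rounds):
            sub_rows, sub_labels = under_sample(rows, labels, rng)
            members.append(train_stump(sub_rows, sub_labels, group))
    return members


def predict(members, row):
    """Combine the sub-classifiers by majority vote (an assumed combination rule)."""
    votes = Counter(member(row) for member in members)
    return votes.most_common(1)[0][0]


# Toy, made-up data: column 0 plays the role of a content feature,
# column 1 the role of a link feature; 40 normal pages and 5 spam pages.
rows = [(i % 5, i % 3) for i in range(40)] + [(8 + i, 7 + i) for i in range(5)]
labels = [0] * 40 + [1] * 5
feature_groups = [[0], [1]]  # hypothetical partition: content vs. link

members = train_ensemble(rows, labels, feature_groups)
print(len(members), predict(members, (9, 9)), predict(members, (0, 0)))
```

Training each sub-classifier on a balanced sample keeps the majority class from dominating any single tree, while repeating the draw several times per feature group lets the ensemble still see most of the normal pages overall.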

Xiaoyong Lu and Musheng Chen are with Nanchang University, China (e-mail: lxy@ncu.edu.cn, dreaminit@gmail.com).
Jhenglong Wu and Peichan Chan are with the Information Management Department, Yuan Ze University, Taiwan (e-mail: jlwu.yzu@gmail.com, iepchang@saturn.yzu.edu.tw).


Cite: Xiaoyong Lu, Musheng Chen, Jhenglong Wu, and Peichan Chan, "A Feature-Partition and Under-Sampling Based Ensemble Classifier for Web Spam Detection," International Journal of Machine Learning and Computing, vol. 5, no. 6, pp. 454-457, 2015.
