A Novel Heuristic Method for Misclassification Cost Tuning in Imbalanced Data

IJMLC 2018 Vol.8(6): 565-570 ISSN: 2010-3700
DOI: 10.18178/ijmlc.2018.8.6.746

Anusara Hirunyawanakul, Nittaya Kerdprasop, and Kittisak Kerdprasop

Abstract—Data imbalance is currently one of the most challenging problems in machine learning and data mining. Many techniques have been proposed to address it. The fundamental solution is to balance the data with under-sampling or over-sampling. However, these conventional methods can lose useful information, which may lead to the generation of useless patterns. Techniques that avoid adjusting the sample size are therefore of greater interest. One such technique is misclassification cost adjustment. This paper focuses on improving the performance of classification models built with misclassification cost adjustment by proposing a novel heuristic method. The proposed heuristic is based on practitioner experience with many manufacturing datasets, and it relates the misclassification cost to the imbalance ratio and the constant “e” (Euler’s number). Experiments were conducted on 56 real-world datasets with varying numbers of attributes and different degrees of imbalance. The results confirm that the heuristic improves the performance of the classification model. On datasets with a high imbalance ratio, the method improves AUC by up to 29%.

Index Terms—Misclassification cost, imbalanced data, classification, decision tree learning.

The authors are with School of Computer Engineering, Suranaree University of Technology, Nakhon Ratchasima 30000, Thailand (e-mail: Anusara.hi@gmail.com, nittaya@sut.ac.th, kerdpras@sut.ac.th).
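
The cost-adjustment idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the heuristic's formula, so the candidate costs below (the imbalance ratio scaled up or down by e) are hypothetical placeholders chosen only for illustration, not the authors' method. The function names `imbalance_ratio`, `candidate_costs`, and `cost_sensitive_predict` are likewise assumptions. The sketch shows how raising the cost of missing a minority-class example lowers the probability at which a classifier should predict the minority class.

```python
import math
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def candidate_costs(ratio):
    """Hypothetical false-negative costs to try for a given imbalance ratio.

    These combinations of the ratio and e are illustrative assumptions;
    the abstract does not give the paper's actual formula.
    """
    return {
        "ratio": ratio,
        "ratio_times_e": ratio * math.e,
        "ratio_over_e": ratio / math.e,
    }


def cost_sensitive_predict(p_minority, fn_cost, fp_cost=1.0):
    """Predict the minority class when doing so has lower expected cost.

    Predicting "majority" risks a false negative with probability
    p_minority; predicting "minority" risks a false positive with
    probability 1 - p_minority.
    """
    return p_minority * fn_cost > (1.0 - p_minority) * fp_cost


# Example: 90 majority examples and 10 minority examples.
labels = [0] * 90 + [1] * 10
ratio = imbalance_ratio(labels)
costs = candidate_costs(ratio)

# With equal costs, an example with a 10% minority probability is labeled
# majority; with a larger false-negative cost the decision can flip.
equal_cost_decision = cost_sensitive_predict(0.10, fn_cost=1.0)
weighted_decision = cost_sensitive_predict(0.10, fn_cost=costs["ratio_times_e"])
```

In practice each candidate cost would be compared by training a cost-sensitive classifier and measuring AUC on held-out data, which is the metric the abstract reports.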

Cite: Anusara Hirunyawanakul, Nittaya Kerdprasop, and Kittisak Kerdprasop, "A Novel Heuristic Method for Misclassification Cost Tuning in Imbalanced Data," International Journal of Machine Learning and Computing, vol. 8, no. 6, pp. 565-570, 2018.

General Information

E-ISSN: 2972-368X
Abbreviated Title: Int. J. Mach. Learn.
Frequency: Quarterly
DOI: 10.18178/IJML
Editor-in-Chief: Dr. Lin Huang
Executive Editor: Ms. Cherry L. Chen
Abstracting/Indexing: Inspec (IET), Google Scholar, Crossref, ProQuest, Electronic Journals Library, CNKI.
E-mail: ijml@ejournal.net