IJMLC 2013 Vol.3(2): 214-218 ISSN: 2010-3700
DOI: 10.7763/IJMLC.2013.V3.305

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška
Abstract—Imbalance in data classification is a frequently discussed problem that classical classification techniques do not handle well. The problem we tackled was to learn a binary classification model from large data under an accuracy constraint for the minority class. We propose a new meta-learning method that creates initial models using cost-sensitive learning with logistic regression and uses these models as the initial chromosomes for a genetic algorithm. The method has been successfully tested on a large real-world data set from our internet security research. Experiments show that our method always leads to better results than using logistic regression or a genetic algorithm alone. Moreover, the method produces an easily understandable classification model.

Index Terms—Imbalanced data, classification, genetic algorithm, logistic regression.
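
The pipeline outlined in the abstract (cost-sensitive logistic regression models seeding a genetic algorithm, subject to a minority-class constraint) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the class-weight grid, the fitness function (specificity once a minimum minority recall is met), the crossover and mutation operators, and all names such as `fit_weighted_logreg`, `fitness`, and `evolve` are assumptions chosen for illustration only.

```python
import numpy as np

def add_bias(X):
    """Append a constant column so the last weight acts as the intercept."""
    return np.hstack([X, np.ones((len(X), 1))])

def fit_weighted_logreg(X, y, minority_weight, steps=300, lr=0.1):
    """Cost-sensitive logistic regression fitted by plain gradient descent.
    Errors on the minority class (y == 1) are weighted by `minority_weight`."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    sample_w = np.where(y == 1, minority_weight, 1.0)
    for _ in range(steps):
        z = np.clip(Xb @ w, -30, 30)           # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * Xb.T @ (sample_w * (p - y)) / sample_w.sum()
    return w

def fitness(w, X, y, min_recall):
    """Assumed fitness: majority-class accuracy (specificity) if the minority
    recall constraint holds; otherwise a negative value, so every feasible
    model ranks above every infeasible one."""
    pred = (add_bias(X) @ w > 0).astype(int)
    recall = np.mean(pred[y == 1] == 1)
    specificity = np.mean(pred[y == 0] == 0)
    return specificity if recall >= min_recall else recall - 1.0

def evolve(X, y, min_recall=0.9, weights=(1, 2, 5, 10, 20, 50),
           generations=30, seed=0):
    """Genetic algorithm whose initial chromosomes are the weight vectors of
    logistic regression models trained with different misclassification costs."""
    rng = np.random.default_rng(seed)
    pop = [fit_weighted_logreg(X, y, c) for c in weights]
    history = []
    for _ in range(generations):
        scores = np.array([fitness(w, X, y, min_recall) for w in pop])
        order = np.argsort(scores)[::-1]        # best first
        history.append(scores[order[0]])
        elite = [pop[i] for i in order[:2]]     # elitism: keep the two best
        children = []
        while len(children) < len(pop) - len(elite):
            a, b = (pop[i] for i in rng.choice(order[:4], size=2, replace=False))
            mask = rng.random(a.shape) < 0.5    # uniform crossover
            children.append(np.where(mask, a, b) + rng.normal(0, 0.05, a.shape))
        pop = elite + children
    scores = [fitness(w, X, y, min_recall) for w in pop]
    best = int(np.argmax(scores))
    history.append(scores[best])
    return pop[best], history

# Synthetic imbalanced data (hypothetical): 950 majority vs. 50 minority points.
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 1.0, (950, 2)),
               data_rng.normal(1.5, 1.0, (50, 2))])
y = np.concatenate([np.zeros(950), np.ones(50)]).astype(int)

best_w, history = evolve(X, y)
print("best fitness per generation:", np.round(history, 3))
```

Because the resulting chromosome is just a vector of linear coefficients plus an intercept, the final model stays as readable as an ordinary logistic regression model. The sketch evaluates fitness on the training data for brevity; a held-out set would be needed to judge generalization.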

The authors are with the Department of Information Systems, Faculty of Information Technology, IT4Innovations Centre of Excellence, Brno University of Technology, Božetěchova 1/2, 612 66 Brno, Czech Republic (e-mail: ihlosta@fit.vutbr.cz, istriz@fit.vutbr.cz, ikupcik@fit.vutbr.cz, zendulka@fit.vutbr.cz, hruska@fit.vutbr.cz).


Cite: Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška, "Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm," International Journal of Machine Learning and Computing, vol. 3, no. 2, pp. 214-218, 2013.
