Abstract—Sampling techniques for data mining applications can be broadly categorized into Random Sampling (RS), Active Learning (AL), and Progressive Sampling (PS). Progressive sampling techniques grow an initial sample up to the point beyond which model accuracy no longer improves significantly. These methods have been shown to be computationally efficient.
The choice of sampling schedule for progressive sampling remains an open research question: available schemes may either overshoot, producing a final sample larger than necessary, or grow the sample too slowly, requiring many iterations of the algorithm before convergence is reached. We demonstrate that using Batch Mode Uncertainty Sampling, a technique from active learning, to progressively grow the sample can significantly improve the performance of progressive sampling.
Through a series of trials on both simulated and real data, we show that our proposed Progressive Batch Mode Uncertainty Sampling (PBMUS) algorithm converges with a comparable or smaller number of data points, at higher accuracy, and in some cases in less computational time.
Index Terms—Active learning, uncertainty sampling,
progressive sampling, linear regression with local sampling,
random sampling, sampling, machine learning.
The authors are with George Mason University, Fairfax, VA 22030, USA
(e-mail: aelrafey@gmu.edu, jwojtusi@gmu.edu).
Cite: Amr ElRafey and Janusz Wojtusiak, "A Hybrid Active Learning and Progressive Sampling Algorithm," International Journal of Machine Learning and Computing vol. 8, no. 5, pp. 423-427, 2018.
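The loop the abstract describes can be sketched as follows: seed a small sample, fit a model, and stop once test accuracy plateaus; otherwise grow the sample by the batch of pool points the current model is least certain about. This is a minimal illustrative sketch, not the paper's algorithm — the logistic model, synthetic data, batch size, and stopping threshold are all hypothetical choices for demonstration.

```python
# Illustrative PBMUS-style loop: progressive sampling driven by
# batch-mode uncertainty sampling. All modeling choices here are
# hypothetical, not taken from the paper.
import math
import random

random.seed(0)

def make_data(n):
    # Two noisy features; label 1 when x + y > 0 (a linear boundary).
    data = []
    for _ in range(n):
        x, y = random.gauss(0, 1), random.gauss(0, 1)
        data.append(((x, y), 1 if x + y > 0 else 0))
    return data

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(samples, epochs=200, lr=0.1):
    # Plain gradient-descent logistic regression: 2 weights + bias.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x, y), label in samples:
            g = sigmoid(w[0] * x + w[1] * y + w[2]) - label
            w[0] -= lr * g * x
            w[1] -= lr * g * y
            w[2] -= lr * g
    return w

def predict_prob(w, point):
    x, y = point
    return sigmoid(w[0] * x + w[1] * y + w[2])

def accuracy(w, samples):
    hits = sum(1 for pt, lab in samples
               if (predict_prob(w, pt) >= 0.5) == (lab == 1))
    return hits / len(samples)

def pbmus(pool, test, batch=20, eps=0.005):
    # Seed with a small random sample, then grow the sample by the
    # batch of pool points whose predicted probability is closest to
    # 0.5 (i.e., the most uncertain), until accuracy stops improving.
    random.shuffle(pool)
    sample, pool = pool[:batch], pool[batch:]
    prev_acc = 0.0
    while pool:
        w = fit_logistic(sample)
        acc = accuracy(w, test)
        if acc - prev_acc < eps:  # accuracy plateau -> converged
            return w, len(sample), acc
        prev_acc = acc
        pool.sort(key=lambda s: abs(predict_prob(w, s[0]) - 0.5))
        sample += pool[:batch]
        pool = pool[batch:]
    return fit_logistic(sample), len(sample), prev_acc

pool, test = make_data(500), make_data(200)
w, n_used, acc = pbmus(pool, test)
print(n_used, round(acc, 3))
```

Because each new batch targets the region where the model is least certain, the sample typically converges with far fewer points than the full pool, which is the trade-off the abstract's trials measure against random and schedule-based growth.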