International Journal of Emerging Technology and Advanced Engineering
Website: (ISSN 2250-2459, Volume 2, Issue 4, April 2012)
An overview of classification algorithms for imbalanced datasets
Vaishali Ganganwar
Army Institute of Technology, Pune
vaishaliloni@
Abstract— Unbalanced data sets, a problem often found in real-world applications, can have a seriously negative effect on the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of unbalanced data sets. In this paper we present a brief review of existing solutions to the class-imbalance problem proposed at both the data and algorithmic levels. Even though a common practice for handling imbalanced data is to rebalance it artificially by oversampling and/or under-sampling, some researchers

The patient could lose his/her life because of the delay in the correct diagnosis and treatment. Similarly, if carrying a bomb is positive, then it is much more expensive to miss a terrorist who carries a bomb onto a flight than to search an innocent person.

The unbalanced data set problem appears in many real-world applications like text categorization, fault detection, fraud detection, oil-spill detection in satellite images,