An Oversampling Technique for Classifying Imbalanced Datasets

Advances in Business and Management Forecasting

ISBN: 978-1-78743-070-9, eISBN: 978-1-78743-069-3

Publication date: 26 October 2017

Abstract

We propose an oversampling technique to increase the true positive rate (sensitivity) when classifying imbalanced datasets (i.e., those in which one value of the target variable occurs with low frequency) and hence to improve overall performance measures such as balanced accuracy, G-mean, and the area under the receiver operating characteristic (ROC) curve (AUC). The method applies the Synthetic Minority Oversampling Technique (SMOTE) to only a selected portion of the dataset rather than to the entire dataset. We demonstrate the effectiveness of our oversampling method with four real and simulated datasets generated from three models.
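As a rough illustration of the quantities named in the abstract, the sketch below computes sensitivity, balanced accuracy, and G-mean from predicted labels, and generates SMOTE-style synthetic points by interpolating between minority-class neighbours. It is not the authors' implementation. The abstract does not say how the "selective portion" is chosen, so the sketch simply takes a caller-supplied subset of minority points. The function names (`imbalance_metrics`, `smote_on_subset`) and the parameters `k` and `seed` are assumptions made for illustration.

```python
import math
import random

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary target."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp

def imbalance_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, balanced accuracy, and G-mean."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

def smote_on_subset(minority_subset, n_new, k=2, seed=0):
    """SMOTE-style interpolation restricted to a caller-chosen subset.

    ASSUMPTION: how the subset is selected is not stated in the abstract;
    the caller decides which minority points to pass in.
    """
    if len(minority_subset) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_subset)
        others = [p for p in minority_subset if p is not base]
        # k nearest minority neighbours of the base point (Euclidean distance)
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# A classifier that always predicts the majority class on a 2-vs-8 split.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
metrics = imbalance_metrics(y_true, y_pred)

# Hypothetical subset of minority points selected for oversampling.
subset = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_on_subset(subset, n_new=5)
```

In this example the always-negative classifier is correct on 8 of 10 cases, yet its sensitivity and G-mean are both zero and its balanced accuracy is 0.5. This is why such measures, rather than plain accuracy, are used to judge classifiers on imbalanced data.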

Citation

Nguyen, S., Quinn, J. and Olinsky, A. (2017), "An Oversampling Technique for Classifying Imbalanced Datasets", Advances in Business and Management Forecasting (Advances in Business and Management Forecasting, Vol. 12), Emerald Publishing Limited, Leeds, pp. 63-80. https://doi.org/10.1108/S1477-407020170000012004

Publisher

Emerald Publishing Limited

Copyright © 2018 Emerald Publishing Limited