Random Sampling for imbalanced datasets #1004

Answered by MaxHalford
ZahirBilal asked this question in Q&A

Hey there.

I am doing research on the performance of classifiers on extremely imbalanced datasets. For this use case, I tested almost all of the classifiers in River and also in MOA (https://github.com/Waikato/moa).

Some of the people behind MOA are the same ones behind scikit-multiflow, which merged with creme to become River. It's a small world :)

My question is about the Random Sampling algorithms: were they developed based on existing literature (a paper)? If so, could you please point me to the publication?

No, not really. I got the idea from reading Oza and Russell's online bagging and boosting paper. I then realized that rejection sampling was the mechanism I needed to sample data online with a desired …
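
For readers who want to see the mechanism, here is a minimal sketch of rejection sampling used to under-sample a label stream online. It is not River's implementation. It assumes the goal is a target class distribution, and the names `RejectionUnderSampler`, `accept`, and `desired_dist` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


class RejectionUnderSampler:
    """Illustrative only: a hypothetical helper, not part of River's API."""

    def __init__(self, desired_dist, seed=None):
        self.desired_dist = desired_dist  # assumed target share per class
        self.seen = Counter()             # class counts observed so far
        self.rng = random.Random(seed)

    def accept(self, y):
        """Return True if a sample with label y should be kept."""
        self.seen[y] += 1
        g_y = self.desired_dist.get(y, 0.0)
        if g_y == 0.0:
            return False  # class is not wanted in the output at all
        # Rejection-sampling constant: the largest desired/observed ratio
        # among the classes seen so far. Raw counts are enough because the
        # normalising total cancels in the acceptance ratio below.
        m = max(
            self.desired_dist[c] / n
            for c, n in self.seen.items()
            if self.desired_dist.get(c, 0.0) > 0.0
        )
        ratio = (g_y / self.seen[y]) / m  # always in (0, 1]
        return self.rng.random() < ratio


# Synthetic stream with roughly 10% positives, resampled towards 50/50.
label_rng = random.Random(0)
stream = [1 if label_rng.random() < 0.1 else 0 for _ in range(20_000)]
sampler = RejectionUnderSampler({0: 0.5, 1: 0.5}, seed=42)
kept = [y for y in stream if sampler.accept(y)]
share_minority = sum(kept) / len(kept)
```

In this sketch the most under-represented class is always kept, while the other classes are kept with a probability that shrinks as they become over-represented, so the retained samples drift towards the target shares without storing the stream.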

Answer selected by MaxHalford