Random Sampling for imbalanced datasets #1004
Hello guys, I am doing research on the performance of classifiers on extremely imbalanced datasets. For this use case I tested almost all classifiers in River and also in MOA (https://github.com/Waikato/moa). So far the only algorithms that performed OK were the Random Sampling algorithms (both up and down) as well as the UOB algorithms in MOA. My question is about the Random Sampling algorithms: were they developed based on some literature (a paper)? If yes, please point me to the publication. Also, could someone please explain what is the parameter ( Best wishes,
Hey there.
Some of the people behind MOA are the same ones behind scikit-multiflow, which merged with creme to become River. It's a small world :)
No, not really. I got the idea from reading Oza and Russell's online bagging and boosting paper. I then realized that rejection sampling was the mechanism I needed to sample data online with a desired target distribution. I wrote a blog post about it, which led to the I can't find any of my notes concerning I hope that helps.
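To make the rejection sampling idea a bit more concrete, here is a minimal sketch of online under-sampling towards a target class distribution. It is not River's actual implementation: the class name `RejectionUnderSampler`, its `accept` method, and the synthetic stream are all hypothetical and only meant to illustrate the mechanism. Each incoming label is kept with probability g(y) / (M * f(y)), where f is the class distribution observed so far, g is the desired one, and M is the largest g/f ratio among the classes seen so far.

```python
import random
from collections import Counter


class RejectionUnderSampler:
    """Hypothetical helper: decides online whether to keep each sample."""

    def __init__(self, desired_dist, seed=None):
        # desired_dist maps each class to its target proportion (g).
        # Assumption: every class in the stream appears in desired_dist.
        self.desired_dist = desired_dist
        self.seen = Counter()  # running class counts
        self.n = 0             # total number of samples seen
        self.rng = random.Random(seed)

    def accept(self, y):
        # Update the observed distribution f with the new label.
        self.seen[y] += 1
        self.n += 1
        f = {c: k / self.n for c, k in self.seen.items()}
        g = self.desired_dist
        # M bounds g/f from above, so the acceptance ratio is at most 1.
        M = max(g[c] / f[c] for c in f)
        ratio = g[y] / (M * f[y])
        return self.rng.random() < ratio


# Synthetic stream with about 1% positives (made-up data for illustration).
stream_rng = random.Random(42)
sampler = RejectionUnderSampler({0: 0.5, 1: 0.5}, seed=7)
kept = Counter()
for _ in range(200_000):
    y = 1 if stream_rng.random() < 0.01 else 0
    if sampler.accept(y):
        kept[y] += 1

print(kept)
```

In this sketch the class that is rarest relative to its target share is always kept, and the others are thinned out, so the kept samples drift towards the desired distribution as the counts stabilise. A classifier would only be trained on the samples for which `accept` returns `True`.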