[AAAI 2024] A Provably Accurate Randomized Sampling Algorithm for Logistic Regression

Code repository for the paper:

Agniva Chowdhury and Pradeep Ramuhalli. A Provably Accurate Randomized Sampling Algorithm for Logistic Regression. In Proceedings of the 38th AAAI Conference on Artificial Intelligence, 2024.

Technical Appendix

The technical appendix of the paper is in TechnicalAppendix.pdf.

Datasets

Cardiovascular disease dataset (cardio): cardio_train.csv (sourced from here)
Bank customer churn prediction dataset (churn): Bank Customer Churn Prediction.csv (sourced from here)
Default of credit card clients dataset (default): default of credit card clients.csv (sourced from here)

Code

To compute row leverage scores of a matrix: leverage_scores.py
To perform leverage score, l2s, or uniform sampling: row_sampling.py
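
As a rough illustration of what these two scripts are for, here is a minimal sketch of the standard textbook construction. It is not the code in leverage_scores.py or row_sampling.py. The function names `row_leverage_scores` and `sample_rows`, their signatures, and the choice of a thin QR factorization are assumptions made for this example, and the input matrix is assumed to have full column rank.

```python
import numpy as np


def row_leverage_scores(A):
    """Row leverage scores of A, assuming A has full column rank.

    Hypothetical helper: the i-th score is the squared Euclidean norm of
    the i-th row of Q, where A = QR is a thin QR factorization.
    """
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)


def sample_rows(A, s, probs=None, rng=None):
    """Sample s rows of A with replacement and rescale them.

    Hypothetical helper: row i is drawn with probability probs[i]
    (uniform if probs is None) and scaled by 1 / sqrt(s * probs[i]),
    so that the sampled matrix SA satisfies E[SA.T @ SA] = A.T @ A.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    if probs is None:
        probs = np.full(n, 1.0 / n)
    idx = rng.choice(n, size=s, replace=True, p=probs)
    scale = 1.0 / np.sqrt(s * probs[idx])
    return idx, scale[:, None] * A[idx]


# Usage on synthetic data (not one of the datasets above).
data_rng = np.random.default_rng(0)
A = data_rng.standard_normal((200, 5))
lev = row_leverage_scores(A)
idx, SA = sample_rows(A, s=50, probs=lev / lev.sum(), rng=0)
```

For a full-column-rank matrix the leverage scores sum to the number of columns, which gives a quick sanity check on the output. Passing `probs=None` gives uniform sampling with the same rescaling.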

The code for l2s sampling has been sourced from here.

Notebooks

To reproduce the experiments in the paper, run the following Jupyter notebooks:

For the cardiovascular disease dataset: cardio_train.ipynb
For the bank customer churn prediction dataset: Bank_Customer_Churn_Prediction.ipynb
For the default of credit card clients dataset: default_of_credit_card_clients.ipynb

Citation

@article{Chowdhury_Ramuhalli_2024,
  title={A Provably Accurate Randomized Sampling Algorithm for Logistic Regression},
  author={Chowdhury, Agniva and Ramuhalli, Pradeep},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={10},
  year={2024},
  pages={11597-11605},
  url={https://ojs.aaai.org/index.php/AAAI/article/view/29042},
  doi={10.1609/aaai.v38i10.29042}
}

Please contact Agniva Chowdhury for questions or comments.
