Daniel Cerkoney and Anthony Young
The aim of this project is to study factors contributing to the survival of cancer patients in the US. To that end, we used Kaplan-Meier estimation and Cox proportional hazards regression to analyze all brain tumor cases recorded in the Surveillance, Epidemiology, and End Results (SEER) database from 2000 to 2020. Data cleaning, pre-processing, and the train-test split were performed with scikit-learn, and the survival models were fit with the lifelines library.
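To make the Kaplan-Meier step concrete, here is a minimal, dependency-free sketch of the product-limit estimate that a library fitter such as the one in lifelines computes. It is an illustration only, not the project's actual code: the function name `kaplan_meier` and the toy follow-up data are hypothetical and are not drawn from SEER.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    durations: follow-up time for each patient
    events:    1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at time t are no longer at risk afterwards.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: follow-up in months, 1 = death observed, 0 = censored.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
events = [1, 1, 0, 1, 1, 0, 0, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times, which is how the estimator handles incomplete follow-up.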
Here are some slides describing the project. Below are some examples of predicted survival functions adjusted for year of diagnosis, obtained with the Kaplan-Meier estimator and Cox regression, respectively: