This repository contains a Python implementation of a data preprocessing pipeline. The pipeline performs the preprocessing tasks that machine learning workflows typically require: handling missing data, encoding categorical data, splitting the dataset into training and testing sets, and feature scaling.
- Data Preprocessing
  - Handling missing data.
  - Encoding categorical data (both independent and dependent variables).
  - Splitting data into training and testing sets.
  - Feature scaling (standardization).
- User-Friendly Interface
  - Easily adaptable to different datasets.
  - Modular code structure for each preprocessing step.
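
The four preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes pandas and scikit-learn are installed, and the toy DataFrame with its column names (`region`, `age`, `score`, `outcome`) is hypothetical and not taken from Covid_Data.csv.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Hypothetical toy data; these columns are NOT from Covid_Data.csv.
df = pd.DataFrame({
    "region": ["north", "south", "east", "north", "south", "east", "north", "south"],
    "age": [34.0, np.nan, 51.0, 29.0, 62.0, np.nan, 45.0, 38.0],
    "score": [1.2, 3.4, np.nan, 2.2, 4.1, 3.0, 2.7, 1.9],
    "outcome": ["no", "yes", "yes", "no", "yes", "no", "no", "yes"],
})

X = df[["region", "age", "score"]].copy()  # independent variables
y = df["outcome"]                          # dependent variable

# 1. Handle missing data: replace NaN in numeric columns with the column mean.
numeric_cols = ["age", "score"]
X[numeric_cols] = SimpleImputer(strategy="mean").fit_transform(X[numeric_cols])

# 2. Encode categorical data: one-hot encode the independent categorical
#    column, label-encode the dependent variable.
encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(), ["region"])],
    remainder="passthrough",  # keep the numeric columns after the one-hot columns
    sparse_threshold=0,       # always return a dense NumPy array
)
X_encoded = encoder.fit_transform(X)
y_encoded = LabelEncoder().fit_transform(y)

# 3. Split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.25, random_state=0
)

# 4. Feature scaling (standardization): fit on the training split only,
#    then apply the same transformation to the test split.
#    Columns 0-2 are the one-hot columns; columns 3 onward are numeric.
scaler = StandardScaler()
X_train[:, 3:] = scaler.fit_transform(X_train[:, 3:])
X_test[:, 3:] = scaler.transform(X_test[:, 3:])
```

Fitting the scaler on the training split alone keeps test-set statistics out of the transformation. To adapt the sketch to another dataset, the column lists and the one-hot column offset would need to change.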

The pipeline uses the Covid_Data.csv file, which contains anonymized COVID-related data. Ensure the dataset is present in the working directory before running the notebook.
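
One way to check that the dataset is in place before loading it is sketched below. The helper name `load_dataset` is hypothetical and does not come from the notebook; the sketch assumes pandas is installed.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path="Covid_Data.csv"):
    """Load the CSV, failing early with a clear message if it is missing.

    Hypothetical helper for illustration; not part of the notebook.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found (working directory: {Path.cwd()}). "
            "Place the dataset in the working directory."
        )
    return pd.read_csv(csv_path)
```

Calling `load_dataset()` with no arguments looks for Covid_Data.csv relative to the current working directory.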
|-- Covid_Data.csv
|-- Cleaned_Covid_Data.csv
|-- model.ipynb
|-- README.md
This project is licensed under the MIT License.
Created by Arpan Surin. Feel free to contact me with any questions or suggestions!