Data preprocessing is a required first step before any machine learning algorithm can be applied. The algorithms learn from the data, and the quality of what they learn depends heavily on having the right inputs for the problem at hand. These inputs are called features.
- Given a dataset, the CLI provides several options for preprocessing the data.
- Options:
  - Data Description
  - Handling NULL Values
  - Encoding Categorical Data
  - Feature Scaling
  - Data Visualisation
- You can also DOWNLOAD⬇️ the modified dataset.
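
To give a feel for what three of the options above (handling NULL values, encoding categorical data, feature scaling) typically involve, here is a minimal sketch in plain Python. It is not this repo's code: the function names, the toy columns, and the specific choices (mean imputation, integer label encoding, min-max scaling) are assumptions made for illustration, and the CLI may implement these steps differently or offer other strategies.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values.

    Assumes at least one value is present; an all-None column would
    raise statistics.StatisticsError.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def label_encode(values):
    """Map each distinct category to an integer, assigned in sorted order."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def min_max_scale(values):
    """Rescale numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical toy columns, standing in for two columns of a CSV file.
ages = [20, None, 40, 30]
cities = ["Pune", "Delhi", "Pune", "Agra"]

ages_filled = fill_missing(ages)          # [20, 30, 40, 30]
ages_scaled = min_max_scale(ages_filled)  # [0.0, 0.5, 1.0, 0.5]
city_codes = label_encode(cities)         # [2, 1, 2, 0]

print(ages_filled, ages_scaled, city_codes)
```

Mean imputation and min-max scaling are only one possible choice for each step. Median imputation or standardisation, for example, can be more suitable when a column contains outliers.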
- Clone this Repo:
git clone https://github.com/priyavratuniyal/ML_Preprocessor_Pipeline_CLI.git
cd ML_Preprocessor_Pipeline_CLI/
pip3 install -r requirements.txt
- Now run:
python3 main.py [Dataset's Path]
Example: python3 main.py dataset.csv
Note: A sample data file is provided with this repo, so you can run python3 main.py sample_data.csv