Implemented a research paper on regression analysis from scratch to predict the CO2 emissions of different vehicles.
Achieved an accuracy of 99.67% using the Random Forest Regressor, improving accuracy by 1.24%.
Research Paper link: https://www.researchgate.net/publication/379941902
This repository contains a machine learning project to predict CO2 emissions using multiple regression models, including:
- Linear Regression
- Decision Tree Regressor
- Random Forest Regressor
- K-Nearest Neighbors (KNN) Regressor
- XGBoost Regressor
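
A minimal sketch of how these five regressors might be set up side by side, assuming scikit-learn (and optionally the `xgboost` package) is installed. The synthetic data from `make_regression` and all hyperparameters are placeholders, not the project's dataset or settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Placeholder data standing in for the vehicle dataset (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

# Hyperparameters below are illustrative, not the project's tuned values.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

# XGBoost ships as a separate package; add it only if it is available.
try:
    from xgboost import XGBRegressor
    models["XGBoost"] = XGBRegressor(n_estimators=50, random_state=0)
except ImportError:
    pass

# Fit every model on the same data so they can be compared afterwards.
fitted = {name: model.fit(X, y) for name, model in models.items()}
```

Keeping the models in one dictionary makes it easy to loop over them with a shared evaluation routine.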

Preprocessing steps:
- Handling Missing Values:
  - Checked for missing data and filled or dropped values as necessary.
- Feature Engineering:
  - Created derived features such as `Fuel Consumption per Cylinder` and `CO2 per Liter`.
  - Combined redundant features like city and highway fuel consumption.
- Feature Scaling:
  - Standardized numerical features for models sensitive to scaling (e.g., KNN, SVR).
- Categorical Encoding:
  - Used one-hot encoding for categorical features like `Transmission` and `Fuel Type`.
- Train-Test Split:
  - Split the dataset into 80% training and 20% testing sets.
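
The preprocessing steps above can be sketched with pandas and scikit-learn. This is an illustration under stated assumptions, not the repository's code: the toy DataFrame, its column names (`Engine Size`, `Cylinders`, `Fuel City`, `Fuel Hwy`, `CO2`), the median fill, the simple average used to combine city and highway consumption, and the `random_state` are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the real dataset's columns and values may differ.
df = pd.DataFrame({
    "Engine Size": [2.0, 2.4, 3.5, None, 1.6, 5.0, 2.0, 3.0, 1.8, 4.0],
    "Cylinders": [4, 4, 6, 6, 4, 8, 4, 6, 4, 8],
    "Fuel City": [9.9, 11.2, 12.7, 12.1, 8.5, 16.0, 10.0, 12.0, 9.0, 15.0],
    "Fuel Hwy": [6.7, 7.7, 9.1, 8.7, 6.5, 11.0, 7.0, 8.5, 6.8, 10.5],
    "Transmission": ["A", "M", "A", "A", "M", "A", "M", "A", "M", "A"],
    "Fuel Type": ["X", "Z", "Z", "X", "X", "Z", "X", "Z", "X", "Z"],
    "CO2": [196, 221, 255, 244, 175, 320, 200, 245, 185, 300],
})

# Missing values: inspect the counts, then fill the numeric gap with the
# median (one possible strategy; dropping rows is another).
print(df.isna().sum())
df["Engine Size"] = df["Engine Size"].fillna(df["Engine Size"].median())

# Feature engineering: merge city/highway consumption into one column
# (a plain average is assumed here) and derive a per-cylinder ratio.
# `CO2 per Liter` is not reproduced: its definition is not given, and a
# feature computed from the target would leak it into the inputs.
df["Fuel Comb"] = (df["Fuel City"] + df["Fuel Hwy"]) / 2
df["Fuel Consumption per Cylinder"] = df["Fuel Comb"] / df["Cylinders"]
df = df.drop(columns=["Fuel City", "Fuel Hwy"])

# Categorical encoding: one-hot encode the categorical columns.
df = pd.get_dummies(df, columns=["Transmission", "Fuel Type"])

# Train-test split: 80% training, 20% testing.
X = df.drop(columns=["CO2"]).astype(float)
y = df["CO2"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Feature scaling: fit the scaler on the training rows only, then apply
# the same transformation to the test rows.
numeric_cols = ["Engine Size", "Cylinders", "Fuel Comb",
                "Fuel Consumption per Cylinder"]
scaler = StandardScaler()
X_train = X_train.copy()
X_test = X_test.copy()
X_train[numeric_cols] = scaler.fit_transform(X_train[numeric_cols])
X_test[numeric_cols] = scaler.transform(X_test[numeric_cols])
```

Scaling is done after the split in this sketch so that test-set statistics do not influence the scaler; the order used in the actual project is not stated here.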

| Model | Accuracy (%) |
|---|---|
| Random Forest Regressor | 99.67 |
| K-Nearest Neighbors (KNN) Regressor | 99.11 |
| Linear Regression | 99.03 |
| Decision Tree Regressor | 98.23 |
| XGBoost Regressor | 84.16 |
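
The metric behind "Accuracy (%)" is not specified here. A common convention for regressors, assumed in this sketch and not confirmed by the project, is to report the coefficient of determination R² as a percentage. The values below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up true and predicted CO2 values, for illustration only.
y_true = np.array([196.0, 221.0, 255.0, 244.0, 175.0])
y_pred = np.array([198.0, 219.0, 251.0, 247.0, 176.0])

# Manual computation: R^2 = 1 - SS_res / SS_tot
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# The same quantity via scikit-learn, expressed as a percentage.
r2 = r2_score(y_true, y_pred)
accuracy_pct = 100.0 * r2
print(f"R^2 = {r2:.4f} ({accuracy_pct:.2f}%)")
```

If the project used a different metric (for example one based on mean absolute percentage error), the numbers in the table would need to be read accordingly.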
Below is a bar graph comparing the accuracies of the models:
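
A comparable chart can be regenerated from the values in the results table. This is a minimal sketch, assuming matplotlib is installed; the output filename `model_accuracy.png`, the axis limits, and the shortened labels are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Accuracy values copied from the results table.
accuracies = {
    "Random Forest": 99.67,
    "KNN": 99.11,
    "Linear Regression": 99.03,
    "Decision Tree": 98.23,
    "XGBoost": 84.16,
}

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(list(accuracies.keys()), list(accuracies.values()))
ax.set_ylabel("Accuracy (%)")
ax.set_title("Model accuracy comparison")
ax.set_ylim(80, 100)  # zoomed in so the small differences stay visible
fig.tight_layout()
fig.savefig("model_accuracy.png")
```
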
Clone this repository:
git clone https://github.com/Laasyakshara25/Co2-Emissions-Forecasting-Using-Regression.git