Skip to content

Latest commit

 

History

History
60 lines (43 loc) · 3.03 KB

README.md

File metadata and controls

60 lines (43 loc) · 3.03 KB

Credit Scoring

This project aims to predict the creditworthiness of individuals using machine learning techniques. The dataset used for training and testing the model is a collection of financial and demographic data from credit applicants.

Overview

Credit scoring is the process of evaluating an individual's creditworthiness, or the likelihood that they will be able to repay a loan. Credit scores are used by lenders to assess the risk of lending to a particular borrower and determine the terms of the loan.

This project uses machine learning algorithms to predict credit scores based on a variety of financial and demographic data. The model is trained on a dataset of historical credit data and tested on a separate dataset to evaluate its performance.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

To run this project, you will need:

  • Python 3.6 or higher
  • NumPy
  • Pandas
  • Scikit-learn You can install these packages using pip:

Copy code pip install numpy pandas scikit-learn

Installation

To install this project, clone the repository and navigate to the project directory:

Copy code git clone https://github.com/mehdy28/Credit_Scoring.git

cd Credit_Scoring

Usage

To train the model, run the following command: Copy code python train.py To test the model, run the following command: Copy code python test.py

Data

The dataset used in this project is a collection of financial and demographic data from credit applicants. It includes information on income, employment, credit history, and other variables that may be relevant for predicting credit scores.

The data was sourced from a credit bureau and is subject to strict confidentiality and privacy regulations. It has been cleaned and preprocessed to remove any personal identifiers and protect the privacy of the individuals included in the dataset.

Model

The machine learning model used in this project is a gradient boosting regressor. The model was trained on a subset of the data and tested on a separate dataset to evaluate its performance.

The model's hyperparameters were tuned using a grid search approach, and the final values chosen were based on the best performance on the validation set.

Results

The model achieved a mean absolute error of 3.4 on the test dataset, indicating that the model's predictions are generally within 3.4 points of the true credit scores.

Future Work

There are several areas for potential future work on this project:

Incorporating additional features or data sources could potentially improve the model's performance. Using more advanced machine learning techniques, such as deep learning models, may also improve the model's ability to predict credit scores. Developing a system for integrating the model's predictions into credit scoring processes and providing lenders with more accurate risk assessments. Credits The dataset used in this project was sourced from a credit bureau and is subject to strict confidentiality and privacy regulations.