Finboost Machine Learning is a financial chatbot project that can answer finance-related questions in Indonesian. The project leverages deep learning models with fine-tuning on the `cahya/gpt2-small-indonesian-522M` model for generative AI and uses an LSTM model for question suggestion.
- Python: Primary programming language used for the project.
- Flask: Web framework for building the REST API.
- TensorFlow: Deep learning framework used for model training and inference.
- Pandas: Library for data manipulation and analysis.
- Transformers: Hugging Face library for working with transformer models.
- NumPy: Library for numerical operations.
- Matplotlib: Library for creating visualizations.
- Google Colab: Cloud service for running Jupyter notebooks.
- GPT-2: Transformer model fine-tuned for generating responses.
- LSTM: Model used for question suggestion based on user input and profile data.
This project uses the following architecture:
- Data Preparation: Preparing question and answer data in CSV format.
- Model Fine-Tuning: Fine-tuning the `cahya/gpt2-small-indonesian-522M` model using the dataset for generative AI.
- LSTM Model: Using an LSTM model for question suggestion.
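As a rough illustration of the data preparation step, the sketch below turns question/answer rows from a CSV file into one training string per pair, the form a causal language model such as GPT-2 is usually fine-tuned on. The column names `question` and `answer`, the `Pertanyaan:`/`Jawaban:` prompt template, and the `<|endoftext|>` end-of-sequence marker are assumptions made for this example; the notebooks may use a different header and format.

```python
import csv
import io

# Hypothetical column names; the real CSV header may differ.
QUESTION_COL = "question"
ANSWER_COL = "answer"


def rows_to_training_text(csv_file, eos_token="<|endoftext|>"):
    """Turn question/answer rows into one training string per pair."""
    reader = csv.DictReader(csv_file)
    samples = []
    for row in reader:
        question = (row.get(QUESTION_COL) or "").strip()
        answer = (row.get(ANSWER_COL) or "").strip()
        if not question or not answer:
            continue  # skip rows with a missing question or answer
        # Assumed prompt template, not necessarily the one used in the notebook.
        samples.append(f"Pertanyaan: {question}\nJawaban: {answer}{eos_token}")
    return samples


# Small in-memory stand-in for a real dataset file.
demo_csv = io.StringIO(
    "question,answer\n"
    "Apa itu inflasi?,Inflasi adalah kenaikan harga secara umum.\n"
    "Apa itu saham?,\n"
)
samples = rows_to_training_text(demo_csv)
print(samples[0])
```

To try it on a real file, pass an open file handle instead of the in-memory demo, for example `open("data/generative-ai/finansial-dataset-v2.csv", newline="", encoding="utf-8")`, after adjusting the column names to match the actual header.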
File/Folder Name | Description |
---|---|
`data/generative-ai/finansial-dataset-v2.csv` | Example dataset generative-ai in CSV format |
`data/question-suggestion/data.csv` | Example dataset question suggestion in CSV format |
`notebooks/generative_ai.ipynb` | Notebook for fine-tuning and using the generative AI model |
`notebooks/question_suggestion.ipynb` | Notebook for training and using the LSTM model for question suggestion |
`models/gen-ai` | Directory to store the fine-tuned generative AI model (create it yourself by running `notebooks/generative_ai.ipynb`) |
`models/question-suggestion` | Directory to store the question suggestion model (create it yourself by running `notebooks/question_suggestion.ipynb`) |
`preprocessing/combined_dataset.ipynb` | Notebook for preprocessing the collected dataset |
`scripts/` | Folder to save the script for future features |
`generative-ai/` | Folder to deploy generative-ai model |
`generative-ai-v2/` | Folder to deploy generative-ai-v2 (Coming soon) |
`generative-ai-v3/` | Folder to deploy generative-ai-v3 (Coming soon) |
`question-suggestion/` | Folder to deploy question-suggestion model |
`requirements.txt` | List of dependencies for this project |
- Clone the repository:
  `git clone https://github.com/username/finboost-ml.git`
  `cd finboost-ml`
- Create a virtual environment:
  `python -m venv venv`
  `source venv/bin/activate` (on Windows, use `venv\Scripts\activate`)
- Install dependencies:
  `pip install -r requirements.txt`
- Download and place the datasets:
  - Place `finansial-dataset-v2.csv` in the `data/generative-ai/` directory.
  - Place `data.csv` in the `data/question-suggestion/` directory.
- Create the model folders:
  - Create a `models` folder in the root directory: `mkdir models`
  - NOTE: When running the Flask application, you must also create a `models` folder inside the `generative-ai` and `question-suggestion` directories: `mkdir -p generative-ai/models` and `mkdir -p question-suggestion/models`
- Run the notebooks to obtain the models:
  - Open `notebooks/generative_ai.ipynb` in Google Colab or Jupyter Notebook and execute all the cells to fine-tune the `cahya/gpt2-small-indonesian-522M` model and generate the model files.
  - Open `notebooks/question_suggestion.ipynb` in Google Colab or Jupyter Notebook and execute all the cells to train the LSTM model and generate the model files.
- Download and place the models:
  - Place the fine-tuned generative AI model in each `models/gen-ai/` directory.
  - Place the question suggestion model files (`tokenizer.pickle`, `label_dict.pickle`, and `model_question_suggestion.h5`) in each `models/question-suggestion/` directory.
- Run the Flask applications:
  `python generative-ai/app/main.py`
  `python question-suggestion/app/main.py`
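Before starting the Flask applications, it can help to confirm that the model files from the setup steps are actually in place. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_artifacts` is hypothetical, the file names are the ones listed in the setup steps, and the directory passed in is an assumption you should adjust to wherever you placed the files.

```python
from pathlib import Path

# File names taken from the setup steps; the directory layout is an assumption.
QUESTION_SUGGESTION_FILES = (
    "tokenizer.pickle",
    "label_dict.pickle",
    "model_question_suggestion.h5",
)


def missing_artifacts(model_dir, required=QUESTION_SUGGESTION_FILES):
    """Return the required file names that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


# Check the root-level model folder; repeat for any other folder you use.
missing = missing_artifacts("models/question-suggestion")
if missing:
    print("Missing model files:", ", ".join(missing))
else:
    print("All question suggestion model files are in place.")
```

The function only reports what is missing and never creates or downloads anything, so it is safe to run repeatedly while working through the setup.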
Run the notebooks:
- Run the Generative AI notebook:
  - Execute all cells in the `generative_ai.ipynb` notebook to fine-tune the `cahya/gpt2-small-indonesian-522M` model and generate responses.
- Run the Question Suggestion notebook:
  - Execute all cells in the `question_suggestion.ipynb` notebook to train the LSTM model and make question suggestions.
- Fine-Tuning:
  - Use the fine-tuning script `generative_ai.py` to fine-tune the `cahya/gpt2-small-indonesian-522M` model on new data.
- Question Suggestion:
  - Use the `question_suggestion.py` script to train and use the LSTM model for question suggestion.
- Fork the repository
- Create a new branch (`git checkout -b feature-branch`)
- Commit your changes (`git commit -am 'Add new feature'`)
- Push to the branch (`git push origin feature-branch`)
- Create a new Pull Request