
Venkatesh-99/Harmful-Bacteria-Classification


Harmful Bacteria Classification with Machine Learning

This repository contains a Jupyter Notebook for classifying bacteria based on their features using various machine learning models. The notebook ('Bacteria_Classification.ipynb') analyzes a dataset ('bacteria_list_200.csv') that includes information about different bacteria, such as their name, family, habitat, and whether they are harmful to humans.

Dataset Details

The dataset was downloaded from Kaggle and is available at: https://www.kaggle.com/datasets/kanchana1990/bacteria-dataset

Repository Structure

Bacteria_Classification.ipynb

Jupyter Notebook containing the complete analysis and classification process.

bacteria_list_200.csv

Dataset file containing the information on the bacteria.

Dependencies

Ensure you have the following dependencies installed:

  • Python 3.x
  • pandas
  • matplotlib
  • seaborn
  • scikit-learn
  • xgboost

You can install these dependencies using pip:

pip install pandas matplotlib seaborn scikit-learn xgboost
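
To confirm the installation before opening the notebook, a short check like the one below can help. It is only a sketch: the `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of this repository. Note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Maps each pip package name to the module name used in "import ...".
# This mapping is an illustrative helper, not part of the repository.
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Install with: pip install " + " ".join(missing))
else:
    print("All dependencies found.")
```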

Usage

  1. Clone the repository:

    mkdir bacteria_classification
    cd bacteria_classification
    git clone https://github.com/Venkatesh-99/Harmful-Bacteria-Classification.git
  2. Open and run the Jupyter Notebook:

    Launch Jupyter Notebook from the cloned 'Harmful-Bacteria-Classification' directory and open 'Bacteria_Classification.ipynb'. Run the cells in order to execute each analysis step.
  3. Follow the notebook instructions:

    • Load and preprocess the dataset (bacteria_list_200.csv).
    • Explore data insights and visualizations, such as pie charts and heatmaps.
    • Implement machine learning models including Random Forest, AdaBoost, Gradient Boosting, Logistic Regression, SVM, and XGBoost.
    • Evaluate model performance using metrics like accuracy, confusion matrices, and ROC curves.
    • Customize and adapt the notebook for further analysis or experimentation.
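
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`family`, `habitat`, `harmful`) and the toy rows are hypothetical placeholders standing in for 'bacteria_list_200.csv', and only two of the listed models are shown. The `build_pipeline` helper is likewise an assumption made for this example.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the real dataset; column names and values are
# hypothetical placeholders. With the real file you would start from
# something like: df = pd.read_csv("bacteria_list_200.csv")
df = pd.DataFrame({
    "family": ["fam_a", "fam_b", "fam_c", "fam_a", "fam_b", "fam_c"] * 5,
    "habitat": ["soil", "water", "gut", "gut", "soil", "water"] * 5,
    "harmful": ["Yes", "No", "Yes", "Yes", "No", "No"] * 5,
})

features = ["family", "habitat"]
X = df[features]
y = (df["harmful"] == "Yes").astype(int)  # 1 = harmful, 0 = not harmful

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def build_pipeline(model):
    """One-hot encode the categorical columns, then fit the given model."""
    encode = ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), features)]
    )
    return Pipeline([("encode", encode), ("model", model)])


models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

results = {}
for name, model in models.items():
    pipe = build_pipeline(model).fit(X_train, y_train)
    pred = pipe.predict(X_test)
    proba = pipe.predict_proba(X_test)[:, 1]  # probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
        "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
    }
    print(name, "accuracy:", round(results[name]["accuracy"], 3))
```

Wrapping the encoder and the model in one `Pipeline` keeps the encoding fitted on the training split only, so the same loop can be extended with AdaBoost, Gradient Boosting, SVM, or XGBoost without repeating the preprocessing.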

Notes

  • Adjust hyperparameters and other model settings in the notebook to suit the dataset and your analysis goals.

  • Refer to markdown cells and comments within the notebook for detailed explanations of each step and analysis result.

  • Extend the analysis with additional visualizations, model optimizations, or new machine learning algorithms as needed.
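
As one possible way to approach the hyperparameter adjustment mentioned above, the sketch below runs a small grid search with scikit-learn's `GridSearchCV`. It uses synthetic data from `make_classification` rather than the bacteria dataset, and the parameter grid is an arbitrary example, not a recommendation taken from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data used purely for illustration;
# swap in the preprocessed bacteria features and labels when adapting this.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Small example grid; the values are arbitrary starting points.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```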
