
Pollen ID

Setup Instructions

Basic Setup

  1. Clone this git repo: `git clone https://github.com/beelabhmc/pollen_id`
  2. Install Anaconda
  3. Create the conda environment: `conda env create -f environment.yml` (this must be done in the `pollen_id` directory)
  4. Install PyTorch (see the PyTorch installation instructions). This is not included in the environment.yml file because the install command is platform dependent.
  5. Activate the conda environment: `conda activate pollen_id`
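
Once the environment is activated, a quick check like the one below (a minimal sketch, not part of the repo) can confirm that PyTorch installed correctly:

```python
# Minimal sanity check: confirm PyTorch imports inside the activated
# pollen_id environment and report whether a GPU is visible.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```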

Data Setup & Processing

  1. Download all the pollen slide images from the Google Drive folder (they may download as multiple .zip files; unzip and combine them into one folder manually)
  2. Move those files into the `pollen_id` folder.
    • Note: the `pollen_slides` folder should sit directly inside `pollen_id`, with each of the individual species folders directly inside `pollen_slides`, as shown below:
├── pollen_slides
│  ├── Acmispon glaber
│  ├── Amsinckia intermedia
│  ⋮
│  ├── Sambucus nigra
│  └── Solanum umbelliferum
  3. Download the `model.yml.gz` file (this is used for edge detection during pollen segmentation)
    • After downloading, unzip it, and make sure it is in the root `pollen_id` folder and named `model.yml`
  4. Run the intake data script: `python intake_data.py`
    • This will go through all the images in the `pollen_slides` folder and create a database that categorizes them based on their folder and file name
  5. Run the pollen extraction script: `python extract_pollen.py`
    • This extracts the individual pollen grains from each pollen slide image and stores them in a new `pollen_grains` folder (see the sketch after this list)
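
For intuition, a rough sketch of what the segmentation step does with the edge-detection model is shown below. It is not the repository's actual code: the file paths, thresholds, and size cutoff are illustrative assumptions, and it assumes opencv-contrib-python (for `cv2.ximgproc`) is installed.

```python
# Rough sketch (not the repo's code) of segmenting pollen grains from one slide
# image with OpenCV's structured edge detection model (model.yml). Paths and
# thresholds below are illustrative assumptions.
import os
import cv2
import numpy as np

detector = cv2.ximgproc.createStructuredEdgeDetection("model.yml")

slide = cv2.imread("pollen_slides/Acmispon glaber/example_slide.png")  # hypothetical file
rgb = cv2.cvtColor(slide, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0

# Edge probability map in [0, 1], then a binary mask of likely grain outlines.
edges = detector.detectEdges(rgb)
mask = (edges > 0.2).astype(np.uint8) * 255

# Crop a bounding box around each sufficiently large contour and save it.
os.makedirs("pollen_grains", exist_ok=True)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for i, contour in enumerate(contours):
    x, y, w, h = cv2.boundingRect(contour)
    if w * h > 1000:  # skip tiny detections; the real size cutoff is an assumption
        cv2.imwrite(f"pollen_grains/grain_{i}.png", slide[y:y + h, x:x + w])
```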

Now you are ready to run the machine learning code.

Machine Learning Code

This repository contains code for training and testing two different networks: a CNN that uses transfer learning with ResNet-50 for feature extraction, and an SNN (which also uses ResNet-50 for feature extraction).

The CNN is stored in `simple_cnn_classifier.py` and the SNN is in `meta_learning.py`. While these are both Python files, they contain cells similar to a Jupyter notebook (the `# %%` cell format supported by VS Code). They can be run as normal Python files, but for the best developer experience they should be opened in VS Code with the Python and Jupyter extensions installed. Both files contain comments and documentation in the form of Markdown cells, so they are not described further here.
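
For context, transfer learning with a pretrained ResNet-50 in PyTorch generally follows the pattern sketched below; the class count, frozen layers, and optimizer settings here are assumptions for illustration, and the actual training code lives in the two files above.

```python
# Minimal sketch of ResNet-50 transfer learning in PyTorch. NUM_CLASSES and the
# optimizer settings are assumptions, not the values used in the repo.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 20  # hypothetical: one class per pollen species folder

# Load an ImageNet-pretrained ResNet-50 and freeze it as a fixed feature extractor.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a new classification head.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```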

Server Setup

  1. Copy the `model.yml` file into `server/api/models` and rename it to `edge_detection_model.yml`
  2. Copy your trained network `.pth` file into `server/api/models` and give it a useful name.
    • You will need to update the filename that the server reads at the bottom of `classify_pollen.py`. If you change the network structure, you will also need to update the network architecture in `classify_pollen.py`.
    • If you change the number of classes, you will also need to update the index-to-class mapping in `classify_pollen.py` (see the sketch after this list)
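
For illustration, loading a trained `.pth` file and mapping prediction indices back to species names might look roughly like the sketch below; the model filename, class list, and preprocessing are placeholders rather than the values used in `classify_pollen.py`.

```python
# Sketch of loading a trained classifier and mapping indices to species names.
# MODEL_PATH and IDX_TO_CLASS are placeholders; use your own model's values.
import torch
from torchvision import models, transforms
from PIL import Image

MODEL_PATH = "server/api/models/pollen_classifier.pth"  # hypothetical filename
IDX_TO_CLASS = {0: "Acmispon glaber", 1: "Amsinckia intermedia"}  # extend to all classes

model = models.resnet50()
model.fc = torch.nn.Linear(model.fc.in_features, len(IDX_TO_CLASS))
model.load_state_dict(torch.load(MODEL_PATH, map_location="cpu"))
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def classify(image_path: str) -> str:
    """Return the predicted species name for a single pollen grain image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        idx = model(x).argmax(dim=1).item()
    return IDX_TO_CLASS[idx]
```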

Running the Server

The front end of the server is hosted on GitHub Pages and is automatically updated on each commit. The backend ML API is contained in the `server` folder. To run the server, activate the conda environment and then run `python server.py` from the `server` folder. The required libraries for the server are the same as those in `requirements.txt`. PyTorch should also be installed on the server following the directions above.
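
As a purely hypothetical example (the host, port, route, and payload format are assumptions; check `server.py` for the real interface), a client could query the backend like this:

```python
# Hypothetical example of calling the backend API; the URL, endpoint name, and
# response fields are assumptions. Consult server.py for the real interface.
import requests

with open("pollen_grains/grain_0.png", "rb") as f:
    response = requests.post(
        "http://localhost:5000/classify",  # assumed host, port, and route
        files={"image": f},
    )

print(response.json())
```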
