The Blackbird is a high-throughput phenomics imaging platform developed through a collaboration of scientists and engineers at Cornell AgriTech, the USDA, and PrinterSys. Many scripts in this repository build on Tian Qiu's original repository (used for this paper).
If you came here from Plant Health or Plant Bio to familiarize yourself with computer vision, I suggest you start with some simpler examples of computer vision implementation, as these models and the associated code are quite complex. The company Roboflow (unaffiliated) has some really awesome tutorials on getting started with computer vision, so I encourage you to check them out and work through their examples.
This repo is still in progress as I'm actively improving our code and models; in the meantime, feel free to email me with any questions or clarifications: [email protected]
The code in this repository primarily uses pretrained PyTorch models to train on, and subsequently make inferences on, images of leaf disks with or without powdery mildew.
Overview of the training and inference process:
1 cm leaf disks were excised using ethanol-disinfested leather punches and subsequently arrayed adaxial side up on 1% water agar plates. Image acquisition was performed using the Blackbird CNC Imaging Robot (version 2, developed by Cornell University and PrinterSys). The Blackbird is a G-code driven CNC that positions a Nikon FT2 DSLR camera equipped with a 2.5x zoom ultra-macro lens (Venus Optics Laowa 25mm) in X/Y; the camera then captures a z-stack of images at 20 µm steps in Z height. Blackbird datasheets were prepared using the generateBlackbirdDatasheet.py script. The image stacking process was automated using the stackPhotos.py Python script. Helicon Focus software (Helicon Software, version 8.1) was used to perform the focus stacking, with the parameters set to method B (depth map radius: 1, smoothing radius: 4, and sharpness: 2).
Example images can be viewed here. Models, images, and training data will be released with the manuscript.
conda env coming soon...
CUDA is required for GPU usage; it is currently only available on PCs (Windows or Linux machines with an NVIDIA GPU). Please check your GPU to figure out which CUDA version you need. If running on Apple Silicon, MPS is necessary to take advantage of accelerated PyTorch.
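A minimal sketch of picking a device at runtime, assuming a PyTorch build recent enough to include the MPS backend (1.12 or later). The helper names pick_device and detect_device are hypothetical and are not part of this repository; the training scripts may select the device differently.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch releases, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(f"Using device: {detect_device()}")
```

The returned string can be passed to torch.device(...) or to a model's .to(...) call. Keeping the priority logic in its own function makes it easy to check without a GPU.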
Package Requirements:
torch torchvision tensorboard termcolor optuna pandas captum matplotlib pillow scikit-learn glob2 h5py opencv-python (hashlib is also used, but it ships with Python's standard library and does not need to be installed)
If running on Google Colab (recommended), just run: !pip install optuna==3.1.0 termcolor pandas scikit-learn==1.0.2 numba==0.56.4 captum
as the other packages should already be installed.
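Before training, it can help to confirm that everything is importable. The sketch below is only an illustration (missing_packages is a hypothetical helper, not a script in this repo). Note that a few pip names differ from the module name you import: pillow is imported as PIL, scikit-learn as sklearn, and opencv-python as cv2.

```python
import importlib.util

# pip distribution name -> module name used in `import` statements.
# hashlib is omitted because it is part of the Python standard library.
IMPORT_NAMES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "tensorboard": "tensorboard",
    "termcolor": "termcolor",
    "optuna": "optuna",
    "pandas": "pandas",
    "captum": "captum",
    "matplotlib": "matplotlib",
    "pillow": "PIL",
    "scikit-learn": "sklearn",
    "glob2": "glob2",
    "h5py": "h5py",
    "opencv-python": "cv2",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip names whose modules cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("All required packages were found.")
```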
To train your own model, you need:
- Image patches to make train/test/val .hdf5 files.
  - You can make image patches using ./preprocessing/makePatches.py. It's easiest to sort these patches into different directories according to their label (e.g. if a patch has a dog in it, put it in the dog directory).
  - You can label the image patches in a given directory by adding a suffix to the filename using ./preprocessing/rename_files.py.
  - You can then make train/test/val .hdf5 files using ./preprocessing/images_to_test_train_hdf5.py.
- Determine the mean and standard deviation of the R/G/B values of your train/test/val sets using ./preprocessing/get_mean_std.py and plug those into your ./script/train.sh script under --means and --stds (super important: this dramatically affects your model performance).
- Customize other training parameters such as the model, learning rate, etc. within the ./script/train.sh script. See the argparse section in ./classification/run.py for the full list of customizable variables.
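To make the normalization step more concrete, here is a rough sketch of how per-channel means and standard deviations can be computed in a single pass. This is not the code in get_mean_std.py; the function name channel_mean_std is hypothetical, and I'm assuming 8-bit pixels rescaled to the [0, 1] range (check which range your training script expects before plugging numbers in).

```python
import math


def channel_mean_std(pixels, scale=255.0):
    """Compute per-channel mean and population std over an iterable of (r, g, b) pixels.

    Each value is divided by `scale` first, so 8-bit inputs end up in [0, 1].
    """
    n = 0
    total = [0.0, 0.0, 0.0]     # running sum of values per channel
    total_sq = [0.0, 0.0, 0.0]  # running sum of squared values per channel
    for pixel in pixels:
        n += 1
        for c in range(3):
            v = pixel[c] / scale
            total[c] += v
            total_sq[c] += v * v
    if n == 0:
        raise ValueError("no pixels supplied")
    means = [t / n for t in total]
    # Var = E[x^2] - E[x]^2; clamp at 0 to absorb floating-point rounding.
    stds = [
        math.sqrt(max(sq / n - m * m, 0.0))
        for sq, m in zip(total_sq, means)
    ]
    return means, stds


if __name__ == "__main__":
    # Toy input: one black pixel and one white pixel.
    means, stds = channel_mean_std([(0, 0, 0), (255, 255, 255)])
    print("means:", means)
    print("stds:", stds)
```

Because only running sums are kept, the pixels can be streamed from many image patches without loading the whole dataset into memory.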
Note: You can start with the default values, but your model will perform much better if you try different base models and find the optimal hyperparameters (e.g. by using Optuna hyperparameter optimization).
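Until the Optuna section is written up, here is a rough sketch of what a search loop can look like. Everything specific in it is an assumption for illustration: the parameter names lr and model, the candidate model names, and the train_and_validate placeholder (which returns a toy score instead of training anything) are not taken from this repo's scripts.

```python
import math


def suggest_params(trial):
    """Sample one candidate configuration from the (hypothetical) search space."""
    return {
        "lr": trial.suggest_float("lr", 1e-5, 1e-1, log=True),
        "model": trial.suggest_categorical("model", ["resnet18", "resnet50"]),
    }


def train_and_validate(params):
    """Placeholder for a real training run.

    Returns a toy score that peaks at lr = 1e-3; replace this with code that
    trains a model using `params` and returns its validation accuracy.
    """
    return 1.0 - abs(math.log10(params["lr"]) + 3.0) / 10.0


def objective(trial):
    """Optuna calls this once per trial and maximizes the returned value."""
    return train_and_validate(suggest_params(trial))


def run_search(n_trials=20):
    """Run the search and return the best parameters found."""
    import optuna  # imported here so the functions above work without Optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params


if __name__ == "__main__":
    print(run_search())
```

Sampling the learning rate on a log scale (log=True) is a common choice because useful values often span several orders of magnitude.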
Coming soon...