Finetuning functionality #28

Merged
merged 7 commits into from
Nov 21, 2024

Collaborator

@JLrumberger JLrumberger commented Nov 14, 2024

What is the purpose of this PR?

Purpose is to add finetuning functionality to the Nimbus Inference repo and make it easily accessible for users.

How did you implement your changes?

  1. Cleaned up the code: data normalization is now done in the dataset object, so nimbus_preprocess was deleted.
  2. Added checkpoint selection methods to the Nimbus class (Nimbus.list_checkpoints, Nimbus.load_local_checkpoint), so that users can select their fine-tuned checkpoints for inference.
  3. Added an LmdbDataset class that loads the training and validation data, and a prepare_training_data function that takes a MultiplexedDataset object and writes the LMDB files for the LmdbDataset to disk.
  4. Added an AugmentationPipeline that applies simple augmentations (flipping, rotation, color jitter and noise) and a SmoothBinaryCELoss class that applies label smoothing and masks out the cells labeled as ambiguous.
  5. Added a Trainer class that takes in the Nimbus and LmdbDataset objects and does finetuning, validation and writing of the checkpoints.
  6. Added a 2_Nimbus_Finetuning.ipynb notebook which does finetuning of Nimbus on the example dataset. In order for this to work reliably, we would need to upload the noisy_groundtruth.csv to the example dataset huggingface space.
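
To illustrate the idea behind item 4, here is a minimal, framework-free sketch of a binary cross-entropy with label smoothing that skips ambiguous cells. It is not the SmoothBinaryCELoss code from this PR: the function name, the `AMBIGUOUS` label value, the smoothing formula and the default parameters are all assumptions made for illustration.

```python
import math

# Assumption: ambiguous cells carry a dedicated label value. The value 2 is
# hypothetical and not taken from the Nimbus code base.
AMBIGUOUS = 2


def smooth_binary_ce(probs, labels, smoothing=0.1, eps=1e-7):
    """Mean binary cross-entropy with label smoothing over non-ambiguous cells.

    probs  -- predicted probabilities in [0, 1]
    labels -- 0 (negative), 1 (positive) or AMBIGUOUS (ignored)
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y == AMBIGUOUS:
            # Masked out: ambiguous cells do not contribute to the loss.
            continue
        # Pull hard targets towards 0.5: 1 -> 1 - s/2 and 0 -> s/2.
        target = y * (1.0 - smoothing) + 0.5 * smoothing
        # Clamp the prediction so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
        count += 1
    # If every cell is ambiguous there is nothing to average over.
    return total / count if count else 0.0


if __name__ == "__main__":
    # The third cell is ambiguous and therefore ignored.
    print(smooth_binary_ce([0.9, 0.1, 0.5], [1, 0, AMBIGUOUS]))
```

Averaging only over the unmasked cells keeps the loss scale independent of how many cells are ambiguous in a batch; whether the PR's implementation normalizes this way is not stated here.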

Remaining issues

I still need to add the new functionality to the documentation and upload the new notebook to Google Colab. I will do so after adding the interactive viewer, and create a new release afterwards.

@JLrumberger JLrumberger changed the title Added magnification levels for scaling and extended Multiplexed datas… Finetuning functionality Nov 14, 2024
@JLrumberger JLrumberger requested review from srivarra and ngreenwald and removed request for srivarra November 20, 2024 12:43
@JLrumberger JLrumberger merged commit 5bb42cd into main Nov 21, 2024
12 checks passed
@JLrumberger JLrumberger deleted the finetuning branch November 21, 2024 19:15