Is your feature request related to a problem? Please describe.
Collaborators often face challenges in contributing to federated learning training due to the scarcity of labeled data. Unlabeled data is generally far more abundant than labeled data, but without a way to train on it, this resource remains underutilized.
Describe the solution you'd like
Introduce Self-Supervised Learning (SSL) algorithms into OpenFL to enable training on unlabeled data. This can be achieved by creating workflows that use techniques such as Masked Autoencoders (MAE) or DINOv2. These algorithms pretrain models on unlabeled data, which can then be fine-tuned on labeled data for specific tasks. The final model is expected to achieve better accuracy than models trained solely on labeled data.
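For illustration, here is a minimal sketch of what MAE-style pretraining on unlabeled data could look like in plain PyTorch. The `TinyMAE` model, patch/embedding sizes, and the commented-out `unlabeled_loader` are hypothetical placeholders; the real integration would wrap this in an OpenFL task runner rather than a standalone loop.

```python
# Minimal MAE-style pretraining sketch on unlabeled 2D slices (assumptions:
# single-channel 128x128 inputs, patch size 16, toy encoder/decoder).
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    def __init__(self, patch=16, dim=256, mask_ratio=0.75):
        super().__init__()
        self.patch, self.mask_ratio = patch, mask_ratio
        self.encoder = nn.Sequential(nn.Linear(patch * patch, dim), nn.GELU(), nn.Linear(dim, dim))
        self.decoder = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, patch * patch))

    def patchify(self, x):
        # x: (B, 1, H, W) -> (B, N, patch*patch)
        p = self.patch
        B = x.shape[0]
        x = x.unfold(2, p, p).unfold(3, p, p)           # (B, 1, H/p, W/p, p, p)
        return x.contiguous().view(B, -1, p * p)

    def forward(self, x):
        patches = self.patchify(x)                       # (B, N, p*p)
        mask = torch.rand(patches.shape[:2], device=x.device) < self.mask_ratio
        visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)
        recon = self.decoder(self.encoder(visible))
        # Reconstruction loss is computed only on the masked patches
        return ((recon - patches) ** 2)[mask].mean()

model = TinyMAE()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# `unlabeled_loader` is a placeholder for a DataLoader over label-free slices:
# for images in unlabeled_loader:
#     loss = model(images)
#     optimizer.zero_grad(); loss.backward(); optimizer.step()
```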
Proposed Integration: OpenFL Workspaces for Pretraining and Fine-Tuning
Objective:
To enhance the functionality of OpenFL by creating two distinct workspaces: one dedicated to pretraining and another for fine-tuning. This will enable users to understand and compare the benefits of SSL pretraining.
Dataset:
We propose using the BraTS2020 dataset, which is already approved for use at Intel. The dataset can be accessed here.
Features:
Pretraining Workspace:
Users can run the pretraining workflow using the full BraTS2020 dataset.
This workspace will use the full dataset, without labels, to build a robust initial model for fine-tuning (a label-free data loader sketch follows below).
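As a sketch of what "without labels" means in practice, the pretraining workspace's dataset could simply never load segmentation masks. The file-naming convention, the chosen modality, and the use of `nibabel` below are assumptions; the actual OpenFL data loader subclass is omitted.

```python
# Label-free dataset sketch for the pretraining workspace (assumes BraTS2020
# volumes stored as NIfTI files under data_dir and nibabel being available).
from pathlib import Path
import nibabel as nib
import numpy as np
import torch
from torch.utils.data import Dataset

class UnlabeledBraTSDataset(Dataset):
    """Returns image volumes only; segmentation masks are never loaded."""

    def __init__(self, data_dir: str, modality: str = "flair"):
        # Hypothetical layout: <case>/<case>_<modality>.nii.gz
        self.paths = sorted(Path(data_dir).glob(f"*/*_{modality}.nii.gz"))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        volume = nib.load(str(self.paths[idx])).get_fdata().astype(np.float32)
        volume = (volume - volume.mean()) / (volume.std() + 1e-6)   # simple z-score normalization
        return torch.from_numpy(volume).unsqueeze(0)                # (1, H, W, D), no label
```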
Fine-Tuning Workspace:
Users can run the fine-tuning workflow using a subset of the BraTS2020 dataset.
This workspace will allow for more focused training to refine the pretrained model for specific tasks or datasets.
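A rough sketch of how the fine-tuning workspace could reuse the pretrained encoder is shown below. The checkpoint path, the `TinyMAE` model from the earlier sketch, and the per-patch classification head are all illustrative stand-ins for the real architecture.

```python
# Fine-tuning sketch: load the pretrained encoder, attach a task head, and
# train on the labeled subset. Names and shapes here are illustrative only.
import torch
import torch.nn as nn

pretrained = TinyMAE()
pretrained.load_state_dict(torch.load("pretrained_mae.pt"))  # output of the pretraining workspace

class FineTuneModel(nn.Module):
    def __init__(self, encoder, dim=256, num_classes=4):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Linear(dim, num_classes)   # per-patch head as a stand-in for a real task head

    def forward(self, patches):                    # patches: (B, N, patch*patch)
        return self.head(self.encoder(patches))

model = FineTuneModel(pretrained.encoder)
# Optionally freeze the encoder for the first rounds, then unfreeze later:
for p in model.encoder.parameters():
    p.requires_grad = False
optimizer = torch.optim.AdamW(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)
criterion = nn.CrossEntropyLoss()
```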
Workflow Flexibility:
Users will have the option to run pretraining followed by fine-tuning in a seamless workflow.
Alternatively, users can choose to run only the fine-tuning process and compare the results with the pretrained model.
The user should also be able to configure dataset distribution using Dirichlet distribution-based partitioning. This will allow the user to compare independent and identically distributed (IID) data against non-IID data, which is the more realistic setting and the one where SSL pretraining excels (see the sketch below).
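A minimal sketch of Dirichlet-based partitioning across collaborators is shown below (pure NumPy). Here `labels` stands for whatever per-sample attribute is used to drive the skew, and `alpha` controls how non-IID the result is: small alpha gives highly skewed shards, large alpha approaches an IID split.

```python
import numpy as np

def dirichlet_partition(labels, num_collaborators, alpha, seed=0):
    """Split sample indices across collaborators using a Dirichlet prior.

    Small alpha -> highly skewed (non-IID) shards; large alpha -> near-IID.
    `labels` is a 1-D array of per-sample class labels driving the skew.
    """
    rng = np.random.default_rng(seed)
    partitions = [[] for _ in range(num_collaborators)]
    for cls in np.unique(labels):
        cls_idx = np.where(labels == cls)[0]
        rng.shuffle(cls_idx)
        # Fraction of this class assigned to each collaborator
        proportions = rng.dirichlet([alpha] * num_collaborators)
        cut_points = (np.cumsum(proportions) * len(cls_idx)).astype(int)[:-1]
        for part, chunk in zip(partitions, np.split(cls_idx, cut_points)):
            part.extend(chunk.tolist())
    return partitions

# Example: alpha=0.3 gives a clearly non-IID split, alpha=100 is close to IID.
# shards = dirichlet_partition(np.array(sample_labels), num_collaborators=4, alpha=0.3)
```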
Result Comparison:
We will provide a notebook to compare the final results of the fine-tuned model against the pretrained model.
This will help users evaluate the effectiveness of the SSL pretraining process and make informed decisions about model performance.
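The comparison notebook could, for example, evaluate both models on a held-out labeled split using a common metric such as Dice. The loop below is a generic sketch; `baseline_model` (trained on labels only), `finetuned_model` (SSL-pretrained then fine-tuned), and `val_loader` are placeholders for the real artifacts.

```python
# Generic side-by-side evaluation sketch for the comparison notebook.
import torch

def dice_score(pred, target, eps=1e-6):
    # pred/target: binary tensors of identical shape
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

@torch.no_grad()
def evaluate(model, loader):
    model.eval()
    scores = []
    for images, masks in loader:
        preds = (torch.sigmoid(model(images)) > 0.5).float()
        scores.append(dice_score(preds, masks).item())
    return sum(scores) / len(scores)

# print("labels-only baseline :", evaluate(baseline_model, val_loader))
# print("SSL-pretrained + FT  :", evaluate(finetuned_model, val_loader))
```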
Benefits:
Efficiency: By leveraging SSL algorithms, users can utilize abundant unlabeled data to build robust initial models, reducing the dependency on scarce labeled data.
Performance Evaluation: Comparison tools will enable users to assess the impact of SSL pretraining on model performance, leading to better optimization and deployment strategies.