Merge pull request #4 from scaleoutsystems/feature/compute-package
Feature/compute package
KatHellg authored Nov 5, 2024
2 parents f7551fb + 279ec29 commit c1ae1f2
Showing 37 changed files with 395 additions and 1,065 deletions.
37 changes: 19 additions & 18 deletions README.md
Original file line number Diff line number Diff line change
@@ -29,11 +29,11 @@ This initializes the server which later will be used to run the federated learni
## Step 2: Cloning the repository
Next, you need to clone the repository:
```bash
git clone https://github.com/scaleoutsystems/ultralytics-implementation-in-fedn-tutorial
git clone https://github.com/scaleoutsystems/fedn-ultralytics-tutorial
```
Then navigate into the repository:
```bash
cd ultralytics-implementation-in-fedn-tutorial
cd fedn-ultralytics-tutorial
```
This repository contains all the necessary files and configurations for the federated learning setup.

@@ -45,11 +45,19 @@ pip3 install -r requirements.txt
This is recommended to be done in a virtual environment to avoid conflicts with other packages.

## Step 4: Setting up the dataset
Ultralytics stores all datasets in a specific directory. You can set the location of this directory by configuring the datasets_dir option with yolo settings. To do this, run the following command:
Start setting up the dataset by creating a directory named 'datasets' inside the repository:
```bash
mkdir datasets
```
Then copy the path to the 'datasets' folder by running the following command:
```bash
echo "$(pwd)/datasets" | pbcopy
```
Ultralytics uses a specific directory to store datasets, which you can configure using the datasets_dir option in YOLO settings. To set this up to the 'datasets' directory you previously created, run the following command, replacing <path_to_dataset> with the path you just copied:
```bash
yolo settings datasets_dir=<path_to_dataset>
```
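Note that `pbcopy` is only available on macOS. As a rough, cross-platform sketch, the absolute path for `datasets_dir` can also be produced with Python's standard library. The helper name `ensure_datasets_dir` is hypothetical and not part of this repository.

```python
from pathlib import Path


def ensure_datasets_dir(repo_root: str = ".") -> Path:
    """Create '<repo_root>/datasets' if it is missing and return its absolute path."""
    datasets_dir = Path(repo_root).resolve() / "datasets"
    # exist_ok=True makes the call safe to repeat
    datasets_dir.mkdir(parents=True, exist_ok=True)
    return datasets_dir
```

Calling `print(ensure_datasets_dir())` from the repository root prints the path to use in place of `<path_to_dataset>`.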
Replace <path_to_dataset> with the path to your datasets folder, which in this tutorial is named 'datasets'. Then your data needs to be placed within a folder called 'fed_dataset'. The structure should be as follows:
After setting the dataset directory, you’ll need to organize your data into a folder named 'fed_dataset' inside the datasets directory. Your final folder structure should look like this:
```bash
datasets/
fed_dataset/
@@ -62,7 +70,7 @@ datasets/
image1.txt
image2.txt
...
val/
valid/
images/
image1.jpg
image2.jpg
@@ -84,31 +92,24 @@ Each line corresponds to one bounding box in the image.

For further details on how to prepare your dataset, you can visit <https://docs.ultralytics.com/datasets/>.

For getting started quickly with a sample dataset, you can navigate into the `examples` repository to download and partition a sample dataset.
For getting started quickly, you can navigate into the `examples` repository to download and partition a sample dataset.

## Step 5: Setting up configurations

### Number of classes
To set up your Ultralytics model, you need to adjust the configuration files. Specifically, the number of classes (nc) must be set in both the `data.yaml` and the `yolov8_.yaml` files. Make sure to update these files with the appropriate number of classes for your specific dataset.

### Size of the model
You also need to select which YOLOv8 model to use by renaming the `yolov8_.yaml` file according to the desired model variant:
- For YOLOv8n (nano), rename the file to `yolov8n.yaml`
- For YOLOv8s (small), rename the file to `yolov8s.yaml`
- For YOLOv8m (medium), rename the file to `yolov8m.yaml`
- For YOLOv8l (large), rename the file to `yolov8l.yaml`
- For YOLOv8x (extra large), rename the file to `yolov8x.yaml`
### Global configurations
To set up your YOLOv8 model, configure `global_config.yaml` inside the 'client' folder. Set the number of classes with the `num_classes` parameter and the corresponding class names with `class_names`, and choose which YOLOv8 model to use with the `model_size` parameter.

### Local client configurations
Each client can set its own training configurations in the `client_config.yaml` file. This file contains the configurations for the client environment, such as the number of local epochs and the batch size. You can adjust these configurations to suit each client's hardware and training requirements.

## Step 6: Building the compute package
Once you’ve completed all the configurations, you can build the compute package by running the following command:
```bash
python3 client/setup.py
fedn package create -p client
```
The compute package contains all the necessary files and configurations for the client environments.
If you make any changes to the configurations later, you’ll need to rebuild and reupload the compute package to apply the updates.
This creates the compute package `package.tgz`, which contains all the necessary files and configurations for the client environments.
If you make any changes to the global_config.yaml, you’ll need to rebuild (Step 6) and reupload (Step 8) the compute package to apply the updates. For changes in the `client_config.yaml`, you don't need to rebuild the compute package.
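Why `client_config.yaml` changes need no rebuild can be sketched as follows, assuming (as `client/train.py` in this commit suggests) that the file is read on the client when training starts. The function `resolve_training_params` is hypothetical. The keys `local_epochs` and `batch_size` and the defaults 10 and 16 are taken from `train.py`.

```python
from typing import Optional, Tuple


def resolve_training_params(
    config: Optional[dict], epochs: int = 10, batch_size: int = 16
) -> Tuple[int, int]:
    """Return (epochs, batch_size), letting client-side settings override the defaults."""
    if config is None:
        # No client config was found: fall back to the defaults
        return epochs, batch_size
    return config.get("local_epochs", epochs), config.get("batch_size", batch_size)
```

Because the values are resolved at training time, editing them on a client takes effect on the next round without touching the packaged files.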

## Step 7: Initializing the seed model
To initialize the seed model, run the following command:
7 changes: 0 additions & 7 deletions client/data.yaml

This file was deleted.

10 changes: 10 additions & 0 deletions client/global_config.yaml
@@ -0,0 +1,10 @@
# Configuration for YOLOv8 Model and Dataset Paths
# Adjust settings here to define model size, class details, and dataset paths

model_size: nano # Options: nano, small, medium, large, x-large
num_classes: 3 # Number of classes
class_names: ['Class 1', 'Class 2', 'Class 3'] # A list of class names

train: fed_dataset/train/images # Dataset paths (usually no need to change these)
val: fed_dataset/valid/images
test: fed_dataset/test/images
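Since `num_classes` and `class_names` are set independently, they can drift apart. The following is a minimal sketch of a consistency check on the parsed configuration, assuming it has already been loaded into a dictionary. The function `validate_global_config` is hypothetical, and the allowed sizes are taken from the comment on `model_size` above.

```python
ALLOWED_MODEL_SIZES = {"nano", "small", "medium", "large", "x-large"}


def validate_global_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks consistent."""
    problems = []

    model_size = str(config.get("model_size", "")).lower()
    if model_size not in ALLOWED_MODEL_SIZES:
        problems.append(f"model_size '{model_size}' is not one of {sorted(ALLOWED_MODEL_SIZES)}")

    num_classes = config.get("num_classes")
    if not isinstance(num_classes, int) or num_classes < 1:
        problems.append("num_classes must be a positive integer")

    class_names = config.get("class_names") or []
    if isinstance(num_classes, int) and len(class_names) != num_classes:
        problems.append(
            f"class_names has {len(class_names)} entries but num_classes is {num_classes}"
        )
    return problems
```

Running such a check before building the compute package would surface a mismatch early rather than during training.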
13 changes: 2 additions & 11 deletions client/model.py
@@ -3,27 +3,18 @@
import torch
import collections
import tempfile
import glob

HELPER_MODULE = "numpyhelper"
helper = get_helper(HELPER_MODULE)

def compile_model():
    yaml_file = glob.glob("yolov8*.yaml")

    if not yaml_file:
        raise FileNotFoundError("No YAML file matching 'yolov8*.yaml' found.")

    if yaml_file[0] == "yolov8_.yaml":
        raise ValueError("Please configure which YOLOv8 model to use by renaming the YAML file.")

    if torch.cuda.is_available():
        device = 'cuda'
    elif torch.backends.mps.is_available():
        device = 'mps'
    else:
        device = 'cpu'
    return YOLO(yaml_file[0]).to(device)
    return YOLO('model.yaml').to(device)


def load_parameters(model_path):
@@ -39,7 +30,7 @@ def load_parameters(model_path):
    model = compile_model()
    params_dict = zip(model.state_dict().keys(), parameters_np)
    state_dict = collections.OrderedDict({key: torch.tensor(x) for key, x in params_dict})
    model.load_state_dict(state_dict, strict=True)
    model.load_state_dict(state_dict, strict=False)
    with tempfile.NamedTemporaryFile(suffix='.pt') as tmp_file:
        torch.save(model,tmp_file.name)
        model = YOLO(tmp_file.name)
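`zip` stops at the shorter of its inputs, so a parameter list whose length differs from the number of state-dict keys is paired without any error. With `strict=False`, `load_state_dict` does not raise on missing or unexpected keys either. Below is a minimal sketch of an explicit length check, with plain Python sequences standing in for the keys and arrays. `pair_parameters` is a hypothetical helper and not part of `model.py`.

```python
import collections


def pair_parameters(keys, arrays):
    """Pair state-dict keys with parameter arrays, refusing silent truncation."""
    keys, arrays = list(keys), list(arrays)
    if len(keys) != len(arrays):
        raise ValueError(f"expected {len(keys)} parameter arrays, got {len(arrays)}")
    # Order is preserved, so the i-th array is assigned to the i-th key
    return collections.OrderedDict(zip(keys, arrays))
```

A check of this kind could run before the state dict is built, so that a truncated or mismatched parameter file fails loudly.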
54 changes: 54 additions & 0 deletions client/setup.py
@@ -0,0 +1,54 @@
import yaml
import os

# Load server configuration
with open('client/global_config.yaml', 'r') as config_file:
    server_config = yaml.safe_load(config_file)

# Extract configuration values
model_size = server_config.get("model_size", "nano").lower()
num_classes = server_config.get("num_classes", 1)
class_names = server_config.get("class_names", [f"Class {i}" for i in range(num_classes)])

# Dataset paths from config
train_path = server_config.get("train", "fed_dataset/train/images")
val_path = server_config.get("val", "fed_dataset/valid/images")
test_path = server_config.get("test", "fed_dataset/test/images")

# Paths for model files
model_folder = "client/yolov8models"
model_file = os.path.join(model_folder, f"{model_size}.yaml")
output_model_file = "client/model.yaml"
output_data_file = "client/data.yaml"

# Generate model.yaml with 'nc' on line 4
if not os.path.exists(model_file):
    print(f"Error: Model file '{model_file}' does not exist in '{model_folder}'.")
else:
    # Read the model file content as a list of lines
    with open(model_file, 'r') as file:
        model_lines = file.readlines()

    # Insert the nc line at line 4 (index 3)
    model_lines.insert(3, f"nc: {num_classes} # Number of classes from global_config\n")

    # Write the modified content to the model.yaml file
    with open(output_model_file, 'w') as output_file:
        output_file.writelines(model_lines)

    print(f"'{output_model_file}' created successfully with nc: {num_classes} based on '{model_file}'")

# Generate data.yaml with paths, nc, and names
data_content = {
    "train": train_path,
    "val": val_path,
    "test": test_path,
    "nc": num_classes,
    "names": {i: class_name for i, class_name in enumerate(class_names)}
}

# Write the data.yaml content
with open(output_data_file, 'w') as data_file:
    yaml.dump(data_content, data_file, sort_keys=False)

print(f"'{output_data_file}' created successfully with paths from global_config.yaml, nc: {num_classes}, and indexed class names.")
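The line-4 insertion used for `model.yaml` can be tried in isolation. This is a small sketch on in-memory lines, assuming a template that starts with two comment lines and two blank lines like the files in `client/yolov8models`. The helper `insert_nc_line` is hypothetical.

```python
def insert_nc_line(model_lines, num_classes, index=3):
    """Return a copy of model_lines with an 'nc: <n>' line inserted at the given index."""
    updated = list(model_lines)
    # list.insert appends at the end if index is beyond the list length
    updated.insert(index, f"nc: {num_classes}\n")
    return updated


template = [
    "# license header\n",
    "# model description\n",
    "\n",
    "\n",
    "backbone:\n",
]
result = insert_nc_line(template, 3)
print("".join(result))
```

Because `list.insert` never raises for an out-of-range index, a template shorter than three lines would still receive the `nc` line, only at the end rather than on line 4.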
6 changes: 3 additions & 3 deletions client/train.py
@@ -9,7 +9,7 @@
from data import get_train_size
import yaml

def train(in_model_path, out_model_path, data_yaml_path='data.yaml', epochs=10,batch_size=16):
def train(in_model_path, out_model_path, epochs=10, data_yaml_path='data.yaml', batch_size=16):
"""Complete a model update using YOLOv8.
Load model parameters from in_model_path (managed by the FEDn client),
@@ -41,7 +41,7 @@ def train(in_model_path, out_model_path, data_yaml_path='data.yaml', epochs=10,b
epochs = config.get('local_epochs', epochs)
batch_size = config.get('batch_size', batch_size)
else:
print(f"Config file not found at {config_path}. Using default epochs ({epochs}) and batch size ({batch_size}).")
print(f"Client config file not found at {config_path}. Using default epochs ({epochs}) and batch size ({batch_size}).")

# Train the model and remove the unnecessary files
with tempfile.TemporaryDirectory() as tmp_dir:
@@ -60,7 +60,7 @@ def train(in_model_path, out_model_path, data_yaml_path='data.yaml', epochs=10,b

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python train.py <in_model_path> <out_model_path> [data_yaml_path] [epochs]")
        print("Usage: python train.py <in_model_path> <out_model_path> [epochs]")
        sys.exit(1)

    in_model_path = sys.argv[1]
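The rest of the argument handling is not visible in this hunk. As a hedged sketch of what the new usage string implies, the optional third argument can be read as the number of epochs. `parse_train_args` is a hypothetical helper, and the default of 10 is taken from the `train` signature.

```python
def parse_train_args(argv, default_epochs=10):
    """Parse '<in_model_path> <out_model_path> [epochs]' from an argv-style list."""
    if len(argv) < 3:
        raise SystemExit("Usage: python train.py <in_model_path> <out_model_path> [epochs]")
    in_model_path, out_model_path = argv[1], argv[2]
    # The optional third argument overrides the default number of epochs
    epochs = int(argv[3]) if len(argv) > 3 else default_epochs
    return in_model_path, out_model_path, epochs
```

Moving `epochs` ahead of `data_yaml_path` in the `train` signature matches this ordering, so a third positional value maps to the epoch count.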
46 changes: 0 additions & 46 deletions client/yolov8_.yaml

This file was deleted.

36 changes: 36 additions & 0 deletions client/yolov8models/large.yaml
@@ -0,0 +1,36 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs, fixed to YOLOv8l scale


# YOLOv8l backbone
backbone:
- [-1, 1, Conv, [64, 3, 2]]
- [-1, 1, Conv, [128, 3, 2]]
- [-1, 3, C2f, [128, True]]
- [-1, 1, Conv, [256, 3, 2]]
- [-1, 6, C2f, [256, True]]
- [-1, 1, Conv, [384, 3, 2]]
- [-1, 6, C2f, [384, True]]
- [-1, 1, Conv, [512, 3, 2]]
- [-1, 3, C2f, [512, True]]
- [-1, 1, SPPF, [512, 5]]

# YOLOv8l head
head:
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 6], 1, Concat, [1]]
- [-1, 3, C2f, [384]]

- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 4], 1, Concat, [1]]
- [-1, 3, C2f, [256]]

- [-1, 1, Conv, [256, 3, 2]]
- [[-1, 12], 1, Concat, [1]]
- [-1, 3, C2f, [384]]

- [-1, 1, Conv, [384, 3, 2]]
- [[-1, 9], 1, Concat, [1]]
- [-1, 3, C2f, [512]]

- [[15, 18, 21], 1, Detect, [nc]]
36 changes: 36 additions & 0 deletions client/yolov8models/medium.yaml
@@ -0,0 +1,36 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs, fixed to YOLOv8m scale


# YOLOv8m backbone
backbone:
- [-1, 1, Conv, [48, 3, 2]]
- [-1, 1, Conv, [96, 3, 2]]
- [-1, 2, C2f, [96, True]]
- [-1, 1, Conv, [192, 3, 2]]
- [-1, 4, C2f, [192, True]]
- [-1, 1, Conv, [384, 3, 2]]
- [-1, 4, C2f, [384, True]]
- [-1, 1, Conv, [512, 3, 2]]
- [-1, 2, C2f, [512, True]]
- [-1, 1, SPPF, [512, 5]]

# YOLOv8m head
head:
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 6], 1, Concat, [1]]
- [-1, 3, C2f, [384]]

- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 4], 1, Concat, [1]]
- [-1, 3, C2f, [192]]

- [-1, 1, Conv, [192, 3, 2]]
- [[-1, 12], 1, Concat, [1]]
- [-1, 3, C2f, [384]]

- [-1, 1, Conv, [384, 3, 2]]
- [[-1, 9], 1, Concat, [1]]
- [-1, 3, C2f, [512]]

- [[15, 18, 21], 1, Detect, [nc]]
36 changes: 36 additions & 0 deletions client/yolov8models/nano.yaml
@@ -0,0 +1,36 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs, fixed to YOLOv8n scale


# YOLOv8n backbone
backbone:
- [-1, 1, Conv, [16, 3, 2]] # 64 * 0.25 = 16
- [-1, 1, Conv, [32, 3, 2]] # 128 * 0.25 = 32
- [-1, 1, C2f, [32, True]] # min(128 * 0.25, 1024) = 32
- [-1, 1, Conv, [64, 3, 2]] # 256 * 0.25 = 64
- [-1, 2, C2f, [64, True]] # int(6 * 0.33) = 2; min(256 * 0.25, 1024) = 64
- [-1, 1, Conv, [128, 3, 2]] # 512 * 0.25 = 128
- [-1, 2, C2f, [128, True]] # min(512 * 0.25, 1024) = 128
- [-1, 1, Conv, [256, 3, 2]] # min(1024 * 0.25, 1024) = 256
- [-1, 1, C2f, [256, True]]
- [-1, 1, SPPF, [256, 5]]

# YOLOv8n head
head:
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 6], 1, Concat, [1]]
- [-1, 1, C2f, [128]]

- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 4], 1, Concat, [1]]
- [-1, 1, C2f, [64]] # P3/8-small

- [-1, 1, Conv, [64, 3, 2]]
- [[-1, 10], 1, Concat, [1]]
- [-1, 1, C2f, [128]] # P4/16-medium

- [-1, 1, Conv, [128, 3, 2]]
- [[-1, 8], 1, Concat, [1]]
- [-1, 1, C2f, [256]] # P5/32-large

- [[13, 16, 19], 1, Detect, [nc]]