0.4.0 Release Timeline #931

adamjstewart · 2022-12-04T17:05:58Z

adamjstewart
Dec 4, 2022
Maintainer

We're planning to release 0.4.0 by the end of the month. This will include all features in the following milestones:

0.4.0
0.3.2

If you have any new features you would like to see in this release, or bugs you would like to see fixed before then, now is the time to push forward on PRs. Any questions about this release or suggestions for things that should be required before the release is finalized should be added here. We will implement a feature freeze on the releases/v0.4 branch a few days before the release is finalized in order to finish testing and release prep.

adamjstewart · 2022-12-10T00:13:31Z

adamjstewart
Dec 10, 2022
Maintainer Author


adamjstewart · 2023-01-23T20:57:55Z

adamjstewart
Jan 23, 2023
Maintainer Author

TorchGeo 0.4.0 Release Notes

This is our biggest release yet, with improved support for pre-trained models, faster datamodules and transforms, and more powerful trainers. See the following sections for specific changes to each module:

Backwards-incompatible changes
Dependencies
Datamodules
Datasets
Models
Samplers
Trainers
Transforms
Documentation

As always, thanks to our many contributors!

Backwards-incompatible changes

Datasets: So2Sat bands were renamed (So2Sat: rename bands #735)
Datasets: TropicalCycloneWindEstimation was renamed to TropicalCyclone (Dataset/DataModule consistency #815, Datamodule naming TropicalCyclone #846)
Datasets: VisionDataset and VisionClassificationDataset (deprecated in 0.3) have been removed (Rename VisionDataset to NonGeoDataset #627)
Datamodules: many arguments have been renamed or reordered (Pass datamodule kwargs to datasets #666, DataModules: pass kwargs directly to datasets #730, DataModules: run all data augmentation on the GPU #992)
Datamodules: CycloneDataModule was renamed to TropicalCycloneDataModule (Dataset/DataModule consistency #815, Datamodule naming TropicalCyclone #846)
Models: resnet50 has a new multi-weight API (Add Multi-Weight Support API #917)
Trainers: many arguments have been renamed (Change "classification_model" to "model" #916, Add Multi-Weight Support API #917, Change BYOL task argument name and switch to timm models #918, Change segmentation model argument names #919, Change argument name Object Detection Task #920)
Transforms: now take a single image as input instead of a sample dict (Convert all index transforms to Kornia #999)

Dependencies

Open3D replaced by PyVista (Replace open3d with pyvista #663)
Remove packaging dependency (Remove packaging dependency #1019)
Support einops 0.6 (Bump einops from 0.5.0 to 0.6.0 in /requirements #896)
Support flake8 6 (Bump flake8 from 5.0.4 to 6.0.0 in /requirements #910)
Support mypy 0.991 (Bump mypy from 0.990 to 0.991 in /requirements #900)
Support pytest-cov 4 (Bump pytest-cov from 3.0.0 to 4.0.0 in /requirements #801)
Support pyupgrade 3 (Bump pyupgrade from 2.38.4 to 3.0.0 in /requirements #817)
Support setuptools 66 (Bump setuptools from 65.7.0 to 66.0.0 in /requirements #1017)
Support shapely 2 (Bump shapely from 1.8.5.post1 to 2.0.0 in /requirements #949)
Support sphinx 6 (Bump sphinx from 5.3.0 to 6.0.0 in /requirements #990)
Support timm 0.6 (Bump smp and timm deps #1002)
Support torchmetrics 0.11 (Bump torchmetrics from 0.10.3 to 0.11.0 in /requirements #925)
Support torchvision 0.14 (Bump torch from 1.12.1 to 1.13.0 in /requirements #875)

Datamodules

Our existing datamodules worked well, but suffered from several performance issues. For the average dataset with 3 splits (train/val/test), we were instantiating the dataset 10 times! All data augmentation was done on the CPU, one sample at a time. A multiprocessing bug prevented parallel data loading on macOS and Windows. And a serious bug was discovered in some of our datamodules that allowed training images to leak into the test set (only affected datamodules using torchgeo.datamodules.utils.dataset_split). All of these bugs have been fixed, and performance has been drastically improved. Datasets are only instantiated 3 times (once for each split). All data augmentation happens on the GPU, an entire batch at a time. And multiprocessing is now supported on all platforms. By refactoring our datamodules and adding new base classes, we were able to remove 1.6K lines of duplicated code in the process!

New datamodules:

GID-15 (Add datamodule for GID-15 dataset #928)
SpaceNet1 (Datamodule for SpaceNet1 #965)

Changes to existing datamodules:

Only instantiate dataset in prepare_data if download is requested (DataModules: skip prepare_data #967, DataModules: only instantiate when download requested #974)
Only instantiate datasets needed for a given stage (DataModules: run all data augmentation on the GPU #992)
Use Kornia for all data augmentation (DataModules: run all data augmentation on the GPU #992)
Faster data augmentation (CPU → GPU, sample → batch) (DataModules: run all data augmentation on the GPU #992)
Fix macOS/Windows multiprocessing bug (Trainers: num_workers > 0 results in pickling error on macOS/Windows #886, DataModules: run all data augmentation on the GPU #992)
Fix bug with train images leaking into test set (DataModules: run all data augmentation on the GPU #992)
Add plot method to all datamodules (Add plot method to (most) DataModules #814, DataModules: run all data augmentation on the GPU #992)
torchgeo.datamodules.utils.dataset_split is deprecated, use torch.utils.data.random_split instead (DataModules: run all data augmentation on the GPU #992)
Pass kwargs directly to datasets (Pass datamodule kwargs to datasets #666, DataModules: pass kwargs directly to datasets #730)
Add random cropping to several datamodules (Vaihingen datamodule #851, Fix Vaihingen datamodule #853, Random sized patches support for other non-geospatial datamodules #855, Add random crop logic to DeepGlobeLandCover Datamodule #876, Add crop logic to Potsdam2D datamodule #929)
Inria Aerial Image Labeling: fix predict dimensions (InriaAerialImageLabelingDataModule: fix predict dimensions #975)
LandCover.ai: fix mIoU calculation and plotting (Fix landcoverai datamodule #959)
Tropical Cyclone: CycloneDataModule was renamed to TropicalCycloneDataModule (Dataset/DataModule consistency #815, Datamodule naming TropicalCyclone #846)
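The train/test leak mentioned above comes down to one property: every sample index must land in exactly one split. The following is a minimal, library-free sketch of that property, not TorchGeo code; the function name split_indices and the 80/10/10 fractions are assumptions made for illustration. In practice, the notes above recommend torch.utils.data.random_split, which partitions a dataset into non-overlapping subsets.

```python
import random


def split_indices(n, fractions=(0.8, 0.1, 0.1), seed=0):
    """Partition range(n) into disjoint train/val/test index lists.

    Hypothetical helper for illustration only; not part of TorchGeo.
    """
    indices = list(range(n))
    # Shuffle once with a seeded generator so the split is reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    # Slice the single shuffled list so no index can appear twice.
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(100)
# No index is shared between any two splits.
assert not (set(train) & set(val))
assert not (set(train) & set(test))
assert not (set(val) & set(test))
```

Slicing one shuffled list, rather than sampling each split independently, is what guarantees the splits cannot overlap.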

New base classes:

Add GeoDataModule and NonGeoDataModule base classes (DataModules: run all data augmentation on the GPU #992)

Datasets

This release adds a new Sentinel-1 dataset.

[Figure: a Sentinel-1 scene taken over the Big Island of Hawai'i]

Additionally, all image datasets now have a plot method.

New datasets:

Cloud Cover Detection (Add Radiant MLHub (REF) Cloud Cover Dataset #510)
Sentinel-1 (Add Sentinel-1 dataset #821)
SpaceNet 6 (Add SpaceNet6 & data.py #878)

Changes to existing datasets:

Add default root argument to all datasets (Datasets: add default 'root' argument #802)
Consistent capitalization of band names (Datasets: consistent capitalization of band names #778)
Many datasets now return float images and int labels (DataModules: run all data augmentation on the GPU #992)
Chesapeake CVPR: add plot method (Adding plotting to ChesapeakeCVPR dataset #820)
ETCI 2021: fix data loading (Fix a bug for ETCI2021 loader #861)
NASA Marine Debris: fix plot warning when model outputs no prediction boxes (Fix for warning raised if model outputs no box predictions #988)
OSCD: images are now stacked channel-wise (DataModules: run all data augmentation on the GPU #992)
SEN12MS: mask is only single channel (DataModules: run all data augmentation on the GPU #992)
Sentinel-2: use 10,000 as scale factor (Sentinel-2: use 10,000 as scale factor #1027)
So2Sat: rename bands (So2Sat: rename bands #735)
Tropical Cyclone: renamed from TropicalCycloneWindEstimation to TropicalCyclone (Dataset/DataModule consistency #815, Datamodule naming TropicalCyclone #846)
Tropical Cyclone: images are RGB, not grayscale (DataModules: run all data augmentation on the GPU #992)
VHR-10: add plot method (Add plot method for VHR10 dataset #847)
xView2: remove labels folder (Remove "labels" folder from verify for xView2 #787)
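As a rough illustration of the Sentinel-2 scale factor listed above: Sentinel-2 products store reflectance as integers quantified by 10,000, so dividing by that value gives approximate reflectance. The sketch below assumes this is how the factor is applied; the helper name to_reflectance is hypothetical, and TorchGeo's own code may clamp or otherwise post-process the result.

```python
SENTINEL2_SCALE = 10_000  # quantification value for Sentinel-2 reflectance


def to_reflectance(digital_numbers, scale=SENTINEL2_SCALE):
    """Convert stored integer pixel values to approximate reflectance.

    Hypothetical helper for illustration only; not part of TorchGeo.
    """
    return [dn / scale for dn in digital_numbers]


print(to_reflectance([0, 2500, 10000]))  # [0.0, 0.25, 1.0]
```

Very bright surfaces can produce stored values above 10,000, so code that expects values in [0, 1] (for example, plotting) usually clamps after scaling.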

Changes to existing base classes:

RasterDataset supports band indexing now (Allow band indexing in RasterDataset #687)
UnionDataset actually works now (Sampling from UnionDataset fails - even when it shouldn't? #769, UnionDataset: fix __getitem__ bug #786)
UnionDataset and IntersectionDataset support transforms (Can't add transforms to UnionDataset or IntersectionDataset #867, Add transforms to UnionDataset and IntersectionDataset #870)
VectorDataset supports multi-label datasets (Allow multilabels in VectorDataset #862)

Models

Due to the nature of satellite imagery (different number of spectral bands for every satellite), it is impossible to have a single set of pre-trained weights for each model. TorchGeo has always had multi-weight support:

model = resnet50(sensor="sentinel2", bands="all", pretrained=True)

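However, this is difficult to extend if you want more fine-grained control over model weights. More recently, torchvision introduced a new multi-weight support API, in which each set of pre-trained weights is an enum member passed through the weights argument, for example resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).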

With the 0.4.0 release, TorchGeo has now adopted the same API:

model = resnet50(weights=ResNet50_Weights.SENTINEL2_ALL_MOCO)
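To show why an enum of weights is easier to extend than a combination of sensor, bands, and pretrained arguments, here is a toy, library-free sketch of the pattern. Everything in it is an assumption made for illustration: the Weights dataclass, the ToyResNet50Weights enum, the toy_resnet50 function, and the example.com URLs are hypothetical and are not the TorchGeo or torchvision implementation.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Weights:
    """Metadata describing one set of pre-trained weights (hypothetical)."""

    url: str
    in_chans: int


class ToyResNet50Weights(Enum):
    # Each member bundles everything needed to build a matching model.
    # URLs are placeholders, not real download locations.
    SENTINEL2_ALL_MOCO = Weights("https://example.com/s2_all_moco.pth", 13)
    SENTINEL2_RGB_MOCO = Weights("https://example.com/s2_rgb_moco.pth", 3)


def toy_resnet50(weights=None):
    """Return a description of the model that would be built."""
    # The number of input channels follows from the chosen weights,
    # so callers cannot request an inconsistent combination.
    in_chans = weights.value.in_chans if weights is not None else 3
    return {
        "arch": "resnet50",
        "in_chans": in_chans,
        "pretrained": weights is not None,
    }


model = toy_resnet50(weights=ToyResNet50Weights.SENTINEL2_ALL_MOCO)
print(model["in_chans"])  # 13
```

Adding a new sensor or training procedure then only requires a new enum member, with no change to the model function's signature.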

We also support PyTorch Hub now:

>>> import torch
>>> from torchgeo.models import ResNet18_Weights
>>> torch.hub.list("microsoft/torchgeo", trust_repo=True)
Downloading: "https://github.com/microsoft/torchgeo/zipball/models/weights" to ~/.cache/torch/hub/models_weights.zip
['resnet18', 'resnet50', 'vit_small_patch16_224']
>>> model = torch.hub.load("microsoft/torchgeo", "resnet18")
Using cache found in ~/.cache/torch/hub/microsoft_torchgeo_models_weights
>>> model = torch.hub.load("microsoft/torchgeo", "resnet18", weights=ResNet18_Weights.SENTINEL2_RGB_MOCO)
Using cache found in ~/.cache/torch/hub/microsoft_torchgeo_models_weights

In our previous release, we had 1 model pre-trained on 1 satellite with 1 training procedure. We now have 3 models (ResNet-18, ResNet-50, ViT) trained on both Sentinel-1 and Sentinel-2 for all bands and RGB-only bands with 3 SSL techniques (MoCo, DINO, SeCo), and plans to expand this in the future. Shoutout to Zhu Lab and ServiceNow for publishing these weights!

New models:

Add ResNet-18 and ViT models (Add Multi-Weight Support API #917)

Changes to existing models:

Adopt torchvision's multi-weight support API (Multi-Weight Support API #762, Naming Scheme multi-weights pretrained models #804, Add Multi-Weight Support API #917)
Add support for torch.hub (Add Multi-Weight Support API #917)

New utility functions:

Functions to list, query, and initialize models and weights (Add Multi-Weight Support API #917)

Samplers

Changes to existing samplers:

All random samplers now have a default value for length (Random GeoSamplers: add default length #755)

New utility functions:

get_random_bounding_box and tile_to_chips are now public functions (Random GeoSamplers: add default length #755)
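For readers unfamiliar with what these two utilities are for, the sketch below shows the underlying arithmetic in plain Python: counting how many chips of a given size fit in a tile at a given stride, and drawing a random chip that stays inside the tile. The function names count_chips and random_box, their signatures, and the use of plain pixel heights and widths are assumptions for illustration; they are not the signatures of the TorchGeo functions, which operate on bounding boxes.

```python
import math
import random


def count_chips(height, width, size, stride=None):
    """Count chips of `size` (rows, cols) covering a tile at `stride`.

    Hypothetical helper. The last chip in each direction may extend
    past the tile edge, hence the ceiling.
    """
    if stride is None:
        stride = size  # non-overlapping chips by default
    rows = math.ceil((height - size[0]) / stride[0]) + 1
    cols = math.ceil((width - size[1]) / stride[1]) + 1
    return rows, cols


def random_box(height, width, size, rng):
    """Return (top, left, bottom, right) of a random chip inside the tile."""
    top = rng.uniform(0, height - size[0])
    left = rng.uniform(0, width - size[1])
    return top, left, top + size[0], left + size[1]


print(count_chips(100, 100, (32, 32)))            # (4, 4)
print(count_chips(64, 64, (32, 32), (16, 16)))    # (3, 3)
print(random_box(100, 100, (32, 32), random.Random(0)))
```

A chip count like this is one natural way to pick a default length for a random sampler: roughly as many samples per epoch as a grid sampler would produce.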

Trainers

This release introduces a new trainer for object detection, one of our most highly requested features. All trainers now support prediction. Our old trainers only supported ResNet backbones. Our new trainers now support the 600+ backbones provided by the timm library. And all of the new pre-trained models mentioned above are now supported by our trainers as well.

New trainers:

Object Detection: add trainer, add Faster R-CNN ([Feature Request] Object Detection Trainer #442, Add support for ObjectDetection #758)
Object Detection: add RetinaNet and FCOS (Added More Object Detectors #984)

Changes to existing trainers:

Add support for all timm backbones (Change regression task to timm support #854, Change BYOL task argument name and switch to timm models #918)
Add support for more pretrained models (Add Multi-Weight Support API #917)
Change model argument names (Change "classification_model" to "model" #916, Change BYOL task argument name and switch to timm models #918, Change segmentation model argument names #919, Change argument name Object Detection Task #920)
Support prediction (ClassificationTask predict step #790, MultiLabelClassificationTask predict step #792, Trainers: predict step #813, add predict_step to RegressionTask #818, add predict_step to BYOLTask #819, add predict_step to SemanticSegmentationTask #939)
Fix plotting file handle leak (File handle leak drawing figures #825, Trainers: fix plotting file handle leak #826)
Multi-label Classification: replace softmax with sigmoid (MultiLabelClassificationTask use sigmoid instead of softmax #791)
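The softmax-to-sigmoid change matters because softmax forces class probabilities to sum to 1, so two classes that are both present compete with each other. Sigmoid scores each class independently. Here is a small standard-library sketch of the difference; the logits and the 0.5 threshold are made-up values for illustration.

```python
import math


def softmax(logits):
    """Probabilities that sum to 1 across classes (single-label)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Independent per-class probabilities (multi-label)."""
    return [1 / (1 + math.exp(-x)) for x in logits]


# Two classes are strongly present, one is absent.
logits = [2.0, 2.0, -3.0]

softmax_pred = [p > 0.5 for p in softmax(logits)]
sigmoid_pred = [p > 0.5 for p in sigmoid(logits)]

print(softmax_pred)  # [False, False, False]
print(sigmoid_pred)  # [True, True, False]
```

With softmax, the two present classes split the probability mass and neither crosses the threshold; with sigmoid, both are detected.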

Transforms

Whenever possible, we try to avoid reinventing the wheel. For data augmentation transforms that aren't specific to geospatial data or satellite imagery, we use existing implementations in popular libraries like:

torchvision (PIL and PyTorch backends)
albumentations (OpenCV backend)
kornia (PyTorch backend)

Until now, we've been fairly agnostic towards data augmentation libraries. However, neither PIL nor OpenCV supports multispectral imagery. Because of this, we've decided to use Kornia for all transforms.

Changes to existing transforms:

All transforms are now compatible with kornia.augmentation.AugmentationSequential (Convert all index transforms to Kornia #999)
All transforms now take a single image as input instead of a sample dict (Convert all index transforms to Kornia #999)
torchgeo.transforms.AugmentationSequential is deprecated, use kornia.augmentation.AugmentationSequential instead (DataModules: run all data augmentation on the GPU #992)
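To make the change in calling convention concrete, here is a plain-Python sketch of an index transform that takes an image and returns an image, rather than taking and returning a sample dict. It is an illustration under stated assumptions, not TorchGeo or Kornia code: the function append_normalized_difference, the list-of-bands image layout, and the small epsilon are all hypothetical.

```python
def append_normalized_difference(image, index_a, index_b, eps=1e-8):
    """Append (a - b) / (a + b) as a new band.

    `image` is a list of bands, each a flat list of pixel values.
    Hypothetical helper; `eps` guards against division by zero.
    """
    band_a, band_b = image[index_a], image[index_b]
    index_band = [
        (a - b) / (a + b + eps) for a, b in zip(band_a, band_b)
    ]
    # Return a new image; the input is left unmodified.
    return image + [index_band]


# Two bands (say, near-infrared and red) with two pixels each.
image = [[0.6, 0.5], [0.2, 0.5]]

# New style: the transform sees only the image.
out = append_normalized_difference(image, index_a=0, index_b=1)
print(len(out))  # 3

# A dict-based pipeline can still use it by unpacking the sample.
sample = {"image": image, "label": 1}
sample["image"] = append_normalized_difference(sample["image"], 0, 1)
```

Keeping the transform unaware of the sample dict is what lets it be composed with other image-level augmentations in a single sequential container.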

Documentation

Add new tutorial for working with pretrained model weights (Missing information about sentinel-2 bands used in pretrained ResNet model #693, Resnet50 pretrained model documentation #799, Add Multi-Weight Support API #917)
Remove execution count from tutorials (Tutorials: remove execution count #783)
Remove __module__ hacks, fixing most documentation issues (Remove __module__ hacks #976)
Use kornia for all transforms in tutorials (Convert all index transforms to Kornia #999)
Improve trainer API docs (Docs/trainers for Classification and Segmentation Task #852)
Add num classes to ReforesTree dataset (Fix ReforesTree entry #907)
Convert tensor to array in tutorials (Jupyter kernel dies when executing plot function in tutorial "custom_raster_dataset.ipynb" #841, explicitly convert image tensor to numpy before plotting #845)
Fix typo in USAVars documentation (Fix typo in usavars.py #1038)
Fix typos in TropicalCyclone and GID-15 documentation (Fix mistakes caught by pydocstyle convention PR #1011)
Fix URL formatting in LoveDA documentation (LoveDA: fix URL formatting #977)
Fix Aster GDEM dataset name (Evaluation -> Elevation #884)
Fix dead link in Vaihingen2D documentation (Replace dead link in Vaihingen dataset documentation #850)
Fix link to iNaturalist in datasets table (iNaturalist: fix ref in datasets table #775)
Fix link in GBIF dataset documentation (GBIF: fix URL #774)

Contributors

This release is thanks to the following contributors:

@adamjstewart
@ashnair1
@bugraaldal
@calebrob6
@daiki-kimura
@eltociear
@fnands
@isaaccorley
@KennSmithDS
@mgnolde
@nilsleh
@Niro4
@osgeokr
@pmandiola
@RitwikGupta


adamjstewart · 2023-01-25T00:23:38Z

adamjstewart
Jan 25, 2023
Maintainer Author

TorchGeo 0.4.0 is now out! Hopefully it was worth the wait 😅

If we forgot anyone in the release notes or if you notice any bugs in the new release (it happens...) just let us know!


