Add transforms #2148

Merged
merged 7 commits into from
Nov 17, 2024
4 changes: 3 additions & 1 deletion README.md
@@ -198,8 +198,8 @@ Pixel-level transforms will change just an input image and will leave any additi
- [FDA](https://explore.albumentations.ai/transform/FDA)
- [FancyPCA](https://explore.albumentations.ai/transform/FancyPCA)
- [FromFloat](https://explore.albumentations.ai/transform/FromFloat)
- [GaussNoise](https://explore.albumentations.ai/transform/GaussNoise)
- [GaussianBlur](https://explore.albumentations.ai/transform/GaussianBlur)
- [GaussianNoise](https://explore.albumentations.ai/transform/GaussianNoise)
- [GlassBlur](https://explore.albumentations.ai/transform/GlassBlur)
- [HistogramMatching](https://explore.albumentations.ai/transform/HistogramMatching)
- [HueSaturationValue](https://explore.albumentations.ai/transform/HueSaturationValue)
@@ -230,7 +230,9 @@ Pixel-level transforms will change just an input image and will leave any additi
- [RandomJPEG](https://explore.albumentations.ai/transform/RandomJPEG)
- [RandomMedianBlur](https://explore.albumentations.ai/transform/RandomMedianBlur)
- [RandomPlanckianJitter](https://explore.albumentations.ai/transform/RandomPlanckianJitter)
- [RandomPosterize](https://explore.albumentations.ai/transform/RandomPosterize)
- [RandomRain](https://explore.albumentations.ai/transform/RandomRain)
- [RandomSaturation](https://explore.albumentations.ai/transform/RandomSaturation)
- [RandomShadow](https://explore.albumentations.ai/transform/RandomShadow)
- [RandomSnow](https://explore.albumentations.ai/transform/RandomSnow)
- [RandomSolarize](https://explore.albumentations.ai/transform/RandomSolarize)
214 changes: 213 additions & 1 deletion albumentations/augmentations/tk/transform.py
@@ -17,14 +17,23 @@
    CLAHE,
    ColorJitter,
    Equalize,
    GaussNoise,
    ImageCompression,
    InvertImg,
    PlanckianJitter,
    Posterize,
    RandomBrightnessContrast,
    Solarize,
    ToGray,
)
from albumentations.core.pydantic import InterpolationType, check_0plus, check_01, check_1plus, nondecreasing
from albumentations.core.pydantic import (
    InterpolationType,
    check_0plus,
    check_01,
    check_1plus,
    check_range_bounds,
    nondecreasing,
)
from albumentations.core.transforms_interface import BaseTransformInitSchema
from albumentations.core.types import PAIR, ColorType, ScaleFloatType, ScaleIntType, Targets

@@ -48,6 +57,9 @@
    "RandomPlanckianJitter",
    "RandomMedianBlur",
    "RandomSolarize",
    "RandomPosterize",
    "RandomSaturation",
    "GaussianNoise",
]


@@ -1378,3 +1390,203 @@ def __init__(

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return ("thresholds",)


class RandomPosterize(Posterize):
    """Reduce the number of bits for each color channel.

    This transform is an alias for Posterize, provided for compatibility with
    the Kornia API. For new code, it is recommended to use albumentations.Posterize directly.

    Args:
        num_bits (tuple[int, int]): Range for the number of bits to keep for each channel.
            Values should be in range [0, 8] for uint8 images.
            Default: (3, 3).
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image

    Image types:
        uint8, float32

    Number of channels:
        Any

    Note:
        This transform is a direct alias for Posterize with identical functionality.
        For new projects, it is recommended to use Posterize directly as it
        provides a more consistent interface within the Albumentations ecosystem.

        For float32 images:
            1. Image is converted to uint8 (multiplied by 255 and clipped)
            2. Posterization is applied
            3. Image is converted back to float32 (divided by 255)

    Example:
        >>> import albumentations as A
        >>> # RandomPosterize way (Kornia compatibility)
        >>> transform = A.RandomPosterize(num_bits=(3, 3))  # Fixed 3 bits per channel
        >>> transform = A.RandomPosterize(num_bits=(3, 5))  # Random from 3 to 5 bits
        >>> # Preferred Posterize way
        >>> transform = A.Posterize(num_bits=(3, 3))
        >>> transform = A.Posterize(num_bits=(3, 5))

    References:
        - Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomPosterize
    """

    class InitSchema(BaseTransformInitSchema):
        num_bits: Annotated[tuple[int, int], AfterValidator(check_range_bounds(0, 8)), AfterValidator(nondecreasing)]

    def __init__(
        self,
        num_bits: tuple[int, int] = (3, 3),
        always_apply: bool | None = None,
        p: float = 0.5,
    ):
        warn(
            "RandomPosterize is an alias for Posterize transform. "
            "Consider using Posterize directly from albumentations.Posterize.",
            UserWarning,
            stacklevel=2,
        )

        super().__init__(
            num_bits=num_bits,
            p=p,
        )

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return ("num_bits",)
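
To illustrate what "keeping `num_bits` bits per channel" means, and the float32 round trip listed in the RandomPosterize docstring, here is a minimal dependency-free sketch. The helpers `posterize_value` and `posterize_float` are hypothetical names that work on single channel values. This is not the Albumentations implementation (this diff does not show it), and the truncation used when converting to uint8 is an assumption.

```python
def posterize_value(value: int, num_bits: int) -> int:
    """Keep the `num_bits` most significant bits of an 8-bit channel value."""
    if not 0 <= num_bits <= 8:
        raise ValueError("num_bits must be in [0, 8]")
    if not 0 <= value <= 255:
        raise ValueError("value must be in [0, 255]")
    # Mask with the top `num_bits` bits set, e.g. num_bits=3 -> 0b11100000.
    mask = (0xFF << (8 - num_bits)) & 0xFF
    return value & mask


def posterize_float(value: float, num_bits: int) -> float:
    """Assumed float path: scale to the uint8 range, posterize, scale back."""
    # Multiply by 255 and clip; truncating to int is an assumption of this sketch.
    as_uint8 = int(min(max(value * 255.0, 0.0), 255.0))
    return posterize_value(as_uint8, num_bits) / 255.0


print([posterize_value(v, 3) for v in (255, 128, 37)])  # prints [224, 128, 32]
```

With `num_bits=8` the value is unchanged and with `num_bits=0` every value maps to 0, which matches the [0, 8] bounds enforced by `check_range_bounds(0, 8)` in the schema.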


class RandomSaturation(ColorJitter):
    """Randomly change the saturation of an RGB image.

    This is a specialized version of ColorJitter that only adjusts saturation.

    Args:
        saturation (tuple[float, float]): Range for the saturation factor.
            Values should be non-negative numbers.
            A saturation factor of 0 will result in a grayscale image.
            A saturation factor of 1 will give the original image.
            A saturation factor of 2 will enhance the saturation by a factor of 2.
            Default: (1.0, 1.0)
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image

    Image types:
        uint8, float32

    Number of channels:
        1, 3

    Note:
        - This transform can only be applied to RGB/BGR images.
        - The saturation adjustment is done by converting to HSV color space,
          modifying the S channel, and converting back to RGB.

    Example:
        >>> import albumentations as A
        >>> transform = A.RandomSaturation(saturation=(0.5, 1.5), p=0.5)
        >>> # Samples a factor between 0.5 (50% less saturation) and 1.5 (50% more)
        >>>
        >>> transform = A.RandomSaturation(saturation=(0.0, 1.0), p=0.5)
        >>> # Samples a factor between 0.0 (grayscale) and 1.0 (unchanged); applied with 50% probability
    """

    class InitSchema(BaseTransformInitSchema):
        saturation: Annotated[tuple[float, float], AfterValidator(check_0plus), AfterValidator(nondecreasing)]

    def __init__(
        self,
        saturation: tuple[float, float] = (1.0, 1.0),
        always_apply: bool | None = None,
        p: float = 0.5,
    ):
        super().__init__(
            brightness=(1.0, 1.0),  # No brightness change
            contrast=(1.0, 1.0),  # No contrast change
            saturation=saturation,
            hue=(0.0, 0.0),  # No hue change
            p=p,
        )
        self.saturation = saturation
suggestion: Remove redundant saturation attribute storage as it's already stored in parent class

The saturation parameter is being stored both in the parent ColorJitter class and in RandomSaturation. This creates unnecessary duplication.


    def get_transform_init_args_names(self) -> tuple[str]:
        return ("saturation",)
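
The RandomSaturation docstring describes the adjustment as "convert to HSV, modify the S channel, convert back". A minimal per-pixel sketch of that idea, using only the standard library's `colorsys`, might look as follows. The function name `adjust_saturation` is hypothetical, pixels are assumed to be RGB floats in [0, 1], and this is not necessarily how ColorJitter computes the result internally.

```python
import colorsys


def adjust_saturation(rgb: tuple[float, float, float], factor: float) -> tuple[float, float, float]:
    """Scale the HSV saturation of one RGB pixel with components in [0, 1]."""
    if factor < 0:
        raise ValueError("factor must be non-negative")
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    s = min(s * factor, 1.0)  # clip so the result stays a valid HSV value
    return colorsys.hsv_to_rgb(h, s, v)


pixel = (0.8, 0.4, 0.2)
gray = adjust_saturation(pixel, 0.0)   # all three components become equal
same = adjust_saturation(pixel, 1.0)   # approximately the original pixel
vivid = adjust_saturation(pixel, 2.0)  # saturation doubled, capped at 1.0
```

Note that with this HSV-based approach a factor of 0 yields a gray level equal to the pixel's largest channel, which can differ from a luminance-weighted grayscale conversion.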


class GaussianNoise(GaussNoise):
    """Add Gaussian noise to the input image.

    A specialized version of GaussNoise that follows torchvision's API.

    Args:
        mean (float): Mean of the Gaussian noise as a fraction
            of the maximum value (255 for uint8 images or 1.0 for float images).
            Value should be in range [-1, 1]. Default: 0.0.
        sigma (float): Standard deviation of the Gaussian noise as a fraction
            of the maximum value (255 for uint8 images or 1.0 for float images).
            Value should be in range [0, 1]. Default: 0.1.
        p (float): Probability of applying the transform. Default: 0.5.

    Targets:
        image

    Image types:
        uint8, float32

    Note:
        - The noise parameters (sigma and mean) are given as fractions of the maximum image value:
            * For uint8 images, they are multiplied by 255
            * For float32 images, they are used directly
        - Unlike GaussNoise, this transform:
            * Uses fixed sigma and mean values (no ranges)
            * Always applies the same noise to all channels
            * Does not support the noise_scale_factor optimization
        - For more flexibility, use GaussNoise, which allows sampling both std and mean
          from ranges and supports per-channel noise

    Example:
        >>> import albumentations as A
        >>> # Add noise with sigma=0.1 (10% of the image range)
        >>> transform = A.GaussianNoise(mean=0.0, sigma=0.1, p=1.0)

    References:
        - torchvision: https://pytorch.org/vision/master/generated/torchvision.transforms.v2.GaussianNoise.html
        - kornia: https://kornia.readthedocs.io/en/latest/augmentation.module.html#kornia.augmentation.RandomGaussianNoise
    """

    class InitSchema(BaseTransformInitSchema):
        mean: float = Field(ge=-1, le=1)
        sigma: float = Field(ge=0, le=1)

    def __init__(
        self,
        mean: float = 0.0,
        sigma: float = 0.1,
        always_apply: bool | None = None,
        p: float = 0.5,
    ):
        warn(
            "GaussianNoise is a specialized version of GaussNoise that follows torchvision's API. "
            "Consider using GaussNoise directly from albumentations.GaussNoise.",
            UserWarning,
            stacklevel=2,
        )

        super().__init__(
            std_range=(sigma, sigma),  # Fixed sigma value
            mean_range=(mean, mean),  # Fixed mean value
            per_channel=False,  # Always apply same noise to all channels
            noise_scale_factor=1.0,  # No noise scale optimization
            p=p,
        )
        self.mean = mean
        self.sigma = sigma

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return "mean", "sigma"
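
The GaussianNoise docstring says `mean` and `sigma` are fractions of the maximum image value and that the same noise is applied to all channels. The sketch below shows one way to read that, in plain Python so it runs without dependencies. `add_gaussian_noise` is a hypothetical helper, the list-of-tuples image layout is an assumption made for brevity, and the actual GaussNoise implementation is not shown in this diff and may differ (for example in rounding).

```python
import random


def add_gaussian_noise(pixels, mean=0.0, sigma=0.1, max_value=255, seed=None):
    """Add one Gaussian noise sample per pixel, shared by all of its channels.

    `pixels` is a list of per-pixel channel tuples; `mean` and `sigma` are
    fractions of `max_value` (255 for uint8-style data, 1.0 for floats).
    """
    rng = random.Random(seed)
    noisy = []
    for pixel in pixels:
        # Scale the normalized parameters to the image range before sampling.
        noise = rng.gauss(mean * max_value, sigma * max_value)
        clipped = [min(max(c + noise, 0), max_value) for c in pixel]
        if isinstance(max_value, int):
            clipped = [round(c) for c in clipped]  # keep integer-valued output
        noisy.append(tuple(clipped))
    return noisy


image = [(128, 128, 128), (10, 200, 90)]
print(add_gaussian_noise(image, mean=0.0, sigma=0.0))  # zero noise leaves the image unchanged
print(add_gaussian_noise(image, mean=0.0, sigma=0.1, seed=0))
```

For float data, pass `max_value=1.0` so that `mean` and `sigma` are used directly, mirroring the uint8/float32 distinction in the docstring's note.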