
About "pad" tuple of the arguments of the torch.nn.functional.pad function with torch_directml #687

Open

suryodasuke opened this issue Feb 17, 2025 · 1 comment

@suryodasuke
I may have found a bug in torch_directml.

I was using torch_directml in combination with kornia. An error occurred during the canny process when I set the device to the DirectML device. I debugged kornia's filter.py by comparing the DirectML run with the "cpu" device. It appears that torch_directml interprets the "pad" tuple argument of the torch.nn.functional.pad function in the order (padding_top, padding_bottom, padding_left, padding_right).

The correct order of the "pad" tuple is (padding_left, padding_right, padding_top, padding_bottom).
(https://pytorch.org/docs/stable/generated/torch.nn.functional.pad.html#torch.nn.functional.pad)
Because of the incorrect padding, the tensor shape after the convolution does not match what is expected.

I think this is an easy mistake to make because the "input" argument of torch.nn.Conv2d is ordered (batch, channel, height, width); that is probably why the order is written in a large font on the referenced documentation page.
Of course, with a square convolution kernel (equal padding in height and width) the swap would go unnoticed.
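
For reference, this is the documented behavior on the CPU (a minimal sketch; the pad tuple pads the last dimension first):

import torch
import torch.nn.functional as F

x = torch.zeros(1, 1, 64, 64)   # (batch, channel, height, width)
# Documented order: (padding_left, padding_right, padding_top, padding_bottom).
print(F.pad(x, (2, 2, 0, 0)).shape)  # torch.Size([1, 1, 64, 68]) -> width grows
print(F.pad(x, (0, 0, 2, 2)).shape)  # torch.Size([1, 1, 68, 64]) -> height grows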

I'm looking forward to the new version of torch_directml.

Code

#Python 3.12.8
from PIL import Image, ImageSequence #pillow 11.1.0
import numpy #numpy 2.1.3
import torch #torch 2.4.1
import torch_directml #torch-directml 0.2.5.dev240914
import traceback
from kornia.filters import canny #kornia 0.8.0

img = Image.open("png1.png")

device = torch_directml.device(0)
#device = "cpu"

# Take only the first frame and convert it to a float tensor in [0, 1] with shape (1, H, W, C).
for i in ImageSequence.Iterator(img):
    image = i.convert("RGB")
    imagedat = numpy.array(image).astype(numpy.float32) / 255.0
    imgten = torch.from_numpy(imagedat)[None,]
    break

try:
    # movedim(-1, 1): (1, H, W, C) -> (1, C, H, W) before calling kornia's canny.
    output = canny(imgten.to(torch.device(device), memory_format=torch.preserve_format).movedim(-1, 1), 0.2, 0.8)
    # canny returns (magnitude, edges); keep the edge map and expand it to 3 channels.
    imgout = output[1].to(torch.device(device)).repeat(1, 3, 1, 1).movedim(1, -1)
except Exception as e:
    print(traceback.format_exc())
    raise e

i = 255. * imgout[0].cpu().numpy()
img = Image.fromarray(numpy.clip(i, 0, 255).astype(numpy.uint8))
img.save("cny1.png")
exit()

Error message

Traceback (most recent call last):
  File ".....py", line 20, in <module>
    output = canny(imgten.to(torch.device(device)).movedim(-1, 1), 0.2, 0.8)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "....\Python\Python312\Lib\site-packages\kornia\filters\canny.py", line 92, in canny
    blurred: Tensor = gaussian_blur2d(input, kernel_size, sigma)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "....\Python\Python312\Lib\site-packages\kornia\filters\gaussian.py", line 84, in gaussian_blur2d
    out = filter2d_separable(input, kernel_x, kernel_y, border_type)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "....\Python\Python312\Lib\site-packages\kornia\filters\filter.py", line 209, in filter2d_separable
    out_x = filter2d(input, kernel_x[..., None, :], border_type, normalized, padding)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "....\Python\Python312\Lib\site-packages\kornia\filters\filter.py", line 152, in filter2d
    out = output.view(b, c, h, w)
          ^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[1, 1, 64, 64]' is invalid for input of size 4080

Annotation

I'm using an old graphics card, a Radeon RX 460, and a 64x64 input image.

I set a breakpoint on line 151 of the filter.py file.

  • When the device is torch_directml.device(0):

input.shape=torch.Size([1, 1, 68, 64]) <- after padding
tmp_kernel.shape=torch.Size([1, 1, 1, 5])
output.shape=torch.Size([1, 1, 68, 60]) <- This causes the error: 68 * 60 = 4080 elements, which cannot be viewed as [1, 1, 64, 64] = 4096 elements.

  • When the device is "cpu":

input.shape=torch.Size([1, 1, 64, 68]) <- after padding
tmp_kernel.shape=torch.Size([1, 1, 1, 5])
output.shape=torch.Size([1, 1, 64, 64])

If the height and width arguments passed to the _compute_padding function on line 140 of filter.py are swapped, the error in output.view no longer occurs, but then an error occurs when the device is "cpu".
(After getting past this point, I run into another error related to boolean casting...)
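
For what it's worth, the mismatch can probably be reproduced without kornia by imitating what filter2d does for the first separable pass (a hypothetical minimal repro; the 1x5 kernel and the 'reflect' border are, I believe, what gaussian_blur2d uses by default):

import torch
import torch.nn.functional as F
import torch_directml

device = torch_directml.device(0)
x = torch.rand(1, 1, 64, 64, device=device)

# filter2d pads 2 pixels on the left and right of the width axis for a (1, 5) kernel.
padded = F.pad(x, (2, 2, 0, 0), mode="reflect")
print(padded.shape)  # CPU: torch.Size([1, 1, 64, 68]); at the breakpoint above the DirectML run shows [1, 1, 68, 64]

out = F.conv2d(padded, torch.ones(1, 1, 1, 5, device=device) / 5.0)
print(out.shape)     # [1, 1, 68, 60] = 4080 elements, hence the failing .view(1, 1, 64, 64)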

(This is my first issue.)

@suryodasuke
Author

It now looks like torch.nn.functional.pad malfunctions because implementations of the 'reflect' and 'circular' modes are not provided for the DirectML device.

With mode='constant' and mode='replicate', the function seems to work correctly.

>>> x=torch.tensor([[[0,1,2,3],[4,5,6,7],[8,9,10,11],[12,13,14,15]]], device=torch.device("cpu"))
>>> torch.nn.functional.pad(x, (1, 1, 0, 0), 'replicate')
tensor([[[ 0,  0,  1,  2,  3,  3],
         [ 4,  4,  5,  6,  7,  7],
         [ 8,  8,  9, 10, 11, 11],
         [12, 12, 13, 14, 15, 15]]])
>>> torch.nn.functional.pad(x, (1, 1, 0, 0), 'circular')
tensor([[[ 3,  0,  1,  2,  3,  0],
         [ 7,  4,  5,  6,  7,  4],
         [11,  8,  9, 10, 11,  8],
         [15, 12, 13, 14, 15, 12]]])
>>> x=x.to(torch_directml.device(0))
>>> torch.nn.functional.pad(x, (1, 1, 0, 0), 'replicate')
tensor([[[ 0,  0,  1,  2,  3,  3],
         [ 4,  4,  5,  6,  7,  7],
         [ 8,  8,  9, 10, 11, 11],
         [12, 12, 13, 14, 15, 15]]], device='privateuseone:0')
>>> torch.nn.functional.pad(x, (1, 1, 0, 0), 'circular')
tensor([[[0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0]]], device='privateuseone:0')
>>> torch.nn.functional.pad(x, (1, 1, 0, 0), 'reflect')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".....\Python\Python312\Lib\site-packages\torch\nn\functional.py", line 4552, in pad
    return torch._C._nn.pad(input, pad, mode, value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: invalid vector subscript
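
Until this is fixed, a possible stopgap (just a sketch; pad_fallback is my own helper, not part of torch_directml) is to fall back to the CPU for the affected modes:

import torch

def pad_fallback(x, pad, mode="constant", value=None):
    # 'reflect' and 'circular' appear broken on the DirectML ("privateuseone") device,
    # so run those on the CPU and move the result back to the original device.
    if x.device.type != "cpu" and mode in ("reflect", "circular"):
        return torch.nn.functional.pad(x.cpu(), pad, mode).to(x.device)
    if mode == "constant":
        return torch.nn.functional.pad(x, pad, mode, 0.0 if value is None else value)
    return torch.nn.functional.pad(x, pad, mode)

This costs a device-to-CPU round trip per call, so it is only a workaround until the modes are implemented natively.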
