warning message #163

Open

LTJer opened this issue Aug 13, 2024 · 2 comments

Comments


LTJer commented Aug 13, 2024

fatal: not a git repository (or any parent up to mount point /gpfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
world_size 0
root_dir ./
id_prop_csv_file exists True
len dataset 351
Using LMDB dataset.
MAX val: 206.9750005
MIN val: 11.04244288
MAD: 9.425622709805278
Baseline MAE: 8.186923942137753
data range 206.9750005 11.04244288
100%|██████████| 280/280 [04:02<00:00, 1.15it/s]
data range 27.18234673 12.03331602
100%|██████████| 35/35 [00:12<00:00, 2.70it/s]
data range 89.94074813 14.72680096
100%|██████████| 35/35 [00:18<00:00, 1.85it/s]
/u/au/sa/liutsungwei/scratch/conda/envs/my_alignn/lib/python3.10/site-packages/torch/optim/lr_scheduler.py:136: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
/u/au/sa/liutsungwei/scratch/conda/envs/my_alignn/lib/python3.10/site-packages/torch/nn/modules/loss.py:101: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
return F.l1_loss(input, target, reduction=self.reduction)

The UserWarning lines at the end are the warning messages. Do I need to worry about them, and how can I fix them?

Thanks a lot.

bdecost (Collaborator) commented Aug 26, 2024

There are two UserWarnings here. The first one, about the learning rate scheduler, I think is a holdover from an earlier version that used PyTorch Ignite for the training loop. It's probably worth us fixing (swap this line and the one that follows it: https://github.com/usnistgov/alignn/blob/main/alignn/train.py#L467), but it's not the most worrisome thing, since it only affects the initial optimization step.
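For reference, a minimal sketch of the call ordering the warning asks for, using a toy model, optimizer, and scheduler rather than alignn's actual training loop (all names here are illustrative assumptions):

```python
import torch

# Toy setup purely to illustrate the step ordering; not alignn's training loop.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-3, total_steps=100)

for _ in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).mean()
    loss.backward()
    optimizer.step()   # update the parameters first...
    scheduler.step()   # ...then advance the LR schedule; reversing these two calls triggers the warning
```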

The second UserWarning is probably more concerning and could be a correctness issue. It looks like you have a batch size of 1 (?) and a scalar regression task? Are you able to test whether you still get this warning with a batch size of 2 or more? It could be that the model has a call to torch.squeeze that removes the batch dimension by mistake when the batch size is 1...
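To illustrate the suspected failure mode (a standalone sketch, not alignn code): with a batch of 1, an unqualified squeeze() drops the batch dimension as well, leaving a 0-d prediction that no longer matches the 1-element target and triggering exactly this L1Loss warning:

```python
import torch

criterion = torch.nn.L1Loss()

# Batch size 1: squeeze() removes the batch dimension too, leaving a 0-d tensor.
output = torch.randn(1, 1).squeeze()    # shape: torch.Size([])
target = torch.randn(1)                 # shape: torch.Size([1])
loss = criterion(output, target)        # emits the same broadcasting UserWarning

# Squeezing only the trailing feature dimension keeps the batch axis intact.
output_ok = torch.randn(1, 1).squeeze(-1)   # shape: torch.Size([1])
loss_ok = criterion(output_ok, target)      # shapes match, no warning
```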

@JHWang1001

Dear Prof. bdecost

My test files are sample_data_ff and sample_data. No matter how I adjust the batch_size, it still throws an error.
train_alignn.py --root_dir "alignn/examples/sample_data_multi_prop" --config "alignn/examples/sample_data/config_example.json" --output_dir=temp
The error information is as follows:
Using a target size (torch.Size([1])) that is different to the input size (torch.Size([])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size. return F.l1_loss(input, target, reduction=self.reduction)
@bdecost

Best wishes,
Jiahui Wang
