warning message #163

Open

LTJer opened this issue Aug 13, 2024 · 2 comments

Comments


LTJer commented Aug 13, 2024

fatal: not a git repository (or any parent up to mount point /gpfs)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
world_size 0
root_dir ./
id_prop_csv_file exists True
len dataset 351
Using LMDB dataset.
MAX val: 206.9750005
MIN val: 11.04244288
MAD: 9.425622709805278
Baseline MAE: 8.186923942137753
data range 206.9750005 11.04244288
100%|██████████| 280/280 [04:02<00:00, 1.15it/s]
data range 27.18234673 12.03331602
100%|██████████| 35/35 [00:12<00:00, 2.70it/s]
data range 89.94074813 14.72680096
100%|██████████| 35/35 [00:18<00:00, 1.85it/s]
/u/au/sa/liutsungwei/scratch/conda/envs/my_alignn/lib/python3.10/site-packages/torch/optim/lr_scheduler.py:136: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
warnings.warn("Detected call of lr_scheduler.step() before optimizer.step(). "
/u/au/sa/liutsungwei/scratch/conda/envs/my_alignn/lib/python3.10/site-packages/torch/nn/modules/loss.py:101: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
return F.l1_loss(input, target, reduction=self.reduction)

The UserWarning lines at the end are the warning messages. Do I need to worry about them, and how can I fix them?

Thanks a lot.

bdecost (Collaborator) commented Aug 26, 2024

There are two UserWarnings here. The first one, about the learning rate scheduler, I think is a holdover from an earlier version that used PyTorch Ignite for the training loop. It's probably worth us fixing (swap this line and the one that follows it: https://github.com/usnistgov/alignn/blob/main/alignn/train.py#L467), but it's not the most worrisome thing, since it only affects the initial optimization step.
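For reference, a minimal sketch of the call ordering the warning asks for, using a toy model, optimizer, and scheduler rather than alignn's actual training loop (all names here are illustrative assumptions):

```python
import torch

# Toy setup purely to illustrate the step ordering; not alignn's training loop.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=1e-3, total_steps=100)

for _ in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).mean()
    loss.backward()
    optimizer.step()   # update the parameters first...
    scheduler.step()   # ...then advance the LR schedule; reversing these two calls triggers the warning
```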

The second UserWarning is probably more concerning and could be a correctness issue. It looks like you have a batch size of 1 (?) and a scalar regression task? Are you able to test whether you still get this warning with a batch size of 2 or more? It could be that the model has a call to torch.squeeze that removes the batch dimension by mistake when the batch size is 1...
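To illustrate the suspected failure mode (a standalone sketch, not alignn code): with a batch of 1, an unqualified squeeze() drops the batch dimension as well, leaving a 0-d prediction that no longer matches the 1-element target and triggering exactly this L1Loss warning:

```python
import torch

criterion = torch.nn.L1Loss()

# Batch size 1: squeeze() removes the batch dimension too, leaving a 0-d tensor.
output = torch.randn(1, 1).squeeze()    # shape: torch.Size([])
target = torch.randn(1)                 # shape: torch.Size([1])
loss = criterion(output, target)        # emits the same broadcasting UserWarning

# Squeezing only the trailing feature dimension keeps the batch axis intact.
output_ok = torch.randn(1, 1).squeeze(-1)   # shape: torch.Size([1])
loss_ok = criterion(output_ok, target)      # shapes match, no warning
```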

@JHWang1001

Dear Prof. bdecost

My test files are sample_data_ff and sample_data. No matter how I adjust the batch_size, it still throws an error.
train_alignn.py --root_dir "alignn/examples/sample_data_multi_prop" --config "alignn/examples/sample_data/config_example.json" --output_dir=temp
The error information is as follows:
Using a target size (torch.Size([1])) that is different to the input size (torch.Size([])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size. return F.l1_loss(input, target, reduction=self.reduction)
@bdecost

Best wishes,
Jiahui Wang
