-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Input type and weight type error in scene graph code #16
Comments
I have the same problem. How do you fix it? |
@Bartopt @Yassin-fan Hi did any one fix this problem? please if you did, how did you fix it thank you. |
@Bartopt @Yassin-fan @MeklitMa Hi, guys, please tell me if you fix it, thanks very much! |
@Bartopt @tyxtyxtyxtyx @Yassin-fan @MeklitMa @suprosanna Hi, can anyone give me the source code? Thank you very much? My email is [email protected] |
In trainer.py, I just sent the self.network to cuda device before line 41 (h, out = self.network(images)), from which the error was coming @Bartopt @tyxtyxtyxtyx @Yassin-fan @incredibledays @MeklitMa. |
Hi, I have installed the code in python3.8, pytorch 1.8.0 and cuda11. And the debug_relationformer.ipynb runs well about Debug Dataloader and Debug Model part.
However, when I run the train.py using "nohup python3 train.py --config configs/scene_2d.yaml --cuda_visible_device 0 1 2 --exp_name VGtest1 --nproc_per_node 3 --b 16 &> log/Muti.out& ", there is an error:
*** Config file
configs/scene_2d.yaml
Experiment Name : VGtest1
Batch size : 16
Running Distributed: True ; GPU: 0 ; RANK: 0
Number of parameters : 92944451
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Current run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
ERROR:ignite.engine.engine.RelationformerTrainer:Engine run is terminating due to exception: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Traceback (most recent call last):
File "train.py", line 292, in
parallel.run(main, args)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/launcher.py", line 275, in run
idist.spawn(self.backend, func, args=args, kwargs_dict=kwargs, **self._spawn_params)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/utils.py", line 323, in spawn
comp_model_cls.spawn(
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/comp_models/native.py", line 304, in spawn
start_processes(
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/distributed/comp_models/native.py", line 272, in _dist_worker_task_fn
fn(local_rank, *args, **kw_dict)
File "/home/ymf/dockerFile/relationformer/train.py", line 282, in main
trainer.run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/monai/engines/trainer.py", line 56, in run
super().run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/monai/engines/workflow.py", line 250, in run
super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 702, in run
return self._internal_run()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 775, in _internal_run
self._handle_exception(e)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 745, in _internal_run
time_taken = self._run_once_on_dataset()
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 850, in _run_once_on_dataset
self._handle_exception(e)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/ignite/engine/engine.py", line 833, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/home/ymf/dockerFile/relationformer/trainer.py", line 40, in _iteration
h, out = self.network(images)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 711, in forward
output = self.module(*inputs, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/relationformer_2D.py", line 108, in forward
features, pos = self.backbone(samples)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/deformable_detr_backbone.py", line 117, in forward
xs = self0
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/ymf/dockerFile/relationformer/models/deformable_detr_backbone.py", line 84, in forward
xs = self.body(tensor_list.tensors)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torchvision/models/_utils.py", line 63, in forward
x = module(x)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/data/anaconda3/envs/ymf_rel38/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Do you have any clue about this error and how to fix it? Thanks!
The text was updated successfully, but these errors were encountered: