
question about the loss #4

Open
anyang1996 opened this issue May 21, 2020 · 13 comments

@anyang1996

When training on the KITTI dataset, two of the printed losses (mean loss and mean box loss) always increased, while the other loss terms decreased. Have you ever encountered such a problem?

Thanks~

@qiqihaer
Owner

I've never seen a problem like this. Have you changed the code?

@anyang1996
Author

Thanks for the reply.

Actually, I didn't change the code, and I used the provided data (./kitti/gt_database/train_gt_database_3level_Car.pkl) for training.

@qiqihaer
Owner

You can check log_train.txt in the log_kitti folder; that's the training log for 200 epochs. I also trained for 65 epochs again to check the code. The problem you mentioned did not come up in either experiment. You can try cloning the code again.
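For comparing the two runs, a quick way is to pull the "mean loss" values out of that log. A minimal sketch (it assumes log lines of the form "mean loss: 247.758943", as in the excerpts posted further down in this thread, so adjust the path and pattern if yours differ):

import re

# Collect every "mean loss" value from the training log so the trend can be eyeballed.
losses = []
with open('log_kitti/log_train.txt') as f:
    for line in f:
        m = re.match(r'\s*mean loss:\s*([\d.]+)', line)
        if m:
            losses.append(float(m.group(1)))

print('first 5 values:', losses[:5])
print('last 5 values:', losses[-5:])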

@anyang1996
Author

Sorry to bother you again. I re-downloaded the code without any changes and tried several times, but I still get the same result. If possible, could you send the latest version of the code you used yesterday to [email protected]?

Thanks so much.

@qiqihaer
Owner

I have sent the code to you.

@hova88

hova88 commented Jun 17, 2020

Has this been solved? I'm running into the same problem.

@qiqihaer
Owner

Why does this problem occur? It has never come up in my training.

@anyang1996
Author

anyang1996 commented Jun 17, 2020 via email

@qiqihaer
Owner

> Change num_workers in the DataLoader part of train.py to 32! Anything other than 1 should be fine. I don't really know why, though; I've only just started with PyTorch, I had been using TF before...

What could be the reason for this? How would num_workers affect training at all? Could you email me the log of the non-converging loss so I can take a look at what the problem is?
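For reference, this is the kind of change being discussed. A minimal, self-contained sketch (the dataset here is a dummy stand-in rather than the repo's KITTI dataset; in train.py the same num_workers argument sits on the training DataLoader):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset so the snippet runs on its own; the only difference between the two loaders is num_workers.
dummy = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))

loader_main_process = DataLoader(dummy, batch_size=8, shuffle=True, num_workers=0)   # batches loaded in the main process
loader_with_workers = DataLoader(dummy, batch_size=8, shuffle=True, num_workers=32)  # 32 worker processes, as suggested above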

@hova88

hova88 commented Jun 18, 2020

What's your email? I'll send you my log. I think the problem is probably in how the loss function was adapted for KITTI, but I haven't figured it out yet either. Here are two excerpts for everyone to look at; box_loss and center_loss keep accumulating.

Snippet 1

**** EVAL EPOCH 009 END****
**** EPOCH 010 ****
Current learning rate: 0.001000
Current BN decay momentum: 0.500000
2020-06-17 13:28:24.704650
---- batch: 010 ----
mean box_loss: 21.451081
mean center_loss: 21.343650
mean heading_acc: 0.000000
mean heading_cls_loss: 0.661935
mean heading_reg_loss: 0.039943
mean loss: 247.758943
mean neg_ratio: 0.998730
mean obj_acc: 0.999902
mean objectness_loss: 0.000772
mean pos_ratio: 0.000098
mean sem_acc: 0.200000
mean sem_cls_loss: 0.009268
mean size_acc: 0.200000
mean size_cls_loss: 0.008473
mean size_reg_loss: 0.000448
mean vote_loss: 3.323501

Snippet 2

**** EVAL EPOCH 049 END****
**** EPOCH 050 ****
Current learning rate: 0.001000
Current BN decay momentum: 0.125000
2020-06-17 17:01:50.644762
---- batch: 010 ----
mean box_loss: 82.634578
mean center_loss: 82.634578
mean heading_acc: 0.000000
mean heading_cls_loss: 0.000000
mean heading_reg_loss: 0.000000
mean loss: 901.370782
mean neg_ratio: 0.999512
mean obj_acc: 1.000000
mean objectness_loss: 0.000021
mean pos_ratio: 0.000000
mean sem_acc: 0.000000
mean sem_cls_loss: 0.000000
mean size_acc: 0.000000
mean size_cls_loss: 0.000000
mean size_reg_loss: 0.000000
mean vote_loss: 7.502488
---- batch: 020 ----
mean box_loss: 109.684003
mean center_loss: 109.675870
mean heading_acc: 0.100000
mean heading_cls_loss: 0.062090
mean heading_reg_loss: 0.000375
mean loss: 1182.991797
mean neg_ratio: 0.999512
mean obj_acc: 0.999951
mean objectness_loss: 0.000595
mean pos_ratio: 0.000049
mean sem_acc: 0.100000
mean sem_cls_loss: 0.000003
mean size_acc: 0.100000
mean size_cls_loss: 0.000004
mean size_reg_loss: 0.001548
mean vote_loss: 8.614880

Later, I also modified eval.py. Whether or not I use --use_3d_nms --use_cls_nms --per_class_proposal, I get the following error:

Traceback (most recent call last):
  File "eval.py", line 210, in <module>
    eval()
  File "eval.py", line 207, in eval
    loss = evaluate_one_epoch()
  File "eval.py", line 178, in evaluate_one_epoch
    batch_pred_map_cls = parse_predictions(end_points, CONFIG_DICT)
  File "/home/hova/Documents/Git_projects/votenet_kitti/models/kitti_ap_helper.py", line 133, in parse_predictions
    assert (len(pick) > 0)

After I commented out assert (len(pick) > 0), I tried to dump the results, but they were completely unusable.

You can refer to this issue.
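That assertion typically fires when the NMS step inside parse_predictions is handed no valid boxes at all (for example, everything was filtered out beforehand), which would be consistent with the diverging box/center loss above. A self-contained toy illustration of that failure mode (not the repo's code; the threshold filter below is only a stand-in for whatever filtering precedes the NMS call):

import numpy as np

def surviving_indices(scores, thresh):
    # Stand-in for the filtering that runs before NMS: keep boxes whose score clears the threshold.
    return np.where(scores > thresh)[0]

scores = np.array([0.01, 0.02, 0.005])   # the kind of scores a diverged model tends to produce
pick = surviving_indices(scores, 0.05)

if len(pick) == 0:
    print('no boxes survive, which is where assert (len(pick) > 0) raises AssertionError')
else:
    print('surviving boxes:', pick)

Commenting the assert out, as noted above, only hides the symptom; the underlying problem is still the training divergence.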

@anyang1996
Author

> Could you email me the log of the non-converging loss so I can take a look at what the problem is?

I deleted the old log, so I'll retrain and then send it to you. Also, I wrote it wrong earlier: it's 0 that has to be avoided, not 1. I've corrected that.

@hova88

hova88 commented Jun 18, 2020

I looked it up: num_workers is a setting for CPU-to-GPU data loading, so in theory it should only affect training time. Why would it affect training accuracy?
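One thing num_workers changes besides speed is where the augmentation code in __getitem__ runs: with num_workers=0 it all runs in the main process, while with workers it runs in separate processes whose random-number state is set up differently. Whether that explains the divergence here is unverified; purely as a diagnostic, here is a sketch of making the per-worker numpy seeding explicit, under the assumption (not checked against this repo) that the dataset does numpy-based augmentation in __getitem__:

import numpy as np
import torch
from torch.utils.data import DataLoader

def worker_init_fn(worker_id):
    # torch.initial_seed() already differs per worker; reuse it to seed numpy so every
    # worker draws its augmentations from a distinct, reproducible numpy state.
    np.random.seed(torch.initial_seed() % 2**32)

# Hypothetical usage (train_dataset stands for whatever dataset object train.py builds):
# loader = DataLoader(train_dataset, batch_size=8, shuffle=True,
#                     num_workers=32, worker_init_fn=worker_init_fn)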

@anyang1996
Author

I'm not sure, but you can try it yourself: change num_workers from 0 to 32, keep everything else the same, and watch how the loss changes. The loss trend you just posted looks the same as mine did before.
