Does steps_per_epoch need to be changed depending on the training set? #90

Open
AnMoran opened this issue Sep 7, 2020 · 5 comments

Comments

@AnMoran

AnMoran commented Sep 7, 2020

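I trained on ArT, LSVT and ReCTS for 180000 steps. The loss has stopped decreasing and sits at around 1.2, and my test results are noticeably worse than the two pb models you provide: yours reach about 85%, mine only about 72%. Could you share the checkpoint that corresponds to your pb models so I can fine-tune from it? Or are there any other training tricks?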

@alexchungio

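The three datasets used in the source code contain 396733 samples in total. The config sets steps_per_epoch=500, gpus=4, batch_size=10, so the number of samples trained per epoch = 500 * 4 * 10 = 20000. That means one epoch cannot cover the whole dataset. I am confused about this as well.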
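
The arithmetic in this comment can be checked with a short sketch. The figures are the ones quoted above; the variable names are illustrative, and the sketch assumes that batch_size is per GPU and that every step consumes a full batch on each GPU. Under those assumptions an "epoch" of 500 steps is a fixed block of steps rather than one pass over the data.

```python
import math

# Figures quoted in the comment above; names are illustrative only.
total_samples = 396733      # combined size of the three datasets
steps_per_epoch = 500
gpus = 4
batch_size = 10             # assumed to be per GPU

samples_per_step = gpus * batch_size
samples_per_epoch = steps_per_epoch * samples_per_step

# Steps needed for one full pass over the data.
steps_for_full_pass = math.ceil(total_samples / samples_per_step)
# Number of 500-step "epochs" needed to see every sample once.
epochs_for_full_pass = math.ceil(total_samples / samples_per_epoch)

print(samples_per_epoch)      # 20000
print(steps_for_full_pass)    # 9919
print(epochs_for_full_pass)   # 20
```

If an epoch were meant to correspond to one full pass, steps_per_epoch would be roughly 9919 under these assumptions. Whether changing it affects the final accuracy is not something this thread establishes.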

@whereitogo

whereitogo commented Dec 1, 2020

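> I trained on ArT, LSVT and ReCTS for 180000 steps. The loss has stopped decreasing and sits at around 1.2, and my test results are noticeably worse than the two pb models you provide: yours reach about 85%, mine only about 72%. Could you share the checkpoint that corresponds to your pb models so I can fine-tune from it? Or are there any other training tricks?

Hello, how did your final training turn out? I would like to try the pb models provided by the author, but I don't know how to get the files out of the docker image. Could you send me a copy? Training is very slow on my side: one epoch takes 30 minutes, and I don't know why!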

@xianzhe-741

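Hello, I ran into two problems while using this project and would like to ask about them:

  1. When running test.py with the model text_recognition_5435.pb from the author's docker image, the line _ = tf.import_graph_def(graph_def, name='') fails with: InvalidArgumentError (see above for traceback): The second input must be a scalar, but it has shape [1,33]
  2. Running train.py fails with:
    File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 119, in __init__
    assert_type(model, ModelDescBase, 'model')
    File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 107, in assert_type
    name, tp.__name__, v.__class__.__name__)
    AssertionError: model has to be type 'ModelDescBase', but an object of type 'AttentionOCR' found.

> I trained on ArT, LSVT and ReCTS for 180000 steps. The loss has stopped decreasing and sits at around 1.2, and my test results are noticeably worse than the two pb models you provide: yours reach about 85%, mine only about 72%. Could you share the checkpoint that corresponds to your pb models so I can fine-tune from it? Or are there any other training tricks?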

@AnMoran
Author

AnMoran commented Dec 15, 2020

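> Hello, I ran into two problems while using this project and would like to ask about them:
>
>   1. When running test.py with the model text_recognition_5435.pb from the author's docker image, the line _ = tf.import_graph_def(graph_def, name='') fails with: InvalidArgumentError (see above for traceback): The second input must be a scalar, but it has shape [1,33]
>   2. Running train.py fails with:
>     File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 119, in __init__
>     assert_type(model, ModelDescBase, 'model')
>     File "/usr/local/lib/python3.5/dist-packages/tensorpack/train/config.py", line 107, in assert_type
>     name, tp.__name__, v.__class__.__name__)
>     AssertionError: model has to be type 'ModelDescBase', but an object of type 'AttentionOCR' found.
>
> I trained on ArT, LSVT and ReCTS for 180000 steps. The loss has stopped decreasing and sits at around 1.2, and my test results are noticeably worse than the two pb models you provide: yours reach about 85%, mine only about 72%. Could you share the checkpoint that corresponds to your pb models so I can fine-tune from it? Or are there any other training tricks?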

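1. I haven't used the author's docker image. I set up a local virtual environment directly from the requirements, and I haven't used the author's models either.
2. It is probably a version issue?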
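
One way to probe the version hypothesis is to look at which classes the model object actually inherits from, since the assertion fires when the model is not an instance of the ModelDescBase that the installed tensorpack checks against. The sketch below is a generic diagnostic and not part of the repository: `class_lineage` is a hypothetical helper, and the two stand-in classes only mimic the situation in the traceback.

```python
def class_lineage(obj):
    """Return 'module.ClassName' for every class in the object's MRO."""
    return ["{}.{}".format(c.__module__, c.__qualname__)
            for c in type(obj).__mro__]


# Stand-in classes for illustration only; they are NOT tensorpack's classes.
class ModelDescBase(object):
    pass


class AttentionOCR(object):   # deliberately does not inherit from ModelDescBase
    pass


model = AttentionOCR()

# This is the kind of isinstance check that fails in the traceback.
print(isinstance(model, ModelDescBase))

# Print the full inheritance chain with module paths.
for name in class_lineage(model):
    print(name)
```

In the real environment, calling such a helper on the model instance built in train.py and comparing the printed module paths with the installed tensorpack version (for example from `pip show tensorpack`) may show whether the model's base class comes from a different place than the trainer expects.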

@xianzhe-741

xianzhe-741 commented Dec 15, 2020 via email
