This repository has been archived by the owner on Sep 21, 2020. It is now read-only.

test speed flops and parameters #36

Open
Amarintine opened this issue May 29, 2020 · 2 comments

Comments

@Amarintine

Hi, I have tested the network and compared it with MobileNetV2, but GhostNet is not faster even though it has far fewer FLOPs:

    # ghostnet    FLOPs: 147.505M, params: 3.903M
    # mobilenetv2 FLOPs: 312.852M, params: 2.225M

(GhostNet's parameter count is higher.) My timing code:

    x = torch.randn(32, 3, 224, 224)
    t = time.time()
    for _ in range(30):
        with torch.no_grad():
            inputs = x.cuda()
            outputs = model(inputs)
        print(time.time() - t)
        t = time.time()

It seems that MobileNetV2 is faster than GhostNet. What is wrong?

@iamhankai
Owner

iamhankai commented May 29, 2020

On GPUs, when the FLOPs count is small, the main constraint on speed is memory bandwidth rather than compute. Depthwise convolution is not especially fast on GPUs, but it is very fast on CPUs. That is why MobileNet and GhostNet mainly report speed on ARM/CPU.
If you read Chinese, you can refer to https://zhuanlan.zhihu.com/p/122943688 and https://www.zhihu.com/question/339909499.

@iamhankai
Owner

In addition, your code for measuring GPU time is not correct. CUDA kernels are launched asynchronously, so the clock can be read before the forward pass has actually finished; you should call torch.cuda.synchronize() after the forward pass and before timing. Please refer to https://discuss.pytorch.org/t/measuring-gpu-tensor-operation-speed/2513
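A minimal sketch of a corrected timing loop: it warms up first, then calls torch.cuda.synchronize() before starting and before stopping the clock so the measurement covers the asynchronous CUDA work. The tiny model and input sizes are placeholders, not the models from this thread, and it falls back to CPU when CUDA is unavailable.

```python
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder network standing in for GhostNet/MobileNetV2.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()).to(device).eval()
x = torch.randn(8, 3, 64, 64, device=device)

# Warm-up: lets cuDNN select kernels and amortizes one-time setup cost.
with torch.no_grad():
    for _ in range(5):
        model(x)

if device == "cuda":
    torch.cuda.synchronize()  # drain pending kernels before starting the clock
t0 = time.time()
with torch.no_grad():
    for _ in range(30):
        model(x)
if device == "cuda":
    torch.cuda.synchronize()  # wait for all launched kernels to finish
elapsed = time.time() - t0
print(f"avg forward time: {elapsed / 30 * 1000:.3f} ms")
```

Without the second synchronize, `elapsed` would only measure kernel-launch overhead on a GPU, which is why the original loop can rank fast networks incorrectly.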
