
Performance issues with onnx_tf #353

Open
Terizian opened this issue Jan 16, 2019 · 5 comments


@Terizian

I developed a model in MATLAB and exported it to the ONNX format. The model is 25 MB and its opset version is 6. I am now trying to move this model to production using the supported Python libraries. My code does the following:

import onnx
from onnx_tf.backend import prepare

model = onnx.load(onnx_path)
tf_rep = prepare(model)

This takes 25.71 seconds to run, which is far too slow for production use.

Additionally, when running the predictions:

output = tf_rep.run(x)

Each prediction takes 4 seconds on average. My target is 100 predictions per second, which currently seems impossible with this framework. What can I try to speed it up?

@tjingrant
Collaborator

Try this: #271.

That issue and its related patch make tf_rep.run significantly faster. As for prepare, you should only run it once and cache the resulting TensorFlow representation for the lifetime of your production application.
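
As a minimal sketch of the caching approach (the module layout and the onnx_path default below are illustrative, not part of onnx-tf itself):

import onnx
from onnx_tf.backend import prepare

_TF_REP = None

def get_tf_rep(onnx_path):
    # Pay the ~25 s prepare() cost once; reuse the cached representation afterwards.
    global _TF_REP
    if _TF_REP is None:
        _TF_REP = prepare(onnx.load(onnx_path))
    return _TF_REP

def predict(x, onnx_path="model.onnx"):
    # Every call after the first reuses the same TensorFlow representation.
    return get_tf_rep(onnx_path).run(x)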

@Terizian
Author

I have looked into the solution suggested in that thread, but I'm not certain where to include the following changes:

tf_rep.sess = tf.Session(graph=tf_rep.graph)

Currently you use batch size 1: float[1,5,224,224], which is inefficient for tf.

It would be really great if you could guide me on this.

@tjingrant
Collaborator

That comment was written because inference without batching is inefficient (due to the lack of parallelism). People usually export models with an explicit batch size (often 1), and this can be a performance bottleneck.
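
As an illustration only: if the model were re-exported with a dynamic batch dimension, requests could be stacked into a single run call instead of 100 separate ones (the shapes below follow the float[1,5,224,224] example from the quoted comment):

import numpy as np

# 100 single-sample inputs of shape [1, 5, 224, 224], stacked into one batch.
# This assumes the model accepts a dynamic batch dimension; a model exported
# with a fixed batch size of 1 would need to be re-exported first.
inputs = [np.random.rand(1, 5, 224, 224).astype(np.float32) for _ in range(100)]
batch = np.concatenate(inputs, axis=0)  # shape [100, 5, 224, 224]
outputs = tf_rep.run(batch)             # one backend call instead of 100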

Moreover, session creation is also very time-consuming, which is what the related PR was trying to resolve. That PR has become a bit outdated and may not work out of the box. @fumihwh, could you try updating your patch (#273)? We should probably merge it ASAP since it's quite useful.
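
For what it's worth, a minimal sketch of where that line would go, assuming the patched backend from #271/#273 in which run() reuses tf_rep.sess instead of opening a new tf.Session on every call:

import onnx
import tensorflow as tf
from onnx_tf.backend import prepare

model = onnx.load(onnx_path)
tf_rep = prepare(model)
# Create the session once, right after prepare(); only the patched
# backend will pick it up and reuse it across run() calls.
tf_rep.sess = tf.Session(graph=tf_rep.graph)

output = tf_rep.run(x)  # no per-call session construction with the patch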

@anuar12

anuar12 commented Mar 8, 2019

It would be great if this PR were merged; I am facing the exact problem it would solve.

@vibhuagrawal14

@Terizian Can you try exporting the graph (using tf_rep.export_graph) and running it directly? If I remember correctly, that improved performance significantly.
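
A rough sketch of that approach under TF 1.x; the file name and the "input:0"/"output:0" tensor names are illustrative and depend on the actual model:

import tensorflow as tf

tf_rep.export_graph("model.pb")  # one-time export of the frozen GraphDef

graph_def = tf.GraphDef()
with tf.gfile.GFile("model.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

sess = tf.Session(graph=graph)  # persistent session, created once
output = sess.run("output:0", feed_dict={"input:0": x})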
