Decoupled Async Execute #350

Merged
merged 14 commits into from
Apr 11, 2024

Conversation

kthui
Contributor

@kthui kthui commented Apr 1, 2024

Related PR: triton-inference-server/server#7062

Add support for an async execute function in model.py for decoupled models. Separate calls into the execute function can now overlap, because they all run on the same event loop. Also enable the BLS async exec function inside a decoupled async execute function.
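For illustration, here is a minimal sketch of how two calls into an async execute function can overlap on one event loop. It uses only `asyncio` and runs outside Triton: `FakeRequest`, `FakeResponseSender`, the `final` flag, and `run_overlapping_calls` are hypothetical stand-ins invented for this example, not the python backend API.

```python
import asyncio


class FakeResponseSender:
    """Hypothetical stand-in for a decoupled response sender; it only records sends."""

    def __init__(self, log, name):
        self._log = log
        self._name = name

    def send(self, payload, final=False):
        self._log.append((self._name, payload, final))


class FakeRequest:
    """Hypothetical request carrying a simulated amount of awaited work."""

    def __init__(self, log, name, delay):
        self.name = name
        self.delay = delay
        self._sender = FakeResponseSender(log, name)

    def get_response_sender(self):
        return self._sender


class Model:
    async def execute(self, requests):
        for request in requests:
            sender = request.get_response_sender()
            # Awaiting yields control, so another execute call can make progress.
            await asyncio.sleep(request.delay)
            sender.send(f"result for {request.name}", final=True)
        # Responses go through the sender, so nothing is returned here.
        return None


async def run_overlapping_calls():
    log = []
    model = Model()
    slow = FakeRequest(log, "slow", 0.2)
    fast = FakeRequest(log, "fast", 0.05)
    # Both calls are scheduled on the same event loop and interleave.
    await asyncio.gather(model.execute([slow]), model.execute([fast]))
    return [name for name, _, _ in log]


if __name__ == "__main__":
    print(asyncio.run(run_overlapping_calls()))
```

The call that was started second finishes first because the first one is suspended at its `await`. With a blocking execute function the calls would complete strictly in the order they were issued.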

Closes: triton-inference-server/server#3482

@kthui kthui force-pushed the jacky-py-aio branch 2 times, most recently from 37c8524 to 5333613 Compare April 3, 2024 06:34
@kthui kthui marked this pull request as ready for review April 3, 2024 18:28
@kthui kthui requested a review from Tabrizian April 4, 2024 16:54
@kthui kthui requested a review from GuanLuo April 5, 2024 03:27
@kthui kthui requested a review from Tabrizian April 5, 2024 20:22
@kthui kthui merged commit 0cdcaf3 into main Apr 11, 2024
3 checks passed
@kthui kthui deleted the jacky-py-aio branch April 11, 2024 17:55
mc-nv pushed a commit that referenced this pull request Apr 11, 2024
* Add async decoupled execute

* Enable decoupled bls async exec

* Improve handling for async execute future object

* Add docs for async execute for decoupled model

* Fix link on docs

* Improve docs wording

* Improve destruction steps for async execute future object

* Piggy back on GIL for protection

* Document model should not modify event loop

* Use Python add_done_callback

* Protect infer_payload_

* Use traceback API that supports Python 3.8 and 3.9

* Update docs
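
Two of the commits above mention `add_done_callback` and a traceback API that works on Python 3.8 and 3.9. The commit messages do not say which calls were used, so the following is only a pure-Python sketch of the general pattern, assuming the three-argument form of `traceback.format_exception` (the single-argument form requires Python 3.10 or later). `format_failure`, `failing_execute`, and `main` are hypothetical names for this example.

```python
import asyncio
import traceback


def format_failure(future):
    """Return a formatted traceback if the finished future failed, else None."""
    if future.cancelled():
        return None
    exc = future.exception()
    if exc is None:
        return None
    # Three-argument form, which is accepted on Python 3.8 and 3.9 as well.
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


async def failing_execute():
    """Hypothetical async execute function that fails after yielding once."""
    await asyncio.sleep(0)
    raise ValueError("boom")


async def main():
    errors = []
    task = asyncio.ensure_future(failing_execute())
    # The callback runs on the event loop once the task has finished.
    task.add_done_callback(lambda fut: errors.append(format_failure(fut)))
    await asyncio.gather(task, return_exceptions=True)
    await asyncio.sleep(0)  # give any pending callbacks a chance to run
    return errors


if __name__ == "__main__":
    print(asyncio.run(main())[0])
```

Reporting the failure from a done callback means the error is surfaced even when nothing awaits the future directly.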
Development

Successfully merging this pull request may close these issues.

Is python backend going to support asyncio?
3 participants