WIP Allow Python backend to directly write Numpy arrays to SHM #264
base: r23.05
Conversation
@asos-danielbunting thanks for the PR. I was wondering what use case this PR is trying to address. Is the idea to pre-allocate the buffers in shared memory and work with them directly to speed up the inference process? Could you please share more details about the places where this becomes useful?
Hi @Tabrizian, I'm looking at speeding up the transfer of a large tensor between a Python BLS model doing preprocessing and a TensorFlow inference model. As you say, the idea is to allocate the buffer and write my data directly into it from the Python side, avoiding an extra allocation and copy. I've run a couple of tests, and for my use case this can cut inference time by a decent amount, e.g. for a 100000 x 200 float32 tensor the saving was 30 ms.
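To make the intended workflow concrete, here is a rough sketch of how a BLS model's `execute` might use the proposed binding. This is illustrative only: the `new_shm_tensor` argument order, the output name `OUTPUT0`, and a writable `as_numpy()` view over the shared-memory buffer are all assumptions, since the PR does not document the final signature.

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Hypothetical call from this PR: allocate the output tensor
            # directly in shared memory instead of building a numpy array
            # and letting the backend copy it in afterwards.
            out = pb_utils.new_shm_tensor("OUTPUT0", [100000, 200], np.float32)
            # Write the preprocessed data in place, skipping the extra
            # allocation + copy described above.
            out.as_numpy()[:] = 1.0
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```

The saving comes from where the bytes land: the tensor's backing buffer is allocated straight in the shared-memory pool used to pass tensors between the stub and the server, so the Python side never holds a second process-local copy of the roughly 80 MB tensor (100000 × 200 × 4 bytes).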
@@ -0,0 +1,31 @@
FROM asnpdsacr.azurecr.io/public/tritonserver:23.05-tf2-python-py3
Remove
@@ -431,8 +431,12 @@ Stub::StubSetup()
  py::setattr(
Remove all the changes except the ones in the `src` directory.
  c_python_backend_utils.attr("shared_memory") = py::cast(shm_pool_.get());
  python_backend_utils.attr("shared_memory") = py::cast(shm_pool_.get());
This is not needed.
@@ -494,6 +498,7 @@ Stub::Initialize(bi::managed_external_buffer::handle_t map_handle)
      python_backend_utils, "InferenceResponse",
      c_python_backend_utils.attr("InferenceResponse"));
  c_python_backend_utils.attr("shared_memory") = py::cast(shm_pool_.get());
  python_backend_utils.attr("shared_memory") = py::cast(shm_pool_.get());
Not required.
@@ -1603,6 +1608,8 @@ PYBIND11_EMBEDDED_MODULE(c_python_backend_utils, module)

  py::register_exception<PythonBackendException>(
      module, "TritonModelException");

  module.def("new_shm_tensor", &PbTensor::CreateInSHM, "Creates a new Tensor directly into shared memory");
Can we rename this to `pb.Tensor.new(shape, dtype, device='cpu')`?
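For comparison, here is how the two spellings might look from a model's Python code. Both signatures are guesses for illustration (neither is pinned down in the thread), and `pb` is read here as the `triton_python_backend_utils` module:

```python
import numpy as np
import triton_python_backend_utils as pb_utils

# Spelling added by the PR: a free function on the utils module
# (name and argument order assumed).
t = pb_utils.new_shm_tensor("OUTPUT0", [100000, 200], np.float32)

# Reviewer's suggested spelling: a static factory on the Tensor class,
# with device placement made explicit and defaulting to CPU.
t = pb_utils.Tensor.new(shape=[100000, 200], dtype=np.float32, device="cpu")
```

The factory spelling keeps the allocation API attached to the type it constructs and makes the placement of the backing buffer part of the call.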
      reinterpret_cast<char*>(tensor_shm_ptr) + pb_memory_offset,
      shm_handle + pb_memory_offset, false);
  tensor_shm_ptr->memory = 0;
  std::cout << "Offset is - " << pb_memory_offset << "\n";
Remove print statement.
{

  // Input params of tensor
  // std::vector<int64_t> dims = std::vector<int64_t>({10, 10});
Remove comment.