
Adding ortvalue features support for MGX EP #81

Merged
merged 10 commits into rocm6.3_internal_testing from urpetkov/ortvalue_support on Jan 14, 2025

Conversation

urpetkov-amd

This PR implements the ortvalue_from_numpy() and ortvalue_from_shape_and_type() features for the MGX EP on Windows and Linux, to get better performance for llama2 int4 model execution. Some methods are overridden and others are implemented similarly to the ROCm EP. These features significantly reduce the time needed for creating and copying tensors; almost all of the runtime is now spent on the GPU, which yields much better tok/s performance on our GPUs. A similar option was added for the ROCm EP.
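As a usage sketch (not taken from the PR itself): with this change, Python code should be able to create device-resident tensors through the standard OrtValue helpers and bind them for inference. The device string "migraphx", the model path, and the tensor names below are assumptions for illustration:

```python
import numpy as np
import onnxruntime as ort

# Create an OrtValue directly on the GPU from a numpy array.
# The device string "migraphx" is an assumption based on this PR's naming;
# use whatever string the MGX EP binding actually registers.
x = np.random.rand(1, 256).astype(np.float32)
x_gpu = ort.OrtValue.ortvalue_from_numpy(x, device_type="migraphx", device_id=0)

# Pre-allocate an output tensor on the same device so results stay on GPU.
y_gpu = ort.OrtValue.ortvalue_from_shape_and_type(
    [1, 256], np.float32, device_type="migraphx", device_id=0
)

# Bind both tensors so inference runs without per-call host/device copies.
# Model path and tensor names are hypothetical.
sess = ort.InferenceSession("model.onnx", providers=["MIGraphXExecutionProvider"])
binding = sess.io_binding()
binding.bind_ortvalue_input("input", x_gpu)
binding.bind_ortvalue_output("output", y_gpu)
sess.run_with_iobinding(binding)
```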

@urpetkov-amd added the enhancement (New feature or request) label Dec 20, 2024
if (!IsRocmDeviceIdValid(logging::LoggingManager::DefaultLogger(), device.Id())) {
throw std::runtime_error("The provided device id doesn't match any available GPUs on the machine.");
}
allocator = GetRocmAllocator(device.Id());


We might be in an odd situation here, as our offering includes both the MIGraphX and ROCm EPs; should we get both allocators? Did you test this when building both the MIGraphX and ROCm EPs? How does the allocator work in that case?

urpetkov-amd (Author)


I tested it with both USE_ROCM and USE_MIGRAPHX enabled and it works well. In that case I think it doesn't matter which allocator is used, because both of them target the GPU. If you think it can be written better, I can change it, or add an extra condition so that each of them uses the specified ROCm or MGX EP.

// make it stream aware
true,
// enable cross stream sharing?
false);


Is this something we want to make controllable from the API later?

urpetkov-amd (Author)


It might be. I wrote it following the ROCm EP implementation in rocm_execution_provider.cc; it's essentially the same. If that's the situation in the ROCm EP implementation, then the answer is yes.
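If it were exposed later, one plausible route would be a MIGraphX provider option; a hypothetical sketch follows (the key enable_cross_stream_sharing does not exist today and is purely illustrative, and the model path is made up):

```python
import onnxruntime as ort

# Hypothetical: "enable_cross_stream_sharing" is not a real MIGraphX EP
# option; this only illustrates how such a toggle could surface in the API.
sess = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    providers=[
        ("MIGraphXExecutionProvider", {"device_id": 0,
                                       "enable_cross_stream_sharing": "0"}),
        "CPUExecutionProvider",
    ],
)
```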

@TedThemistokleous left a comment


Thanks for the contribution!

A few questions about this; overall it looks good.

I've added questions and comments inline. One point concerns combined ROCm/MIGraphX EP builds and whether you've tested with both.

@TedThemistokleous

Also, if you can, download and use lintrunner in your environment to fix the lint issue; it'll make upstreaming easier:

`lintrunner -a` on a Linux-based system

@urpetkov-amd (Author)

Thanks for the review. I changed what was needed and hope that answers your questions. I'll use lintrunner to fix the issues in the next few days.

@urpetkov-amd (Author)

The Python lint formatting issue has been fixed.

@TedThemistokleous

Have you done a build with both the ROCm and MIGraphX EPs?

I'm seeing this during baseline unit tests from a clean build (rocm6.3_internal_testing + this branch). Can you resolve this on the Linux side with both EPs?

```
[ RUN ] InferenceSessionTests.CheckRunProfilerWithSessionOptions
terminate called after throwing an instance of 'onnxruntime::OnnxRuntimeException'
what(): /onnxruntime/onnxruntime/core/providers/rocm/rocm_call.cc:124 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::RocmCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = hipError_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
/onnxruntime/onnxruntime/core/providers/rocm/rocm_call.cc:117 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::RocmCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = hipError_t; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void]
HIP failure 100: no ROCm-capable device is detected ; GPU=-1 ; hostname=aus-navi3x-04 ; file=/onnxruntime/onnxruntime/core/providers/rocm/rocm_common.h ; line=66 ; expr=hipGetDeviceProperties(&deviceProp, 0);
```
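For reference, on the Python side this kind of failure typically surfaces when the provider initializes at session creation, so a fallback can be sketched as below (a minimal sketch with a hypothetical model path; this is not a fix for the C++ unit test itself):

```python
import onnxruntime as ort

# The ROCm EP typically probes the device while the provider initializes
# at session creation, which is where an error such as "HIP failure 100:
# no ROCm-capable device is detected" would surface in Python.
try:
    sess = ort.InferenceSession(
        "model.onnx",  # hypothetical model path
        providers=["MIGraphXExecutionProvider", "ROCMExecutionProvider",
                   "CPUExecutionProvider"],
    )
except Exception as err:
    print(f"GPU provider initialization failed ({err}); using CPU only.")
    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
```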

@urpetkov-amd (Author)

I built with onnxruntime_USE_COMPOSABLE_KERNEL=OFF. I suppose that difference is causing the error, because I'm not seeing any errors on my side; all tests passed and I've verified the functionality. I'm rebuilding with onnxruntime_USE_COMPOSABLE_KERNEL=ON now to see what happens.

@TedThemistokleous merged commit f661e4f into rocm6.3_internal_testing Jan 14, 2025
11 of 15 checks passed
@TedThemistokleous deleted the urpetkov/ortvalue_support branch January 14, 2025 19:50
@TedThemistokleous

Confirmed the build on a stock ROCm 6.3 ONNX Runtime image from upstream. I'll also cherry-pick this into ROCm 6.4 internal_testing so it's part of QA's cycle.

TedThemistokleous pushed a commit that referenced this pull request Jan 17, 2025

* Adding ortvalue support for MGX EP

---------

authored-by: Uros Petkovic <[email protected]>