Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vvl: Rework accel structs mem aliasing validation #9492

Closed

Conversation

arno-lunarg
Copy link
Contributor

@arno-lunarg arno-lunarg commented Feb 20, 2025

Closes #9428

In acceleration structures build commands, when looking for overlaps between source acceleration structures, destination acceleration structures, and scratch buffers, do it using device address ranges instead of relying on the device memory objects, and sort device address ranges.
=> Since overlap detection is done when a new device address range is inserted among the previously seen ones, and insertion is done using dichotomy (faster and ok since address ranges are sorted), not all overlap will be detected, but if there is any, it will be detected.
For instance, if all ranges are equal, inserting yet another equal range will likely result in it being directly placed at the beginning or end of the list, with only one overlap detected with first of last element.
Given the performance gains, and the fact that most calls done by apps to vkCmdBuildAccelerationStructuresKHR are expected to be valid so no memory overlaps will be detected, this compromise is well worth it.

Doing some stress test, where I build 8000 acceleration structures in one go, PreCallValidateCmdBuildAccelerationStructuresKHR went from ~4 minutes down to ~200 ms, and the overlap validation itself is about ~100 ms

Still need to do the same change at other spots, but this one in particular was the issue, so let's get this in first

@arno-lunarg arno-lunarg requested a review from a team as a code owner February 20, 2025 14:21
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376152.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from c4d1142 to 3856539 Compare February 20, 2025 14:23
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376167.

@arno-lunarg
Copy link
Contributor Author

arno-lunarg commented Feb 20, 2025

The number of detected overlaps in tests will also now depends on how std::lower_bound is implemented, hopefully all implementations work the same way

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19127 running.

@artem-lunarg
Copy link
Contributor

artem-lunarg commented Feb 20, 2025

@arno-lunarg std::lower_bound (and other stl algorithms) has some guarantees how it works. We can rely on them. If that's something not coverd by those rules (or can't be derived from them) then we can't.
https://en.cppreference.com/w/cpp/algorithm/lower_bound (or the spec)

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19127 failed.

// elements.
// => due to validation code optimisations, not *all* overlaps will be detected,
// but if there is *at least one*, it will *always+ be detected.
m_errorMonitor->SetDesiredError("VUID-vkCmdBuildAccelerationStructuresKHR-dstAccelerationStructure-03703");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
m_errorMonitor->SetDesiredError("VUID-vkCmdBuildAccelerationStructuresKHR-dstAccelerationStructure-03703");
m_errorMonitor->SetDesiredError("VUID-vkCmdBuildAccelerationStructuresKHR-dstAccelerationStructure-03703", 3);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes!

break;
}
case AddressRangeOrigin::DstAccelStruct: {
const auto other_dst_as_state =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly, will take care of this one and the others

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a ASSERT_AND_RETURN type thing should be good

break;
}
case AddressRangeOrigin::DstAccelStruct: {
const auto other_dst_as_state =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, can this be null?


{
const rt::BuildType rt_build_type =
error_obj.location.function == Func::vkBuildAccelerationStructuresKHR ? rt::BuildType::Host : rt::BuildType::Device;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be removed, this check is only for device call now I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes indeed!

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from 3856539 to 4c15ced Compare February 20, 2025 15:57
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376329.

void Device::PreCallRecordCreateBuffer(VkDevice device, const VkBufferCreateInfo *pCreateInfo,
const VkAllocationCallbacks *pAllocator, VkBuffer *pBuffer, const RecordObject &record_obj,
chassis::CreateBuffer &chassis_state) {
if (pCreateInfo->usage & VK_BUFFER_USAGE_ACCELERATION_STRUCTURE_STORAGE_BIT_KHR) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is playing a bit with fire, and there is a chance this might cause a false positive if we have now altered what the drive reports back and the user will see a different result with and without validation

Another option/idea is if they don't have the VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT flag, we just don't check it.

I guess personally want to be on the side of "no false positive" to trying to validate everything, but at the same time, not sure if this will "save many mistakes" that otherwise would be missed if we don't enforce it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is a compromise, for now I deem it well worth it. VVL will still report an error when application forgot to set this flag, I don't expect major problems around here, but we will see.
It is impossible to not have shader device address when using acceleration structure, at least because you need them for the scratch buffer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this would not be the first time a debug/validation tool modifies the behavior of the targeted app, just look at GPU-AV, we do things far more horrible!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but this is not GPU-AV and we have had issues where pipeline creation was setting ignored fields to null and caused lots of headaches

basically people don't expect core validation to alter things, just a read only validation... again, should be fine, but will keep our eyes on the lookout for trouble

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I will keep this change in mind should ray tracing weirdness come

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from 4c15ced to e25bba1 Compare February 20, 2025 15:59
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376344.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19130 running.

Copy link
Contributor

@spencer-lunarg spencer-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - when null check comments are addressed

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from e25bba1 to 8f40b0f Compare February 20, 2025 16:10
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376377.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19132 running.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from 8f40b0f to 7e5a918 Compare February 20, 2025 16:19
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376393.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19133 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19133 failed.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from 7e5a918 to 00898b6 Compare February 20, 2025 17:09
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376507.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19138 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19138 failed.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 376717.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19144 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19144 failed.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from a9a9c7f to f0f2b76 Compare February 21, 2025 09:07
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 377060.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19151 running.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from f0f2b76 to ed78078 Compare February 21, 2025 09:13
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 377076.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19152 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19152 failed.

@arno-lunarg arno-lunarg force-pushed the arno-rt-validation-perf-2 branch from ed78078 to d0effb9 Compare February 21, 2025 09:58
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 377104.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19153 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 19153 failed.

@arno-lunarg
Copy link
Contributor Author

After getting some info back from the working group, it seems that relying on device addresses to look for memory overlaps will not work, at least because of this:
If you have 2 sparse buffers, bound to the same memory, they can still return different base device addresses....

So I need to revamp this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bad performance around validation for acceleration structures
4 participants