Bindless support #36
Comments
apparently undocumented. MoltenVK allows 500k if argument buffer tier 2 is supported (why?) and 8 otherwise.
New Mac/iOS feature to track residency of resources. According to Apple:
Bindless is quite messy in every API, so Tempest needs a nice top-level API with a reasonable underlying implementation.
GLSL
GLSL is the main language in Tempest, so a dedicated section is a must. GLSL offers 2 ways:
Engine-side
Doesn't fit the engine perfectly: support for separate (non-combined) samplers and textures has to be added on top of it.
Vulkan
Caps-list:
VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT can be used (in theory), but only for the very last binding in a descriptor set, which doesn't fit the GLSL side.
Alternatively, it's sufficient to use VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT with a very large descriptor array. The size of the array has to be defined in C++ upfront, at VkDescriptorSetLayout creation. The current implementation of Tempest can recreate VkDescriptorSetLayout and VkDescriptorSet on the go if the preallocated array is not big enough. But that also requires recreating VkPipeline at runtime, based on descriptor set size, which is hard to implement without extra performance cost.
VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT is useless by itself, but the spec defines special behavior for this type of descriptor:
Naturally, as there is only a single descriptor set, can just take the min of the PerStage and DescriptorSet limits.
Other limits to concern (obsolete):
With such limits, realloc has to manage per-stage + per-resource + per-set limits somehow.
DirectX12
Note: Tempest uses spirv-cross to generate HLSL, except the produced HLSL is not valid:
Apparently spirv-cross follows the VARIABLE_DESCRIPTOR_COUNT workflow. This maps directly to D3D12_DESCRIPTOR_RANGE::NumDescriptors = -1, with the same limitation of only one runtime array per set. In theory, this can be worked around by instrumenting SPIR-V:
OpDecorate %tex DescriptorSet 0 -> OpDecorate %tex DescriptorSet UNIQ_SPACE
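The SPIR-V instrumentation idea above could look roughly like the following. This is only a sketch, not what Tempest or spirv-cross actually do: the function name remapDescriptorSet is hypothetical, and it assumes the module is available as a vector of 32-bit words with the standard 5-word header. The opcode and decoration values (OpDecorate = 71, DescriptorSet = 34) are taken from the SPIR-V specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Numeric values from the SPIR-V specification.
constexpr uint32_t kOpDecorate              = 71;
constexpr uint32_t kDecorationDescriptorSet = 34;
constexpr size_t   kHeaderWords             = 5;

// Hypothetical helper: rewrites `OpDecorate %targetId DescriptorSet N`
// so that it uses `newSet` instead of N.
// Returns the number of patched instructions.
inline size_t remapDescriptorSet(std::vector<uint32_t>& spirv,
                                 uint32_t targetId, uint32_t newSet) {
  assert(spirv.size() >= kHeaderWords && "expects a complete SPIR-V module");
  size_t patched = 0;
  size_t i       = kHeaderWords;
  while(i < spirv.size()) {
    // First word of an instruction: high 16 bits = word count, low 16 bits = opcode.
    const uint32_t wordCount = spirv[i] >> 16;
    const uint32_t opcode    = spirv[i] & 0xFFFFu;
    if(wordCount == 0 || i + wordCount > spirv.size())
      break; // malformed stream, stop instead of reading out of bounds
    if(opcode == kOpDecorate && wordCount == 4 &&
       spirv[i + 1] == targetId && spirv[i + 2] == kDecorationDescriptorSet) {
      spirv[i + 3] = newSet;
      ++patched;
    }
    i += wordCount;
  }
  return patched;
}
```

Giving every runtime array its own set number (UNIQ_SPACE above) would then presumably give each of them its own register space in the generated HLSL, so each can be declared unbounded.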
Limits:
ID3D12GraphicsCommandList::SetDescriptorHeaps
Only one descriptor heap of each type can be set at one time, which means a maximum of 2 heaps (one sampler, one CBV/SRV/UAV) can be set at one time.
DX12 is a bit awkward, because the limit is shared by all descriptor types except samplers. Probably can "just" split the heap into equal partitions.
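A minimal sketch of the "equal partitions" idea, assuming the shared CBV/SRV/UAV heap is treated as a flat range of descriptor slots. HeapPartition and partitionOf are hypothetical names for illustration and are not part of Tempest or the D3D12 API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one slice of the shared descriptor heap.
struct HeapPartition {
  uint32_t offset; // first descriptor slot of the partition
  uint32_t count;  // number of descriptor slots in the partition
};

// Splits a heap of `heapSize` descriptors into `parts` equal partitions and
// returns partition number `index`. Slots left over by the integer division
// stay unused at the end of the heap.
inline HeapPartition partitionOf(uint32_t heapSize, uint32_t parts, uint32_t index) {
  assert(parts > 0 && index < parts);
  const uint32_t count = heapSize / parts;
  return HeapPartition{index * count, count};
}
```

For example, partitionOf(1000000, 2, 1) describes the second half of a 1,000,000-slot heap (an example size, not a queried device limit), which could hold the UAV array while the first half holds textures.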
Metal [3]
Limits (per-app resources available at any given time are):
For both tiers, the maximum number of argument buffer entries in each function argument table is 8.
*Writable textures aren’t supported within an argument buffer.
Tier 1 argument buffers can’t be accessed through pointer indexing, nor can they include pointers to other argument buffers.
Tier 2 argument buffers can be accessed through pointer indexing, as shown in the following example.
T1 argument buffers are practically the same as descriptor sets in Vulkan and offer nothing useful beyond that.
T2 allows pointer indexing and can be leveraged for a bindless array.
Sources:
https://gist.github.com/DethRaid/0171f3cfcce51950ee4ef96c64f59617
https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_descriptor_range
https://learn.microsoft.com/en-us/windows/win32/direct3d12/hardware-support?redirectedfrom=MSDN
https://developer.apple.com/documentation/metal/buffers/about_argument_buffers
https://developer.apple.com/documentation/metal/buffers/managing_groups_of_resources_with_argument_buffers
GLSL
An unbound array of descriptors has 2 meanings:
Base spec: uniform sampler2D tex[] -> OpTypeArray %8 %uint_1. The size of the array depends on the highest index used in the code.
GL_EXT_nonuniform_qualifier: may work the same as the base spec if a runtime index is not in use; otherwise uniform sampler2D tex[] -> OpTypeRuntimeArray %8 // legal only if the driver supports descriptor indexing
Engine side
[wip]
Generally, a Metal-like model is a good middle ground:
In DX, UAV/Tex can be achieved by splitting the heap in 2 parts
In Vulkan, UAV is probably the min over all applicable resources
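The Vulkan side of this (take the min of the per-stage and per-set limits, then grow the preallocated array on demand) might be sketched as below. BindlessLimits, maxArraySize and grownCapacity are hypothetical names; in real code the two numbers would be read from the device, for example from fields such as maxPerStageDescriptorUpdateAfterBindSampledImages and maxDescriptorSetUpdateAfterBindSampledImages. The doubling policy is an assumption, not the current Tempest behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the two device limits that bound a bindless array.
struct BindlessLimits {
  uint32_t perStage; // per-stage limit for the resource type
  uint32_t perSet;   // per-descriptor-set limit for the resource type
};

// With a single descriptor set, the usable array size is the smaller limit.
inline uint32_t maxArraySize(const BindlessLimits& l) {
  return std::min(l.perStage, l.perSet);
}

// Doubling growth policy clamped to the device limit.
// Assumes `current` does not already exceed the limit.
// Returns 0 when `required` descriptors cannot fit at all.
inline uint32_t grownCapacity(uint32_t current, uint32_t required,
                              const BindlessLimits& l) {
  const uint32_t cap = maxArraySize(l);
  if(required > cap)
    return 0;
  uint32_t size = std::max<uint32_t>(current, 1u);
  while(size < required) {
    // Jump straight to the cap when doubling would overshoot it.
    size = (size > cap / 2) ? cap : size * 2;
  }
  return size;
}
```

Doubling keeps the number of VkDescriptorSetLayout and VkPipeline recreations logarithmic in the final array size, which is the expensive part noted in the Vulkan section.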