Change log for August 25, 2023 Vulkan 1.3.262 spec update:
Github Issues

* Remove ename:VK_ACCESS_2_UNIFORM_READ_BIT from ename:VK_ACCESS_SHADER_READ_BIT (public issue 2169).
* Fix typo in description of slink:VkPerformanceCounterResultKHR (public issue 2186).
* Fix missing vkQueue* row in command properties tables (public PR 2189).
* Specify that some <<spirvenv-module-validation-runtime, Runtime SPIR-V Validation>> VUs apply only to compute shaders (public PR 2199).
* Misc. markup fixes (public pull request 2200).
* Specify when compute pipeline metadata must be saved in language following slink:VkPipelineShaderStageModuleIdentifierCreateInfoEXT (HansKristian-Work/vkd3d-proton#1639, internal MR 6058).

Internal Issues

* Fix MSRTSS + dynamic sample count + single sampling interactions in common draw VUs (internal issue 3474).
* Fixes to block processing and attribute handling when generating `validusage.json` (internal issue 3576).
* Generate implicit `-parent` VUs for more distant ancestor relationships if the immediate parent type is not present (internal issue 3582).
* Fix ename:VK_DYNAMIC_STATE_ALPHA_TO_ONE_ENABLE_EXT interaction with apiext:VK_EXT_extended_dynamic_state3 in VUs for slink:VkGraphicsPipelineCreateInfo (internal issue 3587).
* Clarify that synchronization commands do not execute pipeline stages in the <<synchronization-pipeline-stages, Pipeline Stages>> section (internal issue 3592).
* VU fixes for apiext:VK_KHR_maintenance5 (internal MR 6032).
* Update apiext:VK_KHR_dynamic_rendering proposal document to remove a sentence that should not have been there (internal MR 6047).
* Add ray tracing VUs for core synchronization access flags to match the 64-bit versions (internal MR 6047).
* Add a CI test to check for include:: paths not starting with a recognized path attribute (internal MR 6063).
* Make slink:VkBufferCreateInfo::pname:usage `noautovalidity` in XML and allow for elink:VkBufferUsageFlags2KHR flags (internal MR 6066).
* Expand CI test checking for missing asciidoctor attributes to the refpage build, catching additional cases (internal MR 6071).
* Remove explicit VU for slink:VkImageFormatProperties2 requiring the <<fetaures-hostImageCopy, pname:hostImageCopy>> feature to be enabled if slink:VkHostImageCopyDevicePerformanceQueryEXT is included in a pname:pNext chain (internal MR 6074).

New Extensions

* apiext:VK_QCOM_filter_cubic_clamp
* apiext:VK_QCOM_filter_cubic_weights
* apiext:VK_QCOM_image_processing2
* apiext:VK_QCOM_ycbcr_degamma
Showing 48 changed files with 1,754 additions and 523 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
// Copyright 2023 The Khronos Group Inc. | ||
// | ||
// SPDX-License-Identifier: CC-BY-4.0 | ||
|
||
include::{generated}/meta/{refprefix}VK_QCOM_filter_cubic_clamp.adoc[] | ||
|
||
=== Other Extension Metadata | ||
|
||
*Last Modified Date*:: | ||
2023-08-02 | ||
*Contributors*:: | ||
- Jeff Leger, Qualcomm Technologies, Inc. | ||
|
||
=== Description | ||
|
||
This extension extends cubic filtering by adding the ability to enable an | ||
anti-ringing clamp. | ||
Cubic filtering samples from a 4x4 region of texels and computes a cubic | ||
weighted average of the region. | ||
In some cases, the resulting value is outside the range of any of the texels | ||
in the 4x4 region. | ||
This is sometimes referred to as "`filter overshoot`" or "`filter ringing`" | ||
and can occur when there is a sharp discontinuity in the 4x4 region being | ||
filtered. | ||
For some use cases this "`ringing`" can produce unacceptable artifacts. | ||
|
||
The solution to the ringing problem is to clamp the post-cubic-filtered | ||
value to be within the max and min of texel values in the 4x4 region. | ||
While such "`range clamping`" can be performed in shader code, the | ||
additional texture fetches and clamping ALU operations can be costly. | ||
|
||
Certain Adreno GPUs are able to perform the range clamp in the texture unit | ||
during cubic filtering at significant performance/power savings versus a | ||
shader-based clamping approach. | ||
This extension exposes such hardware functionality. | ||
|
||
This extension extends elink:VkSamplerReductionMode, adding | ||
ename:VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE_RANGECLAMP_QCOM which | ||
enables the range clamp operation. | ||
|
||
include::{generated}/interfaces/VK_QCOM_filter_cubic_clamp.adoc[] | ||
|
||
=== Version History | ||
|
||
* Revision 1, 2023-08-02 (jleger) | ||
** Initial version |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
// Copyright 2023 The Khronos Group Inc. | ||
// | ||
// SPDX-License-Identifier: CC-BY-4.0 | ||
|
||
include::{generated}/meta/{refprefix}VK_QCOM_filter_cubic_weights.adoc[] | ||
|
||
=== Other Extension Metadata | ||
|
||
*Last Modified Date*:: | ||
2023-06-23 | ||
*Contributors*:: | ||
- Jeff Leger, Qualcomm Technologies, Inc. | ||
- Jonathan Wicks, Qualcomm Technologies, Inc. | ||
|
||
=== Description | ||
|
||
This extension extends cubic filtering by adding the ability to select a set | ||
of weights. | ||
Without this extension, the weights used in cubic filtering are limited to | ||
those corresponding to a Catmull-Rom spline. | ||
This extension adds support for three additional sets of spline weights. | ||
|
||
This extension adds a new structure that can: be added to the pname:pNext | ||
chain of slink:VkSamplerCreateInfo to specify which set of cubic weights is | ||
used in cubic filtering. | ||
A similar structure can be added to the pname:pNext chain of | ||
slink:VkBlitImageInfo2 to specify cubic weights used in a blit operation. | ||
|
||
With this extension, weights corresponding to the following additional | ||
splines can be selected for cubic filtered sampling and blits: | ||
|
||
* Zero Tangent Cardinal | ||
* B-Spline | ||
* Mitchell-Netravali | ||
|
||
include::{generated}/interfaces/VK_QCOM_filter_cubic_weights.adoc[] | ||
|
||
=== Version History | ||
|
||
* Revision 1, 2023-06-23 (jleger) | ||
** Initial version |
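To make the idea of selectable cubic weights in the VK_QCOM_filter_cubic_weights text above concrete, here is a hedged sketch that derives the four tap weights from the well-known Mitchell-Netravali (B, C) kernel family. The (B, C) pairs used for Catmull-Rom, B-Spline, and Mitchell-Netravali are the textbook values, not taken from this commit; the exact weights the extension uses are defined by the specification and may differ. Zero Tangent Cardinal is left out because the text above does not give its weights. All names are hypothetical.

```c
#include <math.h>

/* Mitchell-Netravali BC-spline kernel evaluated at distance x from a tap. */
static float bc_kernel(float B, float C, float x) {
    float x2, x3;
    x = fabsf(x);
    x2 = x * x;
    x3 = x2 * x;
    if (x < 1.0f) {
        return ((12.0f - 9.0f * B - 6.0f * C) * x3 +
                (-18.0f + 12.0f * B + 6.0f * C) * x2 +
                (6.0f - 2.0f * B)) / 6.0f;
    }
    if (x < 2.0f) {
        return ((-B - 6.0f * C) * x3 +
                (6.0f * B + 30.0f * C) * x2 +
                (-12.0f * B - 48.0f * C) * x +
                (8.0f * B + 24.0f * C)) / 6.0f;
    }
    return 0.0f;
}

/* Hypothetical selector for this sketch; not the extension's enum. */
typedef enum {
    SPLINE_CATMULL_ROM,
    SPLINE_B_SPLINE,
    SPLINE_MITCHELL_NETRAVALI
} spline_kind;

/* Weights for taps at positions -1, 0, 1, 2 when sampling at offset t in [0, 1]. */
void cubic_weights(spline_kind kind, float t, float w[4]) {
    float B = 0.0f;  /* defaults correspond to Catmull-Rom */
    float C = 0.5f;
    switch (kind) {
    case SPLINE_B_SPLINE:           B = 1.0f;        C = 0.0f;        break;
    case SPLINE_MITCHELL_NETRAVALI: B = 1.0f / 3.0f; C = 1.0f / 3.0f; break;
    case SPLINE_CATMULL_ROM:        break;
    }
    w[0] = bc_kernel(B, C, t + 1.0f);
    w[1] = bc_kernel(B, C, t);
    w[2] = bc_kernel(B, C, 1.0f - t);
    w[3] = bc_kernel(B, C, 2.0f - t);
}
```

In this family the weights always sum to 1. Catmull-Rom interpolates the texels and has negative lobes, while B-Spline weights are never negative, so it smooths rather than overshoots.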
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,142 @@ | ||
// Copyright (c) 2023 Qualcomm Technologies, Inc. | ||
// | ||
// SPDX-License-Identifier: CC-BY-4.0 | ||
|
||
include::{generated}/meta/{refprefix}VK_QCOM_image_processing2.adoc[] | ||
|
||
=== Other Extension Metadata | ||
|
||
*Last Modified Date*:: | ||
2023-03-10 | ||
*Interactions and External Dependencies*:: | ||
- This extension requires | ||
{spirv}/QCOM/SPV_QCOM_image_processing.html[`SPV_QCOM_image_processing2`] | ||
- This extension provides API support for | ||
{GLSLregistry}/qcom/GLSL_QCOM_image_processing2.txt[`GL_QCOM_image_processing2`] | ||
|
||
*Contributors*:: | ||
- Jeff Leger, Qualcomm Technologies, Inc. | ||
|
||
=== Description | ||
|
||
This extension enables support for the SPIR-V code:TextureBlockMatch2QCOM | ||
capability. | ||
It builds on the functionality of QCOM_image_processing with the addition of | ||
4 new image processing operations. | ||
|
||
* The code:OpImageBlockMatchWindowSADQCOM SPIR-V instruction builds upon | ||
the functionality of code:OpImageBlockMatchSADQCOM by repeatedly | ||
performing block match operations across a 2D window. | ||
The "`2D windowExtent`" and "`compareMode`" are specified by | ||
slink:VkSamplerBlockMatchWindowCreateInfoQCOM in the sampler used to | ||
create the _target image_. | ||
Like code:OpImageBlockMatchSADQCOM, code:OpImageBlockMatchWindowSADQCOM | ||
computes an error metric that describes whether a block of texels in | ||
the _target image_ matches a corresponding block of texels in the | ||
_reference image_. | ||
Unlike code:OpImageBlockMatchSADQCOM, this instruction computes an error | ||
metric at each (X,Y) location within the 2D window and returns either | ||
the minimum or maximum error. | ||
The instruction only supports single-component formats. | ||
Refer to the pseudocode below for details. | ||
* The code:OpImageBlockMatchWindowSSDQCOM instruction follows the same | ||
pattern, computing the SSD error metric at each location within the 2D | ||
window. | ||
* The code:OpImageBlockMatchGatherSADQCOM instruction builds upon | ||
code:OpImageBlockMatchSADQCOM. | ||
This instruction computes an error metric that describes whether a | ||
block of texels in the _target image_ matches a corresponding block of | ||
texels in the _reference image_. | ||
The instruction computes the SAD error metric at 4 texel offsets and | ||
returns the error metric for each offset in the X, Y, Z, and W components. | ||
The instruction only supports single-component texture formats. | ||
Refer to the pseudocode below for details. | ||
* The code:OpImageBlockMatchGatherSSDQCOM instruction follows the same | ||
pattern, computing the SSD error metric for 4 offsets. | ||
|
||
Each of the above 4 image processing instructions is limited to | ||
single-component formats. | ||
|
||
Below is the pseudocode for the GLSL built-in function | ||
code:textureBlockMatchWindowSADQCOM. | ||
The pseudocode for code:textureBlockMatchWindowSSDQCOM is identical other | ||
than replacing all instances of `"SAD"` with `"SSD"`. | ||
|
||
[source,c] | ||
---- | ||
vec4 textureBlockMatchWindowSAD( sampler2D target, | ||
uvec2 targetCoord, | ||
sampler2D reference, | ||
uvec2 refCoord, | ||
uvec2 blocksize) { | ||
// compareMode (MIN or MAX) comes from the VkSampler associated with `target` | ||
// uvec2 window comes from the VkSampler associated with `target` | ||
float minSAD = INF; | ||
float maxSAD = -INF; | ||
uvec2 minCoord; | ||
uvec2 maxCoord; | ||
for (uint x=0; x < window.width; x++) { | ||
for (uint y=0; y < window.height; y++) { | ||
float SAD = textureBlockMatchSAD(target, | ||
targetCoord + uvec2(x, y), | ||
reference, | ||
refCoord, | ||
blocksize).x; | ||
// Note: the below comparison operator will produce undefined results | ||
// if SAD is a denorm value. | ||
if (SAD < minSAD) { | ||
minSAD = SAD; | ||
minCoord = uvec2(x,y); | ||
} | ||
if (SAD > maxSAD) { | ||
maxSAD = SAD; | ||
maxCoord = uvec2(x,y); | ||
} | ||
} | ||
} | ||
if (compareMode == MIN) { | ||
return vec4(minSAD, minCoord.x, minCoord.y, 0.0); | ||
} else { | ||
return vec4(maxSAD, maxCoord.x, maxCoord.y, 0.0); | ||
} | ||
} | ||
---- | ||
|
||
Below is the pseudocode for code:textureBlockMatchGatherSADQCOM. | ||
The pseudocode for code:textureBlockMatchGatherSSDQCOM follows an identical | ||
pattern. | ||
|
||
[source,c] | ||
---- | ||
vec4 textureBlockMatchGatherSAD( sampler2D target, | ||
uvec2 targetCoord, | ||
sampler2D reference, | ||
uvec2 refCoord, | ||
uvec2 blocksize) { | ||
// `result` is used because `out` is a reserved keyword in GLSL | ||
vec4 result; | ||
for (uint x=0; x<4; x++) { | ||
float SAD = textureBlockMatchSAD(target, | ||
targetCoord + uvec2(x, 0), | ||
reference, | ||
refCoord, | ||
blocksize).x; | ||
result[x] = SAD; | ||
} | ||
return result; | ||
} | ||
---- | ||
|
||
include::{generated}/interfaces/VK_QCOM_image_processing2.adoc[] | ||
|
||
=== Issues | ||
|
||
1) What is the precision of the min/max comparison checks? | ||
|
||
*RESOLVED*: Intermediate computations for the new operations are performed | ||
at 16-bit floating point precision. | ||
If the value of `"float SAD"` in the above code sample is a 16-bit denorm | ||
value, then the behavior of the MIN/MAX comparison is undefined. | ||
|
||
=== Version History | ||
|
||
* Revision 1, 2023-03-10 (Jeff Leger) |
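The window and gather pseudocode in the VK_QCOM_image_processing2 text above can be exercised on the CPU with a small C sketch. Everything here is an assumption made for illustration: the `image_t` and `match_t` types, the function names, clamp-to-edge addressing for out-of-range texels, and 32-bit float arithmetic (the 16-bit intermediate precision and denorm behavior noted under Issues are not modeled).

```c
#include <math.h>

/* Hypothetical single-component image: row-major floats. */
typedef struct {
    const float *texels;
    unsigned width;
    unsigned height;
} image_t;

/* Result of a window search: the selected error and its window offset. */
typedef struct {
    float error;
    unsigned x;
    unsigned y;
} match_t;

/* Texel fetch with clamp-to-edge addressing (an assumption of this sketch). */
static float texel(const image_t *img, unsigned x, unsigned y) {
    if (x >= img->width)  x = img->width - 1;
    if (y >= img->height) y = img->height - 1;
    return img->texels[y * img->width + x];
}

/* Sum of absolute differences between two blocks of size bw x bh. */
float block_match_sad(const image_t *target, unsigned tx, unsigned ty,
                      const image_t *ref, unsigned rx, unsigned ry,
                      unsigned bw, unsigned bh) {
    float sad = 0.0f;
    for (unsigned by = 0; by < bh; by++) {
        for (unsigned bx = 0; bx < bw; bx++) {
            sad += fabsf(texel(target, tx + bx, ty + by) -
                         texel(ref, rx + bx, ry + by));
        }
    }
    return sad;
}

/* Evaluate the SAD at every offset in a win_w x win_h window and keep the
 * minimum (use_max == 0) or maximum (use_max != 0), as in the pseudocode. */
match_t block_match_window_sad(const image_t *target, unsigned tx, unsigned ty,
                               const image_t *ref, unsigned rx, unsigned ry,
                               unsigned bw, unsigned bh,
                               unsigned win_w, unsigned win_h, int use_max) {
    match_t best = { use_max ? -INFINITY : INFINITY, 0, 0 };
    for (unsigned x = 0; x < win_w; x++) {
        for (unsigned y = 0; y < win_h; y++) {
            float sad = block_match_sad(target, tx + x, ty + y,
                                        ref, rx, ry, bw, bh);
            if (use_max ? (sad > best.error) : (sad < best.error)) {
                best.error = sad;
                best.x = x;
                best.y = y;
            }
        }
    }
    return best;
}

/* SAD at four consecutive horizontal offsets, one per output component. */
void block_match_gather_sad(const image_t *target, unsigned tx, unsigned ty,
                            const image_t *ref, unsigned rx, unsigned ry,
                            unsigned bw, unsigned bh, float out[4]) {
    for (unsigned x = 0; x < 4; x++) {
        out[x] = block_match_sad(target, tx + x, ty, ref, rx, ry, bw, bh);
    }
}
```

Searching a target image for a 2x2 reference block with `use_max == 0` returns error 0 at the offset where the block appears. An SSD variant would replace the `fabsf` term with the squared difference.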