[FXC] saturate() on vectors is broken in FXC debug mode, and mojoshader doesn't work around it #10
I think MOD_SATURATE is the name you want to look up in the source. It’s some weird miscellaneous value modifier and it’s most likely only covering certain types of values. |
The MOD_SATURATE implementation is correct; it seems like clamp() isn't being generated for the shader at all, so I'm going to try to figure out why. |
Problem with generated code became evident once I hand-inlined constants:
ps_r1 = texture2D(ps_s0, ps_r1.xy);
ps_r2 = ps_r1 + vec4(0);
ps_r1.x = ((ps_r2.x >= 0.0) ? ps_r1.x : 0);
ps_r1.y = ((ps_r2.y >= 0.0) ? ps_r1.y : 0);
ps_r1.z = ((ps_r2.z >= 0.0) ? ps_r1.z : 0);
ps_r1.w = ((ps_r2.w >= 0.0) ? ps_r1.w : 0);
ps_r2 = ps_r1 + vec4(-1.0);
ps_r1.x = ((ps_r2.z >= 0.0) ? 1.0 : ps_r1.z);
ps_r1.y = ((ps_r2.x >= 0.0) ? 1.0 : ps_r1.x);
ps_r1.z = ((ps_r2.y >= 0.0) ? 1.0 : ps_r1.y);
ps_r1.w = ((ps_r2.w >= 0.0) ? 1.0 : ps_r1.w);
This problem could be hidden in the D3D9 bytecode in a way that doesn't show up through the D3D shader compiler; I'll have to see. |
That weird zxyw is in the D3D9 bytecode, so maybe this is an fxc bug? I don't know how it works through the D3D API though.
|
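To make the effect of that zxyw swizzle concrete, here is a rough Python model of the hand-inlined statements quoted above. It is only a sketch, not MojoShader code: the function names are made up for illustration, and it assumes two possible readings of the upper-bound step. In one, the statements run top to bottom so later lines see earlier writes to ps_r1. In the other, all sources are read before any component is written, as a single vector instruction would do.

```python
def clamp01(v):
    """Reference saturate(): clamp every component to [0, 1]."""
    return [min(max(c, 0.0), 1.0) for c in v]


def quoted_sequential(v):
    """Run the quoted per-component statements top to bottom (assumed reading #1)."""
    r1 = list(v)
    r2 = list(r1)                               # ps_r2 = ps_r1 + vec4(0)
    for i in range(4):                          # lower bound, no swizzle involved
        r1[i] = r1[i] if r2[i] >= 0.0 else 0.0
    r2 = [c - 1.0 for c in r1]                  # ps_r2 = ps_r1 + vec4(-1.0)
    X, Y, Z, W = 0, 1, 2, 3
    r1[X] = 1.0 if r2[Z] >= 0.0 else r1[Z]
    r1[Y] = 1.0 if r2[X] >= 0.0 else r1[X]      # reads the x that was just overwritten
    r1[Z] = 1.0 if r2[Y] >= 0.0 else r1[Y]      # reads the y that was just overwritten
    r1[W] = 1.0 if r2[W] >= 0.0 else r1[W]
    return r1


def quoted_simultaneous(v):
    """Same two steps, but sources are read before any write (assumed reading #2)."""
    r1 = [c if c >= 0.0 else 0.0 for c in v]    # lower bound
    r2 = [c - 1.0 for c in r1]
    src = [2, 0, 1, 3]                          # the .zxyw source swizzle
    return [1.0 if r2[s] >= 0.0 else r1[s] for s in src]


sample = [0.2, 0.5, 0.8, 1.5]
print("clamp       ", clamp01(sample))
print("sequential  ", quoted_sequential(sample))
print("simultaneous", quoted_simultaneous(sample))
```

For this sample input neither reading reproduces clamp(): the simultaneous one rotates the first three channels, and the sequential one additionally smears one channel across the others. Whether an earlier instruction in the real shader undoes the rotation is not visible from the snippet, so treat this as a way to reason about the quoted lines only.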
Basic test case seems fine with FXC 9.29.952.3111. Source:
ASM:
glsl120:
|
Yeah it only happens at some point in complexity for the shaders. It does show up in the fxc output at that point, which is nice - mojoshader isn't constructing it from thin air. It might be something specific to that swizzle, because it looks like it was added in a later pixel shader profile than the rest of the swizzles. I haven't figured out what causes it to get generated yet. |
Forgot to mention, this is very sensitive to optimization level like most of the other bugs I'm finding. When I messed around with fxc, the codegen for this stuff was dramatically different between /Od and /O0 before you even got to the stuff the later levels of optimization do. This problem only occurs in /Od. |
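For anyone who wants to compare the optimization levels side by side, a small Python helper along these lines might be handy. It is a sketch under assumptions: the source file name, the `ps_3_0` profile and the `main` entry point are placeholders rather than values from this issue, and the helper names are invented. The fxc switches it uses are `/T` (profile), `/E` (entry point), `/Od` or `/O0` through `/O3`, and `/Fc` (assembly listing).

```python
import difflib

LEVELS = ("Od", "O0", "O1", "O2", "O3")


def fxc_command(source, level, profile="ps_3_0", entry="main"):
    """Build an fxc command that writes an assembly listing for one level.

    The profile and entry point defaults are placeholders, not taken from the issue.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown optimization level: {level}")
    listing = f"{source}.{level}.asm"           # hypothetical output naming scheme
    return ["fxc", "/nologo", "/T", profile, "/E", entry,
            f"/{level}", "/Fc", listing, source]


def listing_diff(a_text, b_text, a_name="Od", b_name="O3"):
    """Return unified-diff lines between two assembly listings given as strings."""
    return list(difflib.unified_diff(a_text.splitlines(), b_text.splitlines(),
                                     fromfile=a_name, tofile=b_name, lineterm=""))


# Print the commands; pass each one to subprocess.run(...) on a machine with fxc.
for level in ("Od", "O3"):
    print(" ".join(fxc_command("shader.hlsl", level)))
```

Reading the two listings back and passing their contents to `listing_diff` should show where the cmp sequence and the swizzle appear or disappear between levels.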
Same test with Od: ASM:
glsl120:
As... awful as this is, it does appear to be accurate. |
Does this help?
|
Minor corrections (bad at reducing test cases here):
// bad
value.rgb = saturate(value.rgb);
And then the suspect output from /Od:
ps_r3.x = ((ps_r3.x >= 0.0) ? ps_r1.x : ps_c3.z);
ps_r3.y = ((ps_r3.y >= 0.0) ? ps_r1.y : ps_c3.z);
ps_r3.z = ((ps_r3.z >= 0.0) ? ps_r1.z : ps_c3.z);
ps_r4.xyz = ps_r3.xyz + ps_c3.xxx;
ps_r3.x = ((ps_r4.z >= 0.0) ? ps_c3.w : ps_r3.z);
ps_r3.y = ((ps_r4.x >= 0.0) ? ps_c3.w : ps_r3.x);
ps_r3.z = ((ps_r4.y >= 0.0) ? ps_c3.w : ps_r3.y); |
And then for reference, here's that zxyw I noticed before, which was added in a later version of PS (ps_2_0, I think), generated in /Od:
Here's the /O3 output in comparison:
ps_r5.xyz = clamp(ps_r5.xyz, vec3(0.0), vec3(1.0)); |
Still looks accurate to me, the only odd thing I'm seeing is this from fxc:
Weird that it would even bother with |
This might be an upstream bug.