Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve render perf for MSVC Debug configuration #550

Closed

Conversation

tophyr
Copy link
Contributor

@tophyr tophyr commented Aug 28, 2024

Pull Request Type

  • GitHub Workflow changes
  • Documentation or Wiki changes
  • Build and Dependency changes
  • Runtime changes
    • Render changes
    • Audio changes
    • Input changes
    • Network changes
    • Other changes

Description

MSVC's STL performs extensive bounds and validity checking on iterators in Debug builds. This is desirable for correctness, but causes significant slowdowns in the current renderer's implementation due to the amount of vertex copies and transformations it performs. Until we can fully convert the renderer into more of a 'static data' model where the game does not need to ship thousands of vertices to the GPU each frame, the MSVC Debug build realistically needs to skip the per-iteration checks for its vertex iterators. This is, I think, a happy compromise between full checks and just using bare data pointers: The iterators are still used and checked once each at the start of any operation, but the memory accesses are done via pointers which should be very fast.

Related Issues

#543, #545

Checklist

  • I have tested my changes locally and verified that they work as intended.
  • I have documented any new or modified functionality.
  • I have reviewed the changes to ensure they do not introduce any unnecessary complexity or duplicate code.
  • I understand that by submitting this pull request, I am agreeing to license my contributions under the project's license.

MSVC's STL performs extensive bounds and validity checking on iterators
in Debug builds. This is desirable for correctness, but causes significant
slowdowns in the current renderer's implementation due to the amount of
vertex copies and transformations it performs. Until we can fully convert
the renderer into more of a 'static data' model where the game does not
need to ship thousands of vertices to the GPU each frame, the MSVC Debug
build realistically needs to skip the per-iteration checks for its vertex
iterators. This is, I think, a happy compromise between full checks and
just using bare data pointers: The iterators are still used and checked
once each at the start of any operation, but the memory accesses are done
via pointers which should be very fast.
@tophyr tophyr mentioned this pull request Aug 28, 2024
13 tasks
@winterheart
Copy link
Collaborator

winterheart commented Aug 28, 2024

I'd like just not count Debug build on MSVC as main target of our development. Users does not supposed to use debug builds, and slow performance on Debug is only trade off over clean code, which runs fine on Release.

@tophyr
Copy link
Contributor Author

tophyr commented Aug 28, 2024

Agree ideologically, but this is very nearly identically-safe and should eliminate a very big bottleneck in that particular build, making debugging on Windows easier.

@winterheart
Copy link
Collaborator

We have 10 builds for various platforms, and there only one has performance regression - Windows Debug, so I suggest either fix it alone without affecting other builds or just threat it as "Debug is not about performance".

After all, we can just disable iterator debugs on whole project for Windows Debug, I don't see any usefulness of this feature at all for us.

@tophyr
Copy link
Contributor Author

tophyr commented Aug 29, 2024

This "kind of" does only change anything for MSVC: On both libcxx and gnustl, std::array<T>::iterator is simply T*.

I do think it is significantly valuable to have the iterator debug checks active in general. Early on in this project actually, I was converting some winmain.cpp bits to use a vector and iterator, but I had a concurrent-modification bug. I likely wouldn't have caught the regression if the program hadn't asserted due to exactly those checks. Frankly, I wish libcxx did this.

@pzychotic
Copy link
Contributor

What about CMake's RelWithDebInfo build configuration when you need that speed while still being able to debug?

@winterheart
Copy link
Collaborator

What about CMake's RelWithDebInfo build configuration when you need that speed while still being able to debug?

That would be good solution too.

@tophyr
Copy link
Contributor Author

tophyr commented Aug 30, 2024

RelWithDebInfo turns on neither the RELEASE nor the _DEBUG defines, which unfortunately means it doesn't get many of the other asserts or checks. Some are keyed on !RELEASE while many others are keyed on actually _DEBUG being true.

Normally I'm pretty strongly in agreement with the "don't alter code for special cases" line of thought, but I do at the same time think it's pretty important to keep the Debug build reasonably usable - otherwise people will never use it.

The alternatives to this change that I see are:

  1. Let Windows Debug build be near-unusably slow
  2. #ifdef this change so the conversion from iterator -> T* is only performed on Windows
  3. Turn off iterator validity checks project-wide
  4. Alter the method hierarchies to take T* instead of iterators

Option 1 is, to me, not really an option - we might as well remove the preset for it. Option 2 could be done in a semi-cleanly way using some intermediate local variables, so that significant logic lines remain the same - but this would be entirely a semantic change, because iterator already decays to T* on non-Windows platforms. Additionally, Option 2 would ignore the possibility that other platforms might implement similar functionality, making us hunt down a config-specific perf regression. Options 3 and 4 are undesirable to me because I think the iterator checks are actually extremely valuable.

What's the resistance to this one - is it just that the &* syntax is confusing without the accompanying comment?

@winterheart
Copy link
Collaborator

On RelWithDebInfo we can enable _DEBUG defines in CMakeLists.txt, it's our defined macro. That why I'm trying to not overuse build-defined logics besides ASSERTS and fatal exceptions.

Prebuilt Debug artifacts on Windows already no has any use for users - they cannot run it without MSVC itself (as pointed in #509), so Debug environment is usable only for developer. Slow performance is observed only in one of our supported environments - Windows Debug, so we should either fix our code in way that not affects other environments or fix environment itself. Using &* syntax is hackish looking attempt to fix code that aren't even broken. Other platforms should not suffer with such non-intuitive syntax.

I still suggest either to disable iterator checking "feature" for Windows Debug (we don't has any use of this with well written code and boundary checks where it's really needed) or left Windows Debug slow as is.

@tophyr
Copy link
Contributor Author

tophyr commented Sep 2, 2024

On RelWithDebInfo we can enable _DEBUG defines

This would eliminate the point of the RelWithDebugInfo build - it is meant to be a Release build, just with debug info.

The iterator checks are, imo, incredibly important. They've already caught one bug in the recent past for me and will continue to catch others.

I will try to rewrite this so that it uses clearer syntax, but I don't want to target the change specifically to Windows. (In #528 I talk about how I actually want to turn on similar checks for Mac and Linux.)

@tophyr
Copy link
Contributor Author

tophyr commented Sep 2, 2024

Very interesting about #509. Hadn't seen that one yet.

@pzychotic
Copy link
Contributor

_DEBUG is at least a MSVC/Windows specific define, we can't enable that when linking to the release VC Runtime libraries.
On the other hand RELEASE is our project specific define, NDEBUG would be the opposite for MSVC/Windows in Release build.

There is some inconsistency in the current code, regarding these defines.
#ifdef _DEBUG is used alot and most cases of RELEASE are used in #ifndef.
Which basically resolves to the same when we currently only use Release builds with RELEASE and Debug builds with _DEBUG enabled.
We could break that inconsistency with converting all the #ifdef _DEBUG to #ifndef RELEASE. Or since I'm not a big fan of negated #ifdef's, change everything to our own D3_DEBUG or something.

Btw. I can't say that I find the current Windows Debug build to be near-unusably slow, since that is the build I'm usually working with.

@tophyr
Copy link
Contributor Author

tophyr commented Sep 4, 2024

There is some inconsistency in the current code, regarding these defines. #ifdef _DEBUG is used alot and most cases of RELEASE are used in #ifndef. Which basically resolves to the same when we currently only use Release builds with RELEASE and Debug builds with _DEBUG enabled. We could break that inconsistency with converting all the #ifdef _DEBUG to #ifndef RELEASE. Or since I'm not a big fan of negated #ifdef's, change everything to our own D3_DEBUG or something.

Agree - I talk about this a bit in #528 (comment)

Btw. I can't say that I find the current Windows Debug build to be near-unusably slow, since that is the build I'm usually working with.

Maybe we just accept the CPU-perf cost of this stuff then? I believe enough in the importance of the iterator checks that I'm unwilling to remove them. @Lgt2x did mention in the original report that his test machine was pretty low-end - as long as the game runs performantly-enough on "most" systems that are going to be running Debug builds, I'm ok with leaving this as-is.

@tophyr tophyr closed this Dec 27, 2024
@tophyr tophyr deleted the feature/msvc-debug-render-perf branch December 27, 2024 06:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants