SV_Physics(): Performance boost for SV_PushMove() when there are a lot of pushers and pushed entities. #731

Merged
merged 1 commit into from
Sep 30, 2024

Conversation

vsonnier
Collaborator

@vsonnier vsonnier commented Sep 6, 2024

On my machine :

  • Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz + Nvidia GTX 1050Ti 4GB = a mid-range gaming laptop from... 2017.

This significantly boosts performance when there are a ton of edicts and a ton of MOVETYPE_PUSH pushers.

Found with The Immortal Lock
at this place in particular for comparison : s17_immortal_lock.zip

The change consists of pre-computing the list of 'pushable' entities once per SV_Physics(); SV_PushMove() then iterates over that list.

By comparison, the original code iterates over all qcvm->num_edicts on every SV_PushMove() call.

It turns out this simple change has a large impact on FPS in my case, either just by shortening the iteration or perhaps through better prefetching of entities. Otherwise SV_PushMove() completely bottlenecks the rendering part, especially with host_maxfps 0.
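To make the cost difference concrete, here is a minimal sketch in C. It is not the code from this PR: every type and function name (toy_edict_t, toy_build_pushable_list, toy_candidate_visits) is hypothetical, the movetype constants are illustrative values, and the "pushable" filter is assumed to resemble the classic Quake one (skip free slots and entities whose movetype is PUSH, NONE or NOCLIP). It only counts how many candidate entities the pushers visit in one frame under each strategy.

```c
#include <assert.h>

/* Illustrative constants; assumed, not copied from the engine headers. */
enum {
    TOY_MOVETYPE_NONE = 0,
    TOY_MOVETYPE_STEP = 4,
    TOY_MOVETYPE_PUSH = 7,
    TOY_MOVETYPE_NOCLIP = 8
};

/* Hypothetical, stripped-down stand-in for an edict. */
typedef struct {
    int free;     /* non-zero when the slot is unused */
    int movetype;
} toy_edict_t;

#define TOY_MAX_EDICTS 64

/* Assumed filter: an entity can be pushed if it is in use and is not
   itself a pusher, a static entity, or a noclip entity. */
static int toy_is_pushable(const toy_edict_t *e)
{
    return !e->free
        && e->movetype != TOY_MOVETYPE_PUSH
        && e->movetype != TOY_MOVETYPE_NONE
        && e->movetype != TOY_MOVETYPE_NOCLIP;
}

/* Built once per physics frame; returns the number of indices written.
   Slot 0 is skipped, as it would hold the world. */
static int toy_build_pushable_list(const toy_edict_t *edicts, int num_edicts,
                                   int *out_indices)
{
    int count = 0;
    for (int i = 1; i < num_edicts; i++)
        if (toy_is_pushable(&edicts[i]))
            out_indices[count++] = i;
    return count;
}

/* Counts candidate visits for one frame: each pusher either walks the
   cached list or walks every slot except the world. */
static long toy_candidate_visits(const toy_edict_t *edicts, int num_edicts,
                                 int use_cached_list)
{
    int pushable[TOY_MAX_EDICTS];
    int num_pushable;
    long visits = 0;

    assert(num_edicts >= 1 && num_edicts <= TOY_MAX_EDICTS);
    num_pushable = toy_build_pushable_list(edicts, num_edicts, pushable);

    for (int i = 1; i < num_edicts; i++) {
        if (edicts[i].free || edicts[i].movetype != TOY_MOVETYPE_PUSH)
            continue; /* not a pusher */
        visits += use_cached_list ? num_pushable : (num_edicts - 1);
    }
    return visits;
}
```

With P pushers, N edicts and K pushable entities, the full scan visits roughly P × N candidates per frame while the cached list visits P × K plus one N-sized pass to build it, which is why the gain only shows up on maps with many pushers and many edicts.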

Main settings :

  • vsync off
  • Antialiasing : Multisample a.k.a Edge-only 4x
  • Anisotropic 16x
  • Resolution : 2560 x 1440
  • r_novis 0 : This is the default, but it does have a very positive impact here. I never played a level before this one where r_novis 0/1 made the slightest difference. Not this time, though.
  • Display : 2560 x 1440 @60Hz

This shows the min-max FPS you can get depending on where you look in the scene:

| | vkQuake sv_fastpushmove 0 (default) | vkQuake sv_fastpushmove 1 | Ironwail cb1ebef |
|---|---|---|---|
| host_maxfps 58 | 52-58 | 58-58 | 58-58 |
| host_maxfps 0 | 37-94 | 63-155 | 100-110 |

Another example by @j4reporting:

Machine: Intel nuc11phki7 ( Intel i7-1165g7 @2.80Ghz + Nvidia 2600 RTX mobile )
Resolution : 2560 x 1440
Display: 2560x1440 @144hz
Main setting : host_maxfps 0

| vkQuake sv_fastpushmove 0 (default) | vkQuake sv_fastpushmove 1 | Ironwail (recent master) |
|---|---|---|
| 70-400 | 100-650 | 670-800 |

Now, this change is not a benign one... It assumes the list of MOVETYPE_PUSH pushers and pushable entities is frozen within a single frame. That is, if PR_ExecuteProgram has side effects and changes a movetype, or even allocates or frees edicts, the change is only taken into account on the next frame.
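The caveat can be illustrated with a small, self-contained C sketch. It is a toy model, not the PR's code: snap_edict_t, snap_build and snap_contains are hypothetical names, and the direct write to movetype merely stands in for a side effect of PR_ExecuteProgram. It shows that a list built at the start of the frame keeps referring to an entity whose movetype changed mid-frame until the list is rebuilt.

```c
#include <assert.h>

/* Illustrative constants; assumed, not copied from the engine headers. */
enum { SNAP_MOVETYPE_NONE = 0, SNAP_MOVETYPE_STEP = 4 };

/* Hypothetical, stripped-down stand-in for an edict. */
typedef struct {
    int movetype;
} snap_edict_t;

#define SNAP_MAX 8

/* Snapshot of entity indices taken once per frame. */
typedef struct {
    int indices[SNAP_MAX];
    int count;
} snap_list_t;

/* Rebuilds the snapshot from the current entity state. For brevity the
   filter only excludes SNAP_MOVETYPE_NONE. */
static void snap_build(const snap_edict_t *edicts, int n, snap_list_t *list)
{
    assert(n >= 0 && n <= SNAP_MAX);
    list->count = 0;
    for (int i = 0; i < n; i++)
        if (edicts[i].movetype != SNAP_MOVETYPE_NONE)
            list->indices[list->count++] = i;
}

/* Returns 1 if the snapshot still lists the given entity index. */
static int snap_contains(const snap_list_t *list, int index)
{
    for (int i = 0; i < list->count; i++)
        if (list->indices[i] == index)
            return 1;
    return 0;
}
```

Usage, in words: build the snapshot, set edicts[1].movetype to SNAP_MOVETYPE_NONE, and snap_contains(&list, 1) still returns 1; only after calling snap_build again does it return 0. A consumer of such a list could re-check each entry before using it, at some cost; whether that is worthwhile is a design decision this sketch does not settle.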

I'm opening this to share ideas, opinions, and (God forbid) bug reports.

This optimization, if valid, could also be applied to @sezero's QS and @andrei-drexler's Ironwail.

@vsonnier vsonnier added this to the 1.31.2 milestone Sep 6, 2024
@vsonnier vsonnier force-pushed the vso_physics_perf branch 3 times, most recently from e03f216 to d8d896e Compare September 9, 2024 04:56
@vsonnier
Collaborator Author

vsonnier commented Sep 9, 2024

I've found a good test case for triggering potential problems, with lots of dynamically spawned items: the Rotting Jam map rotj_nickster

@vsonnier vsonnier force-pushed the vso_physics_perf branch 2 times, most recently from 9141dbf to 3430a40 Compare September 10, 2024 19:15
@j4reporting
Contributor

Not sure if this is really related to that branch. They are stuck in the air, but still alive.
rotj_nickster_s14.zip
vkquake0001

@vsonnier vsonnier force-pushed the vso_physics_perf branch 2 times, most recently from 7c94e87 to a596fb3 Compare September 19, 2024 04:41
…erformance when level has a ton of edicts to process

(again for The Immortal Lock, our new favorite benchmark)
- Added sv_fastpushmove CVAR (default = 0, archived) to control optimized SV_PushMove processing
@vsonnier
Collaborator Author

vsonnier commented Sep 27, 2024

I'll have to complete this issue with benchmark data, and I want to slip this into 1.31.2 inactive (0) by default, without even mentioning it in the release notes.

I'll mention it only on The Immortal Lock Slipseer thread for those who want to see some boost. This monster level is so far the only case where sv_fastpushmove 1 makes a difference, so there is no need to make this setting public (yet).

@vsonnier vsonnier merged commit af38796 into master Sep 30, 2024
22 checks passed
@vsonnier
Collaborator Author

Alea jacta est.

@bananakid

bananakid commented Oct 18, 2024

This fix has significantly improved the performance of The Immortal map on my system (Intel Core i9-9880H @ 2.30 GHz CPU): I can now get 60 fps most of the time. Here's a link to the discussion regarding r_novis. It looks like there's no point in setting it to 1. Thank you!

@vsonnier vsonnier deleted the vso_physics_perf branch October 25, 2024 16:59
vsonnier added a commit that referenced this pull request Feb 19, 2025
After various testing left and right, didn't see any drawbacks, so activate it by default
Initially added in #731