[Runtime Issue]: Framerate in outdoor areas drops to a half to a third of the indoor framerate, and GPU usage spikes. #560

Open
4 of 20 tasks
0xFADDAD opened this issue Sep 1, 2024 · 17 comments
Labels
bug Something isn't working

Comments

@0xFADDAD

0xFADDAD commented Sep 1, 2024

Build Version

2db85ca

Operating System Environment

  • Microsoft Windows (32-bit)
  • Microsoft Windows (64-bit)
  • Mac OS X
  • Linux (specify distribution and version below)

CPU Environment

  • x86 (32-bit Intel/AMD)
  • x86_64 (64-bit Intel/AMD)
  • ARM (32-bit)
  • ARM64 (64-bit; sometimes called AArch64)
  • Other (RISC V, PPC...)

Game Modes Affected

  • Single player
  • Anarchy
  • Hyper-Anarchy
  • Robo-Anarchy
  • Team Anarchy
  • Capture the Flag
  • Bounty
  • Entropy
  • Hoard
  • Monsterball
  • Cooperative

Game Environment

No response

Description

Looking at the skybox with no terrain in view, or returning to an indoor area, restores the framerate to normal. Possibly terrain is not being culled?

Regression Status

No response

Steps to Reproduce

Enter an outdoor area; the framerate halves and GPU usage triples.

framerate.mp4
@0xFADDAD 0xFADDAD added the bug Something isn't working label Sep 1, 2024
@tophyr
Contributor

tophyr commented Sep 2, 2024

Thanks for the report. @0xFADDAD, if you are able, would you mind re-running the test with a build from https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 ? This will help determine if this problem is caused by the recent renderer modernization or if it was introduced by something prior. Thanks!

@0xFADDAD
Author

0xFADDAD commented Sep 2, 2024

https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 actually managed to perform even worse; framerates are now down in the high teens.

@0xFADDAD
Author

0xFADDAD commented Sep 2, 2024

I'm glad I decided to try all the levels in the Bedlam set. The last level, 'Polaris', has outdoor areas, but these render mostly correctly, 5 to 10% frame cut or so. 'Plutonium' and 'Apparition' have the severe framerate cuts. I'll try a few more levels with outdoor sections to see if I can make out a pattern.

UPDATE: Dementia's 'Geodomes' is a good test for just how low the framerate can get. The terrain has a few sections of long flat surfaces in the far distance that really crater the performance.

@tophyr
Contributor

tophyr commented Sep 2, 2024

https://github.com/DescentDevelopers/Descent3/actions/runs/10566626808 actually managed to perform even worse; framerates are now down in the high teens.

Oh wow! I'm glad I asked. We can rule out the modernized renderer as a cause then, at least.

@KynikossDragonn

I've experienced the same problem running builds off git main. I haven't measured the GPU usage through intel_gpu_top on my NUC, but the CPU usage is extremely high, and according to htop the brunt of it is kernel time.

Is there a way to profile what's happening and try to narrow down what's causing the thrashing in rendering?

@Lgt2x
Member

Lgt2x commented Sep 9, 2024

I've experienced the same problem running builds off git main. I haven't measured the GPU usage through intel_gpu_top on my NUC, but the CPU usage is extremely high, and according to htop the brunt of it is kernel time.

Is there a way to profile what's happening and try to narrow down what's causing the thrashing in rendering?

We'll need more precise CPU profiling to identify and mitigate bottlenecks. I recommend the perf tool set (starting with perf record) on Linux to get precise CPU sampling. Its output can be processed with other utilities to find the biggest time consumers.

@pzychotic
Contributor

I took a quick look on Windows at the beginning of Retribution Level 15.
We spend 66% of CPU time in the graphics driver (Intel integrated graphics), 27% in the Windows kernel and just shy of 5% in our own code.
Screenshot 2024-09-10 204113

Depending on where I look, I get between 75 and 15 FPS. This corresponds to about 1000 to 5000 draw calls, and the frame time scales pretty much linearly with the draw call count.
The scary part is that on average we only render 2 triangles per draw call. That is far too few given the overhead each draw call carries (state changes, etc.).
Ideally we would want to batch as much geometry as possible with the same state into a single draw call, which might be a challenge with the current architecture.
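
To make the batching idea concrete, here is a minimal sketch in Python (not engine code). It assumes each polygon carries a render-state key made of a texture name and a blend mode; the `Poly` type, its fields, and the sample polygon mix are all hypothetical. It also checks the figures quoted above: if the whole frame time is attributed to draw calls (a simplification), both ends of the measured range work out to the same cost per call.

```python
from collections import defaultdict
from typing import NamedTuple


class Poly(NamedTuple):
    """Hypothetical polygon record: render state plus triangle count."""
    texture: str
    blend: str
    triangles: int


def microseconds_per_draw_call(fps: float, draw_calls: int) -> float:
    """Frame time divided by draw calls, assuming draw calls dominate the frame."""
    return 1_000_000.0 / (fps * draw_calls)


def draw_calls_unbatched(polys: list) -> int:
    """One draw call per polygon, as in a per-polygon renderer."""
    return len(polys)


def batch_by_state(polys: list) -> dict:
    """Group polygons by render state; each group becomes one draw call.

    Returns a mapping of (texture, blend) -> total triangles in that batch.
    """
    batches = defaultdict(int)
    for p in polys:
        batches[(p.texture, p.blend)] += p.triangles
    return dict(batches)


# Made-up scene: 1000 two-triangle polygons sharing three render states.
polys = (
    [Poly("rock", "opaque", 2)] * 600
    + [Poly("grass", "opaque", 2)] * 300
    + [Poly("water", "alpha", 2)] * 100
)
batches = batch_by_state(polys)

print("unbatched draw calls:", draw_calls_unbatched(polys))
print("batched draw calls:  ", len(batches))
print("us per call at 75 FPS / 1000 calls:", microseconds_per_draw_call(75, 1000))
print("us per call at 15 FPS / 5000 calls:", microseconds_per_draw_call(15, 5000))
```

Grouping purely by state ignores draw order, so alpha-blended geometry would probably still need back-to-front sorting within or across batches; the sketch only shows how the draw call count could shrink.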

@Lgt2x
Member

Lgt2x commented Sep 13, 2024

Very interesting. Indeed, we need to optimize draw calls. @InsanityBringer any tips for that?

@winterheart
Collaborator

@pzychotic could you please run the same benchmark on revision 3cb1e89 (before the render changes)?

@tophyr
Contributor

tophyr commented Sep 13, 2024

The scary part is that on average we only render 2 triangles per draw call.

This, in particular, is unsurprising: the D3 renderer is set up in terms of drawing polygons (usually quads), not objects, so if it were to draw a cube, for example, it would perform six g3_DrawPoly calls, one for each side of the cube. We need to transform the renderer so that it thinks primarily about drawing objects, but doing this transformation will require "lifting" the draw operation up to each callsite of g3_DrawPoly, of which there are about 65. Not prohibitive, but not a light job either.

@pzychotic
Contributor

could you please run the same benchmark on revision 3cb1e89 (before the render changes)?

Interesting changes: we spend a lot more time in our own code and not much in the Windows kernel, while the graphics driver's share is a bit lower.
Screenshot 2024-09-13 202854

@InsanityBringer
Contributor

I don't really have a good solution for the legacy renderer. Terrain is the worst because it adds up to the worst of everything: expensive objects, expensive rooms, and the terrain triangles themselves, all in a very open environment that is not conducive to culling. During my attempts to improve the legacy renderer in Piccu, though, I found that actually drawing the terrain itself is probably the smallest cause of lag (though the vastly increased limits of the terrain renderer in 1.5 aren't helping in the slightest).

To some degree, pursuing things like stripification of polygons could lead to some gains, but I feel at that point, you're better off pursuing a meshing solution using newer (even OpenGL 2 era) features like GPU-side vertex buffers.
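
As a rough illustration of what stripification buys, the Python sketch below builds the index list for one row of grid quads twice: once as independent triangles and once as a single triangle strip. The vertex layout (a top row followed by a bottom row) and all function names are assumptions made for the example, not anything taken from the D3 source. The decoder follows the usual triangle-strip rule of swapping the first two vertices of every odd triangle so the winding stays consistent.

```python
def quad_row_as_triangles(n_quads: int) -> list:
    """Indices for a row of quads drawn as independent triangles (6 per quad)."""
    cols = n_quads + 1  # vertices per row; top row is 0..cols-1, bottom row follows
    indices = []
    for i in range(n_quads):
        top_l, top_r = i, i + 1
        bot_l, bot_r = i + cols, i + 1 + cols
        indices += [top_l, bot_l, top_r, top_r, bot_l, bot_r]
    return indices


def quad_row_as_strip(n_quads: int) -> list:
    """Indices for the same row as one triangle strip (2 per column)."""
    cols = n_quads + 1
    indices = []
    for i in range(cols):
        indices += [i, i + cols]  # alternate top and bottom vertices
    return indices


def strip_to_triangles(strip: list) -> list:
    """Expand a strip into triangles, flipping odd ones to keep the winding."""
    tris = []
    for k in range(len(strip) - 2):
        a, b, c = strip[k], strip[k + 1], strip[k + 2]
        if k % 2 == 1:
            a, b = b, a
        tris.append((a, b, c))
    return tris


n = 4
as_triangles = quad_row_as_triangles(n)
as_strip = quad_row_as_strip(n)
print("indices as triangles:", len(as_triangles))
print("indices as strip:    ", len(as_strip))
```

For a row of n quads the strip needs 2n + 2 indices instead of 6n. That reduces the data submitted per row but does not by itself reduce the number of draw calls, which fits the suggestion above that a mesh held in GPU-side vertex buffers is likely the bigger win.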

@KynikossDragonn

you're better off pursuing a meshing solution using newer (even OpenGL 2 era) features like GPU-side vertex buffers.

I actually agree with this because even the VBO implementation in UA_source really sped up rendering there, and that's already a low poly game.

@0xFADDAD
Author

Kind of late, and might already be obvious to some, but I forgot the Fusion engine was two engines in one. So I went looking for the game's post-mortem and found an interesting excerpt from Jason Leighton, one of the programmers.

"The terrain engine actually began as a prototype for another game that Jason was interested in developing. Unfortunately, Bungie’s Myth beat us to the idea, but the terrain technology was solid enough to be incorporated into Descent 3. It was based on a great paper by Peter Lindstrom and colleagues entitled Real-Time, Continuous Level of Detail Rendering of Height Fields (from Siggraph 1996 Computer Graphics Proceedings, Addison Wesley, 1996). Of course, it was bastardized heavily to fit the needs of Descent 3, but the overall concept was the same — create more polygonal detail as you get closer to the ground and take away polygons when you are farther away. After implementing the real-time LOD technology, our frame rates quadrupled."

Perhaps the LOD scaling is broken or non-functional after many of the limits had been expanded? Might be worth investigating.
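
For readers unfamiliar with the concept, the following Python sketch shows distance-based level of detail in its simplest form. It is deliberately not the Lindstrom et al. algorithm and not Descent 3's implementation: the rule (drop one detail level each time the distance doubles past a base distance), the patch size, and every name and number in it are assumptions chosen only to show why a working LOD step cuts the triangle count so sharply.

```python
import math


def lod_level(distance: float, base_distance: float = 64.0, max_level: int = 4) -> int:
    """Level 0 is full detail; each doubling of distance past base adds a level."""
    ratio = max(distance, base_distance) / base_distance  # never below 1.0
    return min(max_level, int(math.floor(math.log2(ratio))))


def triangles_for_patch(patch_size: int, level: int) -> int:
    """Triangles in a square patch when each level halves the cells per side."""
    cells = max(1, patch_size >> level)
    return 2 * cells * cells


# Hypothetical patches at increasing distance from the viewer.
distances = [32.0, 100.0, 200.0, 500.0, 2000.0]
patch_size = 16

full_detail = sum(triangles_for_patch(patch_size, 0) for _ in distances)
with_lod = sum(triangles_for_patch(patch_size, lod_level(d)) for d in distances)
print("triangles at full detail:", full_detail)
print("triangles with LOD:      ", with_lod)
```

If a mechanism of this kind stopped reducing detail at range, for instance because a raised limit lets distant patches stay at a fine level, the far terrain would dominate the triangle and draw call counts, which would be consistent with the distant flat surfaces in 'Geodomes' hurting the most. That remains a hypothesis to check against the actual terrain code.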

@KynikossDragonn

Perhaps the LOD scaling is broken or non-functional after many of the limits had been expanded? Might be worth investigating.

Well, the LOD scaling is definitely doing something in release 1.5, but it's still a lot of draw calls...

Try setting the "Terrain Detail" slider all the way to the lowest setting. Though I don't recall off the top of my head how 1.4 behaved.

@0xFADDAD
Author

Took your advice and tried ticking down the slider. With 28 being the max, 27 gave some improvement, though not much, but 26 seems to be a huge improvement.
It's a solution, but the common reasoning would be, "this is a 25-year-old game, it 'should' run completely maxed out". If we're increasing the max polycount beyond what the engine is capable of putting out, though, it might just be best to leave the limits where they were.

@KynikossDragonn

The path to rendering optimization is probably going to involve gutting the engine down the middle; as stated above:

the D3 renderer is set up in terms of drawing polygons (usually quads), not objects, so if it were to draw a cube, for example, it would perform six g3_DrawPoly calls, one for each side of the cube. We need to transform the renderer so that it thinks primarily about drawing objects

With modern OpenGL and Vulkan, a lot of work is carried out on the GPU rather than the CPU too, so we need less CPU-bound rendering code. It's not going to be a very easy task, I imagine...

I wouldn't have a clue how one would, for example, have the GPU generate the procedural textures in hardware instead of generating them on the CPU and uploading a new texture every frame.

8 participants
@winterheart @InsanityBringer @pzychotic @tophyr @KynikossDragonn @Lgt2x @0xFADDAD and others