Mapped render target memory can crash replaying detected GPU writes #1863
Comments
Just double-checked that running the same app on either of the GPUs I have is totally validation-free. So I wonder if these flush errors come from RenderDoc itself?
I'm not able to reproduce this unfortunately. On the closest hardware I could get running the same mesa version I'm able to open the capture and see things working as I'd expect. One thing I noticed which seems quite strange and might have something to do with the problem is that memory which is allocated and only bound to optimal tiled images (the color/depth attachments) is mapped and has writes to it in the frame. Since these images aren't in a host-accessible format those writes seem like they're invalid and might cause problems. FWIW it looks like the errors you see are due to RenderDoc mapping coherent map writes into flushes for the sake of serialisation. Are you able to reproduce the problem on a capture without those optimal tiled images having their memory mapped?
Huh, we are by no means trying to map that memory (used for render attachments). It would make little sense :) We only map (some of the) buffer memory.
Yes indeed it stood out to me for that reason :). I can't check without being able to reproduce the capture side, but this is recorded as an application map write and I don't see any way it could be caused by RenderDoc.
I double checked and didn't find us mapping textures. What we do, however, is map any memory that is CPU-visible on allocation. This never applies to textures in practice, because we prefer non-CPU-visible memory types for them. It must have something to do with the way RenderDoc exposes different memory types to us. Could it be the case that all the memory types you are exposing are CPU-visible? We'd then be mapping it, but not doing anything with that memory on the CPU side. If you are seeing any writes to that memory, that's especially strange. Is it possible that the memory types you are exposing just happen to support both buffers and textures? We'd then end up sub-allocating all of them from the same memory type, and you'd see writes to this memory that are caused by our buffer writes, unrelated to the textures.
Is that a known issue on your side? It would be good to get it fixed in order to avoid confusing users with these validation errors.
RenderDoc doesn't have anything to do with the memory types that are exposed, that comes from the driver itself. This capture is using the Intel GPU which only exposes two memory types - both are CPU visible. It's possible RenderDoc is detecting writes to the memory because the GPU is modifying that memory. It's not possible for me to know where such a write came from, so it will be recorded either way and then replayed as a CPU write. I'll need to think about how/if this can be solved, it likely won't be easy to fix. I'd recommend in the meantime only mapping memory you intend to modify from the CPU so that memory being modified by the GPU isn't visible to the CPU while RenderDoc is capturing it.
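For illustration, the recommendation above could look roughly like the following on the application side. This is only a sketch: the `Usage` enum and `should_map` function are hypothetical names, not part of gfx-rs or RenderDoc, and the flag constants simply mirror the `VkMemoryPropertyFlagBits` values from the Vulkan specification. The idea is to map based on intended CPU use rather than on host visibility alone, which matters on hardware where every memory type is host-visible.

```rust
// Bit values mirror VkMemoryPropertyFlagBits from the Vulkan specification.
pub const DEVICE_LOCAL: u32 = 0x1;
pub const HOST_VISIBLE: u32 = 0x2;
pub const HOST_COHERENT: u32 = 0x4;

/// Hypothetical description of how the application will use an allocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Usage {
    /// Written or read by the CPU (staging, uniform uploads, readback).
    CpuAccessed,
    /// Only touched by the GPU (render targets, depth buffers, sampled textures).
    GpuOnly,
}

/// Map only when the memory type allows it AND the CPU actually needs access.
/// Host visibility alone is not a reason to map.
pub fn should_map(memory_property_flags: u32, usage: Usage) -> bool {
    let host_visible = (memory_property_flags & HOST_VISIBLE) != 0;
    host_visible && usage == Usage::CpuAccessed
}
```

With a check like this, a render target allocated from a host-visible type stays unmapped, so GPU writes to it are never seen through a CPU mapping during capture.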
Yes, the GPU is modifying the render attachment memory.
Thank you for the suggestion, filed gfx-rs/gfx-extras#12
That commit will skip flushed memory writes to memory that only has tiled images bound to it. It will not work if there's aliasing with linear images or buffers, and I haven't been able to test if this really fixes the problem since I wasn't able to reproduce it. It also comes with a performance penalty if it has to process in this way rather than memcpy'ing directly so I'd still strongly recommend not mapping memory regions you're using for GPU-only images.
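To make the rule described above concrete, here is a minimal sketch of the decision it implies. It is not RenderDoc's actual code: `BoundResource` and `can_skip_flushed_write` are hypothetical names, and treating an allocation with nothing bound as "do not skip" is an assumption made here to stay conservative.

```rust
/// Hypothetical classification of a resource bound to a memory allocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BoundResource {
    Buffer,
    LinearImage,
    OptimalTiledImage,
}

/// Returns true when a flushed write to an allocation could be skipped on
/// replay: every resource bound to it is an optimal tiled image, whose
/// contents are not host-accessible in a defined layout.
/// Any buffer or linear image (including an aliased one) disables the skip.
pub fn can_skip_flushed_write(bound: &[BoundResource]) -> bool {
    // Assumption: an allocation with no bindings is left alone, not skipped.
    !bound.is_empty()
        && bound
            .iter()
            .all(|r| *r == BoundResource::OptimalTiledImage)
}
```

The check has to look at every binding on the allocation, which hints at why this path costs more than copying the flushed range directly.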
Description
When I load a specific capture file, RenderDoc seems normal at first. I can switch between tabs, and the progress bar at the bottom constantly shows movement. But when I click on one of the events on the left, everything hangs.
Steps to reproduce
The diagnostic log complains about nonCoherentAtomSize not being respected, but I'm not seeing in the app code where it would be the case. Not sure if it's related.
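For reference, the Vulkan rule behind that message is that a flushed or invalidated range on non-coherent memory must start at a multiple of `nonCoherentAtomSize`, and its size must either be a multiple of it or reach the end of the allocation. A minimal sketch of widening a range to satisfy this follows; `align_flush_range` is a hypothetical helper, not taken from the app or from RenderDoc.

```rust
/// Widen the byte range [offset, offset + size) so that it starts on a
/// multiple of `atom` (nonCoherentAtomSize) and ends either on a multiple
/// of `atom` or at the end of the allocation, whichever comes first.
/// Returns the adjusted (offset, size).
pub fn align_flush_range(offset: u64, size: u64, atom: u64, alloc_size: u64) -> (u64, u64) {
    assert!(atom > 0, "atom size must be non-zero");
    assert!(offset + size <= alloc_size, "range must lie inside the allocation");

    // Round the start down to the previous atom boundary.
    let start = offset / atom * atom;
    // Round the end up to the next atom boundary, clamped to the allocation.
    let end = ((offset + size + atom - 1) / atom * atom).min(alloc_size);

    (start, end - start)
}
```

For example, with an atom size of 64, a write to bytes 100..150 would be flushed as offset 64, size 128.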
Environment
The log lists 2 GPUs, not sure which one RenderDoc is using: