[SDL3] [GPU] Allow using SDL_Renderer with custom GPU device #11829

Draft · wants to merge 6 commits into main
Conversation

@MrOnlineCoder commented on Jan 3, 2025

This PR will probably be discarded as I am not experienced enough in contributing to C projects, nor in GPU or API design theory, but I tried anyway.

Description

As was discussed on Discord, if you are using SDL GPU, you currently have to write a custom sprite-batch renderer to perform accelerated 2D drawing, as there is no direct way of integrating SDL_Renderer with an existing GPU device.

This PR attempts to fix that, allowing all of the amazing SDL_Renderer features to be used on an existing GPU device context.

Renderer creation

A new property, SDL_PROP_RENDERER_CREATE_USER_GPU_DEVICE_POINTER, can be used to pass an SDL_GPUDevice* during renderer creation. This only has an effect on renderers with the "gpu" driver, so it is probably a good idea to also explicitly select that driver with a property. A window must also be passed at this point to comply with the existing generic code in SDL_render.c:

SDL_GPUDevice* gpu_device = ...;
SDL_Window* window = ...;

SDL_PropertiesID props = SDL_CreateProperties();

SDL_SetStringProperty(props, SDL_PROP_RENDERER_CREATE_NAME_STRING, "gpu");
SDL_SetPointerProperty(props, SDL_PROP_RENDERER_CREATE_WINDOW_POINTER, window);
SDL_SetPointerProperty(props, SDL_PROP_RENDERER_CREATE_USER_GPU_DEVICE_POINTER, gpu_device);

SDL_Renderer* renderer = SDL_CreateRendererWithProperties(props);

This creates a GPU renderer in a so-called user-managed mode, reflected in code via the GPU_RenderData.user_managed_device boolean; "user" in this context means the caller of the SDL API. Using SDL_Renderer in this mode requires additional calls for correct rendering, and the following resources are no longer managed by SDL_Renderer:

  • The GPU device itself - its creation and destruction must be handled by the user. The renderer also does not claim/unclaim the window for the device automatically, assuming that the window will already be claimed for the GPU device by the time the renderer is used.
  • The GPU command buffer - the user must provide a valid command buffer for both rendering and resource uploads (usually textures) - see below.
  • The GPU swapchain is no longer acquired by SDL_RenderPresent (GPU_RenderPresent). Instead, a new function, SDL_RenderPresentToGPUTexture, presents the contents of the renderer backbuffer to a texture, which can be a swapchain texture. A per-frame sketch follows this list.
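To show how these pieces fit together, here is a minimal per-frame sketch (my illustration, not code from this PR; error handling omitted, and it assumes the window is already claimed for gpu_device and the renderer was created as shown above):

SDL_GPUCommandBuffer *cmdbuf = SDL_AcquireGPUCommandBuffer(gpu_device);
SDL_SetRenderGPUCommandBuffer(renderer, cmdbuf);

/* Queue 2D draw calls and/or custom GPU passes here */
SDL_SetRenderDrawColor(renderer, 0, 0, 0, 0);
SDL_RenderClear(renderer);

/* Present to the swapchain texture instead of calling SDL_RenderPresent */
SDL_GPUTexture *swapchain_texture = NULL;
SDL_WaitAndAcquireGPUSwapchainTexture(cmdbuf, window, &swapchain_texture, NULL, NULL);
if (swapchain_texture) {
    SDL_RenderPresentToGPUTexture(renderer, swapchain_texture,
                                  SDL_GetGPUSwapchainTextureFormat(gpu_device, window));
}

/* The renderer never submits the command buffer; the caller must */
SDL_SubmitGPUCommandBuffer(cmdbuf);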

Command buffers and render commands

As a GPU renderer with a user-managed device does not create command buffers automatically, the user must provide one for a correct workflow. Assuming the user is already doing "low-level" GPU work at this point, this gives some freedom to decide when a command buffer should be created and which one should be used - for example, it can be the same buffer used to draw non-2D/custom geometry.

This is done via the following function:

bool SDL_SetRenderGPUCommandBuffer(SDL_Renderer *renderer, SDL_GPUCommandBuffer *command_buffer);

where renderer must be a GPU-backed renderer configured in user-managed mode.
Internally, it directly changes the renderer's data->state.command_buffer.
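Based on that description, the driver-side hook is presumably just a setter; a minimal sketch of what it might look like (the actual change in SDL_render_gpu.c may differ):

void GPU_SetCommandBuffer(SDL_Renderer *renderer, SDL_GPUCommandBuffer *command_buffer)
{
    GPU_RenderData *data = (GPU_RenderData *)renderer->internal;

    /* All subsequent render and upload commands are recorded into this buffer */
    data->state.command_buffer = command_buffer;
}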

As a command buffer is also needed for the copy passes that upload textures, a valid command buffer must be provided at that point too.

Example (error handling omitted):

SDL_Surface *bmp_surface = SDL_LoadBMP("icon.bmp");

/* A command buffer must be set before texture creation, as the upload copy pass needs it */
SDL_GPUCommandBuffer *cmdbuf = SDL_AcquireGPUCommandBuffer(gpu_device);
SDL_SetRenderGPUCommandBuffer(renderer, cmdbuf);

SDL_Texture* texture = SDL_CreateTextureFromSurface(renderer, bmp_surface);

/* The renderer does not submit the command buffer automatically */
SDL_SubmitGPUCommandBuffer(cmdbuf);

SDL_DestroySurface(bmp_surface);

Naturally, a command buffer must also be set when performing the actual draw calls:

SDL_SetRenderGPUCommandBuffer(renderer, cmdbuf);

/* Clear with a zero-alpha color so unfilled pixels stay transparent on presentation (see "Behaviour difference" below) */
SDL_SetRenderDrawColor(renderer, 0, 0, 0, 0);
SDL_RenderClear(renderer);

SDL_FRect rect = { 32, 32, 64, 64 };
SDL_SetRenderDrawColor(renderer, 255, 0, 0, 255);
SDL_RenderFillRect(renderer, &rect);

SDL_FRect tex_rect = { 150, 150, 32, 32 };
SDL_RenderTexture(renderer, texture, NULL, &tex_rect);

Finally, a new function, SDL_RenderPresentToGPUTexture, can be used to present the result to an SDL_GPUTexture:

bool SDL_RenderPresentToGPUTexture(SDL_Renderer *renderer, SDL_GPUTexture *target, SDL_GPUTextureFormat format);

where target can also be a swapchain texture acquired by higher-level code.
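Since format must match the target texture, for a swapchain target it can be queried from the device; a small usage sketch (gpu_device, window, and the acquired swapchain texture as in the frame sketch above):

SDL_GPUTextureFormat format = SDL_GetGPUSwapchainTextureFormat(gpu_device, window);
SDL_RenderPresentToGPUTexture(renderer, swapchain_texture, format);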

Behaviour difference

In the current implementation, calling SDL_RenderPresent on a user-managed GPU renderer has no visual effect aside from flushing queued render commands.
Instead, SDL_RenderPresentToGPUTexture must be called to perform the actual presentation.
Another difference is that SDL_RenderPresentToGPUTexture uses a fully-featured render pass to draw the backbuffer texture onto the target/swapchain texture, instead of the simple blit that GPU_RenderPresent performed. This is because I wanted to preserve the already-rendered contents of the target texture and therefore needed blending to be available, which, at least from my research, did not seem possible with a simple blit.
To do that, it creates a separate vertex buffer at GPU_RenderData.vertices.present_quad_buffer, which simply contains a textured quad filling the entire screen (render target). During the render pass, this quad is drawn with the VERT_SHADER_TRI_TEXTURE and FRAG_SHADER_TEXTURE_RGBA shaders, with the backbuffer texture bound as the sampler.
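For illustration, the present quad presumably amounts to something like the following (my assumption of the vertex layout - NDC positions covering the whole target plus texture coordinates; the real format is whatever VERT_SHADER_TRI_TEXTURE expects):

/* Hypothetical fullscreen quad as two triangles: x, y, u, v per vertex */
static const float present_quad_vertices[] = {
    -1.0f, -1.0f, 0.0f, 1.0f,
     1.0f, -1.0f, 1.0f, 1.0f,
    -1.0f,  1.0f, 0.0f, 0.0f,
    -1.0f,  1.0f, 0.0f, 0.0f,
     1.0f, -1.0f, 1.0f, 1.0f,
     1.0f,  1.0f, 1.0f, 0.0f,
};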
This also imposes another requirement: if you want to preserve already-rendered contents when calling SDL_RenderPresentToGPUTexture, make sure SDL_RenderClear clears with a zero-alpha color (like (0, 0, 0, 0)), so that every unfilled pixel shows as transparent on presentation.

Tests

  • It compiles on macOS with clang-1500.3.9.4, along with the examples and tests :D
  • I have created testgpu_with_sdlrender.c, based on testgpu_clear.c, which adds SDL_Renderer usage: it draws a red rectangle and an icon.bmp texture after performing a simple no-draw GPU render pass. The texture moves on key press to check that clearing works correctly. The result is attached as a screenshot (the background is cleared by the custom render pass; the rectangles are drawn by SDL_Renderer):
[Screenshot taken 2025-01-03 at 01:25:30]

Issues and open questions

  • I am REALLY sorry, but for some reason my editor (VSCode) applied a lot of formatting to files on save, apparently without considering .editorconfig, and I wasn't able to undo it easily, as I only noticed it in the first large diff... I tried running clang-format on the edited files, but it didn't help. This makes reading the PR problematic, so I left GitHub comments at the places where I actually made real changes. I would appreciate any advice on fixing that. :(
  • The same kind of thing applies to the dynapi definitions - I manually deleted some functions that were the result of my failed initial attempts. I wasn't able to find a README about the dynapi workflow, so I just ran gendynapi.py a few times.
  • My GPU knowledge is very basic and limited (it's like unexplored COSMOSnaut for me :D), so some of the sub-tasks probably have a better solution than the one I used.
  • For example, maybe it's possible to leave command buffer management as the responsibility of the renderer itself, but I wasn't sure how that would work when both the renderer and user code create a command buffer and/or both start a render or copy pass.
  • IMO, the API currently becomes really "fragile" and implicit when you provide a custom GPU device pointer to the renderer. The command buffer part bothers me the most: it's really easy for the caller to forget to provide one, and in general it is probably harder to debug/monitor the execution flow in case of an error.
  • In general, this puts a lot of burden on the user side, requiring them to keep track of the device, the command buffer, and the alpha of the clear color (depending on usage). But assuming they are already using the lower-level GPU API, they can probably deal with it? Although documenting it will probably require a separate wiki page O_o
  • SDL_render_gpu.c now contains a bunch of if (data->user_managed_device) checks scattered through the file to alter resource management. I am not sure if that's a code smell.
  • I am currently unable to test on other platforms, so the code has not been checked on Windows/Linux/etc.
  • Performance also hasn't been benchmarked or estimated. The test example runs at around ~1000 frames per second on my MacBook M1 Pro, but that's about it.

Again, I would appreciate any comments, criticism, or recommendations on the issues above and on the PR in general, including whether such a feature is worth considering as part of SDL at all - otherwise I will just keep it as a fork for my personal use :D

Thanks in advance.

@@ -279,6 +279,13 @@ extern SDL_DECLSPEC SDL_Renderer * SDLCALL SDL_CreateRenderer(SDL_Window *window
* - `SDL_PROP_RENDERER_CREATE_VULKAN_PRESENT_QUEUE_FAMILY_INDEX_NUMBER`: the
* queue family index used for presentation.
*
* With the gpu renderer:
*
* - `SDL_PROP_RENDERER_CREATE_USER_GPU_DEVICE_POINTER`: user-provided SDL_GPUDevice pointer.
Author comment:
This property determines the mode of SDL_Renderer on creation

@@ -366,6 +366,10 @@ extern void *SDL_AllocateRenderVertices(SDL_Renderer *renderer, size_t numbytes,
// Let the video subsystem destroy a renderer without making its pointer invalid.
extern void SDL_DestroyRendererWithoutFreeing(SDL_Renderer *renderer);

// We must expose this function for manual control over renderer command buffer
Author comment:
I didn't find a better way to expose some driver-specific functions aside from declaring them in sysrender.h

@@ -61,6 +62,8 @@ typedef struct GPU_RenderData
SDL_GPUTransferBuffer *transfer_buf;
SDL_GPUBuffer *buffer;
Uint32 buffer_size;

SDL_GPUBuffer *present_quad_buffer;
Author comment:
This buffer will contain a single fullscreen quad for presenting to a GPU texture. I am not sure whether it could be stored in the general vertex buffer, as that one can be overwritten?

@@ -956,6 +1039,10 @@ static bool GPU_RenderPresent(SDL_Renderer *renderer)
{
GPU_RenderData *data = (GPU_RenderData *)renderer->internal;

if (data->user_managed_device) {
Author comment:
Do not touch the swapchain texture when the GPU device is managed by the user.

ChoosePresentMode(data->device, window, vsync, &data->swapchain.present_mode);

SDL_SetGPUSwapchainParameters(data->device, window, data->swapchain.composition, data->swapchain.present_mode);
if (!data->user_managed_device) {
Author comment:
I am not sure whether we should change vsync settings or the present mode when the device is not ours.


/* Custom GPU rendering */
renderPass = SDL_BeginGPURenderPass(cmdbuf, &color_target_info, 1, NULL);
/* Render Half-Life 3 or whatever */
Author comment:
I may need to put at least a basic triangle render here to verify that it actually works, but I presume that if the swapchain clear color is preserved after SDL_RenderPresentToGPUTexture, we are good to go?

Comment on lines +2575 to +2619
/**
* Draw all queued render commands to an SDL_GPUTexture.
*
* This function is only useful for GPU-based renderers, especially when the GPU device is manually controlled.
* If the texture is invalid or not provided, nothing will be rendered.
* The target texture may also be a swapchain texture, obtained from SDL_WaitAndAcquireGPUSwapchainTexture,
* which is the recommended way to combine GPU and SDL_Renderer usage.
*
* \param renderer the renderer.
* \param target the GPU texture to render to.
* \param format the format of the target texture.
* \returns true on success or false on failure; call SDL_GetError() for more
* information.
*
* \threadsafety This function should only be called on the main thread.
*
* \since This function is available since SDL 3.2.0.
*
* \sa SDL_RenderPresent
* \sa SDL_WaitAndAcquireGPUSwapchainTexture
*/
extern SDL_DECLSPEC bool SDLCALL SDL_RenderPresentToGPUTexture(SDL_Renderer *renderer, SDL_GPUTexture *target, SDL_GPUTextureFormat format);

/**
* Set custom command buffer for a GPU-based renderer.
*
* This function will set a custom command buffer for a GPU-based renderer.
* It only has an effect on an SDL_Renderer that was created with the SDL_PROP_RENDERER_CREATE_USER_GPU_DEVICE_POINTER property.
* It is the user's responsibility to provide a valid command buffer before any render or upload calls of SDL_Renderer.
* All SDL_Render<Something> and texture creation calls require a valid command buffer to be set before calling.
* Note that such an SDL_Renderer does not submit the command buffer automatically.
*
* \param renderer the renderer.
* \param command_buffer the command buffer to be used to queue draw calls.
* \returns true on success or false on failure; call SDL_GetError() for more
* information.
*
* \threadsafety This function should only be called on the main thread.
*
* \since This function is available since SDL 3.2.0.
*
* \sa SDL_SetRenderGPUSwapchainTexture
* \sa SDL_AcquireGPUCommandBuffer
*/
extern SDL_DECLSPEC bool SDLCALL SDL_SetRenderGPUCommandBuffer(SDL_Renderer *renderer, SDL_GPUCommandBuffer *command_buffer);
Author comment:
New public API

Comment on lines +5610 to +5646

bool SDL_SetRenderGPUCommandBuffer(SDL_Renderer *renderer, SDL_GPUCommandBuffer *command_buffer)
{
CHECK_RENDERER_MAGIC(renderer, false);

if (strncmp(renderer->name, "gpu", 3) != 0) {
return SDL_SetError("SDL_SetRenderGPUCommandBuffer must be called on a GPU-based renderer, got '%s'", renderer->name);
}

if (!command_buffer) {
return SDL_SetError("command_buffer must not be NULL");
}

GPU_SetCommandBuffer(renderer, command_buffer);

return true;
}

bool SDL_RenderPresentToGPUTexture(SDL_Renderer *renderer, SDL_GPUTexture *target, SDL_GPUTextureFormat format)
{
CHECK_RENDERER_MAGIC(renderer, false);

if (strncmp(renderer->name, "gpu", 3) != 0) {
return SDL_SetError("SDL_RenderPresentToGPUTexture must be called on a GPU-based renderer, got '%s'", renderer->name);
}

if (!target) {
return SDL_SetError("target must not be NULL");
}

// Hack: We need to flush the command buffer before we present the texture
if (!SDL_RenderPresent(renderer)) {
return false;
}

return GPU_PresentToUserTexture(renderer, target, format);
}
Author comment:
Public function implementations (comment left here for easier PR review)

@slouken slouken requested a review from thatcosmonaut January 3, 2025 01:03
@slouken slouken added this to the 3.x milestone Jan 3, 2025