Expose StagingBelt allocation directly. #6900

Open · wants to merge 1 commit into base: trunk
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -164,6 +164,7 @@ By @wumpf in [#6849](https://github.com/gfx-rs/wgpu/pull/6849).
- `DeviceType` and `AdapterInfo` now impl `Hash` by @cwfitzgerald in [#6868](https://github.com/gfx-rs/wgpu/pull/6868)
- Add build support for Apple Vision Pro. By @guusw in [#6611](https://github.com/gfx-rs/wgpu/pull/6611).
- Add `wgsl_language_features` for obtaining available WGSL language features by @sagudev in [#6814](https://github.com/gfx-rs/wgpu/pull/6814)
+- Add `util::StagingBelt::allocate()` so the staging belt can be used to write textures. By @kpreid in [#6900](https://github.com/gfx-rs/wgpu/pull/6900)

##### Vulkan

156 changes: 114 additions & 42 deletions wgpu/src/util/belt.rs
@@ -5,49 +5,27 @@ use crate::{
use std::fmt;
use std::sync::{mpsc, Arc};

-struct Chunk {
-    buffer: Arc<Buffer>,
-    size: BufferAddress,
-    offset: BufferAddress,
-}
-
-/// `Sync` wrapper that works by providing only exclusive access.
-///
-/// See https://doc.rust-lang.org/nightly/std/sync/struct.Exclusive.html
-struct Exclusive<T>(T);
-
-unsafe impl<T> Sync for Exclusive<T> {}
-
-impl<T> Exclusive<T> {
-    fn new(value: T) -> Self {
-        Self(value)
-    }
-
-    fn get_mut(&mut self) -> &mut T {
-        &mut self.0
-    }
-}

/// Efficiently performs many buffer writes by sharing and reusing temporary buffers.
///
/// Internally it uses a ring-buffer of staging buffers that are sub-allocated.
/// Its advantage over [`Queue::write_buffer()`] is that it returns a mutable slice,
/// which you can fill in place, avoiding an extra data copy.
/// Its advantage over [`Queue::write_buffer_with()`] is that the individual allocations
/// are cheaper; `StagingBelt` is most useful when you are writing very many small pieces
/// of data. It can be understood as a sort of arena allocator.
///
/// Using a staging belt is slightly complicated, and generally goes as follows:
-/// 1. Write to buffers that need writing to using [`StagingBelt::write_buffer()`].
+/// 1. Use [`StagingBelt::write_buffer()`] or [`StagingBelt::allocate()`] to allocate
+///    buffer slices, then write your data to them.
/// 2. Call [`StagingBelt::finish()`].
/// 3. Submit all command encoders that were used in step 1.
/// 4. Call [`StagingBelt::recall()`].
///
/// [`Queue::write_buffer()`]: crate::Queue::write_buffer
/// [`Queue::write_buffer_with()`]: crate::Queue::write_buffer_with
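///
/// A single frame might look like the following sketch (illustrative only:
/// `device`, `queue`, `target`, and `data` are assumed to exist in the
/// caller's code):
///
/// ```ignore
/// let mut belt = StagingBelt::new(1024);
/// let mut encoder = device.create_command_encoder(&Default::default());
/// // Step 1: allocate a slice and fill it from the CPU side.
/// belt.write_buffer(
///     &mut encoder,
///     &target,
///     0,
///     BufferSize::new(data.len() as u64).unwrap(),
///     &device,
/// )
/// .copy_from_slice(&data);
/// // Step 2: unmap all chunks written this frame.
/// belt.finish();
/// // Step 3: submit every command encoder used in step 1.
/// queue.submit([encoder.finish()]);
/// // Step 4: reclaim the chunks once the GPU has consumed them.
/// belt.recall();
/// ```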
pub struct StagingBelt {
chunk_size: BufferAddress,
/// Chunks into which we are accumulating data to be transferred.
active_chunks: Vec<Chunk>,
/// Chunks that have scheduled transfers already; they are unmapped and some
-/// command encoder has one or more `copy_buffer_to_buffer` commands with them
-/// as source.
+/// command encoder has one or more commands with them as source.
closed_chunks: Vec<Chunk>,
/// Chunks that are back from the GPU and ready to be mapped for write and put
/// into `active_chunks`.
@@ -70,7 +48,7 @@ impl StagingBelt {
/// (per [`StagingBelt::finish()`]); and
/// * bigger is better, within these bounds.
pub fn new(chunk_size: BufferAddress) -> Self {
-        let (sender, receiver) = std::sync::mpsc::channel();
+        let (sender, receiver) = mpsc::channel();
StagingBelt {
chunk_size,
active_chunks: Vec::new(),
@@ -81,7 +59,7 @@ }
}
}

-    /// Allocate the staging belt slice of `size` to be uploaded into the `target` buffer
+    /// Allocate a staging belt slice of `size` to be copied into the `target` buffer
/// at the specified offset.
///
/// The upload will be placed into the provided command encoder. This encoder
@@ -99,10 +77,57 @@
size: BufferSize,
device: &Device,
) -> BufferViewMut<'_> {
let (mapped, belt_buffer, offset_in_belt_buffer) = self.allocate(
size,
const { BufferSize::new(crate::COPY_BUFFER_ALIGNMENT).unwrap() },
device,
);
encoder.copy_buffer_to_buffer(
belt_buffer,
offset_in_belt_buffer,
target,
offset,
size.get(),
);
mapped
}

/// Allocate a staging belt slice with the given `size` and `alignment` and return it.
///
/// This allows you to do whatever you want with the slice after, such as
/// copying it to a texture or executing a compute shader that reads it, whereas
/// [`StagingBelt::write_buffer()`] can only write to other buffers.
///
/// If the `size` is greater than the size of any free internal buffer, a new buffer
/// will be allocated for it. Therefore, the `chunk_size` passed to [`StagingBelt::new()`]
/// should ideally be larger than every such size.
///
/// Three values are returned:
///
/// * The mapped buffer view which you should write into from the CPU side (Rust code).
/// * The buffer containing the slice, for you to use in GPU commands.
/// All commands involving this slice must be submitted after
/// [`StagingBelt::finish()`] is called and before [`StagingBelt::recall()`] is called.
/// * The offset within the buffer at which the slice starts.
/// This offset should be used for all GPU commands, and should not be used with
/// the mapped buffer view.
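    ///
    /// For example, staging a texture upload might look like this (sketch only;
    /// `belt`, `encoder`, `texture`, `device`, `data`, and `bytes_per_row` are
    /// assumed, and the row alignment shown is illustrative, not a requirement
    /// of this method):
    ///
    /// ```ignore
    /// let (mut view, buffer, offset) = belt.allocate(
    ///     BufferSize::new(data.len() as u64).unwrap(),
    ///     BufferSize::new(COPY_BYTES_PER_ROW_ALIGNMENT as u64).unwrap(),
    ///     &device,
    /// );
    /// view.copy_from_slice(&data); // CPU-side write through the mapped view
    /// encoder.copy_buffer_to_texture(
    ///     TexelCopyBufferInfo {
    ///         buffer, // GPU-side handle returned alongside the view
    ///         layout: TexelCopyBufferLayout {
    ///             offset, // use the returned offset for GPU commands
    ///             bytes_per_row: Some(bytes_per_row),
    ///             rows_per_image: None,
    ///         },
    ///     },
    ///     texture.as_image_copy(),
    ///     texture.size(),
    /// );
    /// ```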
pub fn allocate(
&mut self,
size: BufferSize,
alignment: BufferSize,
device: &Device,
) -> (BufferViewMut<'_>, &Buffer, BufferAddress) {
assert!(
alignment.get().is_power_of_two(),
"alignment must be a power of two, not {alignment}"
);
// At minimum, we must have alignment sufficient to map the buffer.
let alignment = alignment.get().max(crate::MAP_ALIGNMENT);

let mut chunk = if let Some(index) = self
.active_chunks
.iter()
-            .position(|chunk| chunk.offset + size.get() <= chunk.size)
+            .position(|chunk| chunk.can_allocate(size, alignment))
{
self.active_chunks.swap_remove(index)
} else {
@@ -111,7 +136,7 @@
if let Some(index) = self
.free_chunks
.iter()
-            .position(|chunk| size.get() <= chunk.size)
+            .position(|chunk| chunk.can_allocate(size, alignment))
{
self.free_chunks.swap_remove(index)
} else {
@@ -129,17 +154,19 @@ }
}
};

-        encoder.copy_buffer_to_buffer(&chunk.buffer, chunk.offset, target, offset, size.get());
-        let old_offset = chunk.offset;
-        chunk.offset = align_to(chunk.offset + size.get(), crate::MAP_ALIGNMENT);
+        let allocation_offset = chunk.allocate(size, alignment);

self.active_chunks.push(chunk);
-        self.active_chunks
-            .last()
-            .unwrap()
-            .buffer
-            .slice(old_offset..old_offset + size.get())
-            .get_mapped_range_mut()
+        let chunk = self.active_chunks.last().unwrap();
+
+        (
+            chunk
+                .buffer
+                .slice(allocation_offset..allocation_offset + size.get())
+                .get_mapped_range_mut(),
+            &chunk.buffer,
+            allocation_offset,
+        )
}

/// Prepare currently mapped buffers for use in a submission.
@@ -197,3 +224,48 @@ impl fmt::Debug for StagingBelt {
.finish_non_exhaustive()
}
}

struct Chunk {
buffer: Arc<Buffer>,
size: BufferAddress,
offset: BufferAddress,
}

impl Chunk {
fn can_allocate(&self, size: BufferSize, alignment: BufferAddress) -> bool {
let alloc_start = align_to(self.offset, alignment);
let alloc_end = alloc_start + size.get();

alloc_end <= self.size
}

fn allocate(&mut self, size: BufferSize, alignment: BufferAddress) -> BufferAddress {
let alloc_start = align_to(self.offset, alignment);
let alloc_end = alloc_start + size.get();

assert!(alloc_end <= self.size);
self.offset = alloc_end;
alloc_start
}
}

use exclusive::Exclusive;
mod exclusive {
/// `Sync` wrapper that works by providing only exclusive access.
///
/// See <https://doc.rust-lang.org/nightly/std/sync/struct.Exclusive.html>
pub(super) struct Exclusive<T>(T);

    /// Safety: `&Exclusive<T>` exposes no operations, so sharing it across threads is sound.
unsafe impl<T> Sync for Exclusive<T> {}

impl<T> Exclusive<T> {
pub fn new(value: T) -> Self {
Self(value)
}

pub fn get_mut(&mut self) -> &mut T {
&mut self.0
}
}
}
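The bump-allocation arithmetic in `Chunk::can_allocate()` and `Chunk::allocate()` is easy to check in isolation. The sketch below mirrors it with a simplified stand-in type, assuming `align_to` rounds an address up to the next multiple of a power-of-two alignment (as wgpu's utility of that name does); `MockChunk` is a hypothetical name for illustration, not part of the PR:

```rust
/// Round `addr` up to the next multiple of `alignment` (a power of two),
/// mirroring the `align_to` helper used by the staging belt.
fn align_to(addr: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (addr + alignment - 1) & !(alignment - 1)
}

/// Simplified stand-in for `Chunk`: a bump allocator over a fixed-size buffer.
struct MockChunk {
    size: u64,
    offset: u64,
}

impl MockChunk {
    fn can_allocate(&self, size: u64, alignment: u64) -> bool {
        align_to(self.offset, alignment) + size <= self.size
    }

    fn allocate(&mut self, size: u64, alignment: u64) -> u64 {
        let start = align_to(self.offset, alignment);
        assert!(start + size <= self.size);
        self.offset = start + size;
        start
    }
}

fn main() {
    let mut chunk = MockChunk { size: 1024, offset: 0 };
    let a = chunk.allocate(3, 4); // starts at 0; offset becomes 3
    let b = chunk.allocate(5, 16); // 3 rounds up to 16; offset becomes 21
    assert_eq!((a, b), (0, 16));
    assert!(chunk.can_allocate(1000, 4)); // align_to(21, 4) = 24; 24 + 1000 = 1024 fits exactly
    assert!(!chunk.can_allocate(1001, 4)); // 24 + 1001 > 1024
    println!("a = {a}, b = {b}, offset = {}", chunk.offset);
}
```

Note how alignment padding is only introduced at the start of each allocation, so small unaligned writes waste at most `alignment - 1` bytes each, which is what makes the belt cheap for many small pieces of data.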