Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement in-place processing (fix Ardour support) #93

Merged
merged 8 commits into from
Aug 7, 2021

Conversation

prokopyl
Copy link
Member

@prokopyl prokopyl commented Jul 31, 2021

This PR addresses the issue described in #89 that rust-lv2 currently doesn't support in-place processing, i.e. where the input buffer and output buffer are the same.

This PR implements an approach that wasn't discussed in #89 however, which is to leverage the Cell type to prevent creating references to the contents of the buffers (input or output).

This is actually not done directly using the standard Cell type, but with ReadCell and WriteCell types, which are respectively read-only and write-only wrappers around Cell to allow restricting their usage to input ports or output ports.

Using Cells has two big advantages:

  • Cells have interior mutability for Copy types (like all audio sample types, such as LV2's f32), and are by design safe to keep multiple references to (here, one for input, one for output). Therefore, the inPlaceBroken requirement on Rust LV2 plugins can be lifted.
  • This implementation does not create any additional copies anywhere, and allows implementations to be hard-realtime-safe (since, in fact, with Cell being a zero-overhead abstraction, this implementation actually does nothing at runtime).

The only downside of this approach is that it does not actually prevent from reading from the buffer after writing to it. However, doing so is not unsafe if there are no references to the contents (reading a valid but arbitrary number cannot trigger UB, and thanks to Cell other constraints cannot be put), it only results in incorrect behavior. Moreover, I believe this is rather trivial to avoid for plugin developers, so I don't think it is worth trying to prevent.

This PR also adds a new I/O port set: InPlaceInput and InPlaceOutput, allowing to use port types that implements InPlacePortType, which return the new Cells instead of just raw references. Duplicating the port trait infrastructure allows to not break backwards compatibility yet, but this raises a couple questions:

  • Should the in-place-compatible port types be the default? A quick ripgrep through my rather big LV2 plugin library shows that none of the plugins I use actually require inPlaceBroken. This makes sense to me, since making in-place-compatible processing seems rather trivial.
    Moreover, the fact that Ardour, a major player in the LV2 ecosystem (if not the reference? I believe some Ardour devs are also LV2 devs), does not support inPlaceBroken plugins is a pretty big sign that very few (if any) LV2 plugins actually use this feature, and that really, none should use it.
    This can be done rather easily by renaming this PR's InPlace* types to their current counterparts, and making the current implementations into some sort of ExclusivePort* alternative.
  • Should in-place-broken processing be supported at all? The more I dive into it, the less plausible it seems to me that plugin developers would actually want to use it, the single fact that Ardour users wouldn't be able to run it being quite a big deterrent in itself. We could remove the non-in-place-compatible implementations altogether from this crate (only leaving the Cell ones), and if users somehow need long references that point inside the buffer we could consider it uncharted territory, and let them use unsafe if they see fit.

(EDIT: This part of the discussion has been moved to #94, since we can implement this without breaking anything yet. 🙂 )

I'm keeping this PR as a draft since there are a few doc items missing still.

Closes #89.

@JOOpdenhoevel JOOpdenhoevel self-requested a review July 31, 2021 13:14
@JOOpdenhoevel JOOpdenhoevel added the ✨ Feature New feature or request label Jul 31, 2021
@JOOpdenhoevel
Copy link
Contributor

I like the idea of using cells and this approach might actually be the best solution for it. At first, I was a bit set off by the idea of ultimately using e.g. &[Cell<f32>] for audio since an audio slice contains floats, not cells with floats. However, since the sole idea of cells is that they're transparent, this should be fine. However, I would like to have a unit test that verifies that a &[f32] reinterpreted as a &[Cell<f32>] still has the desired contents and behavior.

However, I think that the additional API and the wrappers are unnecessary since they don't really prevent the user from writing to a read-only port. The user can still use a different and wrong port type and do whatever they like. The additional API does not change this and just adds a separate way to access ports, making it both harder to understand and to use. I would suggest something like this:

pub struct InPlaceAudio;

unsafe impl UriBound for InPlaceAudio {
    const URI: &'static [u8] = ::lv2_sys::LV2_CORE__AudioPort;
}

impl PortType for InPlaceAudio {
    type InputPortType = &'static [Cell<f32>];
    type OutputPortType = &'static [Cell<f32>];

    #[inline]
    unsafe fn input_from_raw(pointer: NonNull<c_void>, sample_count: u32) -> Self::InputPortType {
        std::slice::from_raw_parts(pointer.as_ptr() as *const Cell<f32>, sample_count as usize)
    }

    #[inline]
    unsafe fn output_from_raw(pointer: NonNull<c_void>, sample_count: u32) -> Self::OutputPortType {
        std::slice::from_raw_parts_mut(pointer.as_ptr() as *mut Cell<f32>, sample_count as usize)
    }
}

Sure, the user can write to the input port with this implementation, but since they have already said that the same memory is also used by an output port (by saying it's an in-place port), this does not make a difference. We simply don't need to prevent the user from writing to the an in-place input port! 😅

I would also prefer to keep the previous channel types because it would break the API and not all hosts require in-place operations. Ardour support is a strong point, but LV2 isn't only Ardour's plugin standard as VST isn't only Cubase's plugin standard. If you don't need to use celled values, using the value directly is just easier. It's also the rustacean way of implementing channels and I would assume that upcoming LV2 hosts written in Rust will probably not use in-place operations.

@prokopyl
Copy link
Member Author

I like the idea of using cells and this approach might actually be the best solution for it. At first, I was a bit set off by the idea of ultimately using e.g. &[Cell<f32>] for audio since an audio slice contains floats, not cells with floats. However, since the sole idea of cells is that they're transparent, this should be fine. However, I would like to have a unit test that verifies that a &[f32] reinterpreted as a &[Cell<f32>] still has the desired contents and behavior.

I could do that easily, however I don't think that would be necessary if we ditch the custom Cells, since this operation is supported by the standard library, using a combination of Cell::from_mut and (Cell::as_slice_of_cells)[https://doc.rust-lang.org/std/cell/struct.Cell.html#method.as_slice_of_cells]. Then the only remaining unsafe code would be the raw pointer to &[f32] conversion.

However, I think that the additional API and the wrappers are unnecessary since they don't really prevent the user from writing to a read-only port. The user can still use a different and wrong port type and do whatever they like. The additional API does not change this and just adds a separate way to access ports, making it both harder to understand and to use. I would suggest something like this:

I see your point there, I just personally like it a lot when APIs are as misuse-resistant as possible. 😋
I think it helps the users know exactly what to do (rather than them having "wrong" options exposed), but I agree that in this particular it may not be the trouble to maintain those custom wrappers here. I'll make a commit that goes back to regular Cells. ^^

I would also prefer to keep the previous channel types because it would break the API and not all hosts require in-place operations. Ardour support is a strong point, but LV2 isn't only Ardour's plugin standard as VST isn't only Cubase's plugin standard. If you don't need to use celled values, using the value directly is just easier. It's also the rustacean way of implementing channels and I would assume that upcoming LV2 hosts written in Rust will probably not use in-place operations.

I agree that LV2 isn't Ardour's plugin standard, so I did a quick check I noticed that of Ardour, Zrythm, and Carla, only Carla supports in-place-broken plugins (by assigning a separate input and buffer for each plugin, and memcpy'ing the samples around all the time, so basically the only implementation possible).

Seeing this, I actually disagree on the fact that future Rust LV2 wouldn't use in-place operations in the future, because they are just a much more performant implementation. Using Cell isn't any less rustacean, they are the standard type to say "I want read/write operations on this value, but not references". In fact, copying buffers around feels more like the traditional Rust anti-pattern of "let's just slap .clone() on it and stop worrying about ownership", to me at least.

The only advantage of using inPlaceBroken instead of the LV2 default buffer strategy, is that in Rust, you get to write *int instead of int.get(), and that's about it. 😕
On the other hand, a large share of LV2 DAWs (that I know of, at least) rightfully don't support it, and probably nor will future ones, and nobody uses it in their plugins, hence making it a terrible default.

Now, of course since that would be a breaking change, we could wait until we release another major version to do the swap, if the fix is out there I don't think there would be much of an emergency anymore. 🙂

@WeirdConstructor
Copy link

Nice that someone works on this. The Cell solution sounds perfect for me. I've not been familiar enough with Rust to come up with that in the Issue.

Now, of course since that would be a breaking change, we could wait until we release another major version to do the swap, if the fix is out there I don't think there would be much of an emergency anymore. slightly_smiling_face

LV2 is still 0.x and are there many plugins that depend on rust-lv2 out there? On crates.io I see no dependents on rust-lv2.

I would be eager to test this, but my plugins all require UI unfortunately.

@JOOpdenhoevel
Copy link
Contributor

I could do that easily, however I don't think that would be necessary if we ditch the custom Cells, since this operation is supported by the standard library, using a combination of Cell::from_mut and (Cell::as_slice_of_cells)[https://doc.rust-lang.org/std/cell/struct.Cell.html#method.as_slice_of_cells]. Then the only remaining unsafe code would be the raw pointer to &[f32] conversion.

Awesome! Then, we also don't need the test since this is a safe and supported operation!

I'll make a commit that goes back to regular Cells. ^^

Thank you! 😊 However, I still don't like that that the PortType and InPlacePortType traits are exactly the same except for the way the library uses them. I like the idea that you can get an in-place-safe and an in-place-broken version of the same port type, but this solution is just wet. Please try to find a different solution that is more dry or introduce the in-place-safe versions as their own port types.

The only advantage of using inPlaceBroken instead of the LV2 default buffer strategy, is that in Rust, you get to write *int instead of int.get(), and that's about it. 😕

Another advantage of uncelled slices is that you can share them between threads and therefore can work on the input data from different threads without needing to copy them to a shareable buffer first. You may say that you can't do this with hard RT constraints, but almost no one has hard real-time constraints in audio processing, meaning that no one dies if a single frame is dropped while you're mixing your album. You can not achieve hard real-time constraints with a normal CPU and a normal operation system anyway due to caching, branch prediction and whatnot. Audio processing in a DAW running on a Mac or PC has soft real-time constraints and one might achieve their soft real-time constraints more easily with parallelized plugins due to the rise of many-core CPUs like the AMD Ryzen CPUs or Apple's ARM chips. We should leave space for this development and keep the uncelled port types.

LV2 is still 0.x and are there many plugins that depend on rust-lv2 out there? On crates.io I see no dependents on rust-lv2.

While the lv2 crate is still at 0.6.0, the lv2-core crate actually has the version number 3.0.0 and we would need to bump it to 4.0.0 for this breaking change. I must admit that I've made the mistake of promising stability for lv2-core way to early, but I would now like to keep this promise now.

@prokopyl
Copy link
Member Author

prokopyl commented Aug 1, 2021

I'll make a commit that goes back to regular Cells. ^^

Thank you! blush However, I still don't like that that the PortType and InPlacePortType traits are exactly the same except for the way the library uses them. I like the idea that you can get an in-place-safe and an in-place-broken version of the same port type, but this solution is just wet. Please try to find a different solution that is more dry or introduce the in-place-safe versions as their own port types.

Yes, I've been trying to find a more DRY solution, but I think having InPlaceAudio and InPlaceCV might be better for now, considering only those types tend to actually share buffers. I'll change that soon. ^^

Another advantage of uncelled slices is that you can share them between threads and therefore can work on the input data from different threads without needing to copy them to a shareable buffer first.

That would be only true for the input, not the output (&mut T being !Sync for obvious reasons). However, plugins should never spread their processing work onto multiple threads, because they are isolated and unaware of everything else that's going on in a plugin host. It is the host's job to orchestrate processing among the (now many) cores of the CPU at their own (or, often the user's) discretion.

To give you an example, a small track I've made recently in Ardour had 30+ tracks and busses, each of which had between 5 and 10 plugins on them, and it was no sweat for my 8-core/16-thread mobile CPU.
However, if these plugins each wanted to spawn their own threadpool (à la rayon) for processing, that would be (at worst) 4800 threads permanently competing for my 8 cores (and it gets worse the more cores you have). I'd be pretty surprised if any frame would make it to my speakers in time in that setup.

You may say that you can't do this with hard RT constraints, but almost no one has hard real-time constraints in audio processing, meaning that no one dies if a single frame is dropped while you're mixing your album.

If I was mixing for a mafia party and I had X-runs/dropped frames, pretty sure I'd get killed. 🙃
Bad jokes aside, this is quite false and actually the other way around: a lot of people need hard RT, and the vast majority wants it:

  • Musicians and DJs playing live at a venue cannot afford dropped frames under any circumstance.
  • Live streamers that use realtime processing (eg. for cleaning up their voice, or using multiple mics) could shove dropped frames as "technical issues", but it has to stay very rare or it's gonna become problematic.
  • If you are recording something while processing (either on the track itself, or on other tracks that play along), if an X-run happens it's the whole audio system that falls on its face for that frame, which means it couldn't get recorded in time and you can pretty much throw that take away and do another one. (… if you can. Takes can be unique sometimes.)
  • Even in my case (bedroom music production), I am very very annoyed by anything that makes my session X-run, and you better believe that any plugin that I know is causing this I'm going to throw away and never touch again until it's fixed. 😛

Even when a plugin is hard-RT capable, I've also found myself hunting for plugins that take too much time processing (some of them even in the 20ms+ range!), and had to (painfully) sacrifice them for a lighter alternative, on that fact alone.
Being hard-RT capable is one thing, but the thing with digital audio processing (especially in DAWs) is that you really want to be as lightweight and low-footprint as possible, simply because there is going to be dozens (or more) instances all running at the same time, and you have to play nice with others. Handling your own threads is simply incompatible with that.

You can not achieve hard real-time constraints with a normal CPU and a normal operation system anyway due to caching, branch prediction and whatnot. Audio processing in a DAW running on a Mac or PC has soft real-time constraints and one might achieve their soft real-time constraints more easily with parallelized plugins due to the rise of many-core CPUs like the AMD Ryzen CPUs or Apple's ARM chips. We should leave space for this development and keep the uncelled port types.

This is not a new nor "future" developments: DAWs have been taking advantage of many-core CPUs for many years now, they can scale to N cores already. Moreover, if multithread processing of single plugins becomes a thing at some point, then the current uncelled port types will be very much incompatible with the current &[f32] / &mut[f32] port types because of synchronization needs.

LV2 is still 0.x and are there many plugins that depend on rust-lv2 out there? On crates.io I see no dependents on rust-lv2.

While the lv2 crate is still at 0.6.0, the lv2-core crate actually has the version number 3.0.0 and we would need to bump it to 4.0.0 for this breaking change. I must admit that I've made the mistake of promising stability for lv2-core way to early, but I would now like to keep this promise now.

Yeah, that kinda sucks, but I agree it's best not to wreck what's on crates.io already.

@YruamaLairba
Copy link
Contributor

I just saw work is progressing to solve the inplacebroken issue. I read all the comment, but honestly, i didn't understood what part of inplacebroken issue the current work try to solve ? i just understood it wouldn't solve the read after write issue.

@prokopyl
Copy link
Member Author

prokopyl commented Aug 2, 2021

I just saw work is progressing to solve the inplacebroken issue. I read all the comment, but honestly, i didn't understood what part of inplacebroken issue the current work try to solve ? i just understood it wouldn't solve the read after write issue.

This work makes it so plugins with potentially shared input and output buffers (i.e. those without the inPlaceBroken feature enabled) cannot trigger UB in safe Rust.

Reading after writing is fine to not enforce through an API: it's incorrect to do so, but it is not unsafe (see the whole first chapter of the Nomicon for to get a better idea on what's safe and what isn't).

What is unsafe is allowing reading while writing (or, more exactly, allowing writing while still having a shared reference to the thing you just wrote to), because you've pretty much swept the rug under the feet of other parts of the code that may have checked for invariants on that data, which are suddenly not true anymore. Except the code that's running on that data has no idea it has changed, and it runs into UB.

In practice however, LV2 plugin developers never needed to "hold invariants" of any sorts on audio buffers: they are just a big array of very simple types (here, f32) that either get read from or written to, nothing more. The LV2 spec never enforces hosts to uphold the invariant of "this data will not change until now and the next time you'll look at it" (quite the opposite, in fact), and therefore it is incorrect (and unsafe) to mark it as enforced by using &[f32].

This specific case is best represented by the Cell<T> type, which forces you to either read or replace full values (for Copy types), without any consideration for what any past state may have been. Which is exactly what we need for a sample in an audio buffer: just read, or write, nothing more. 🙂

@prokopyl
Copy link
Member Author

prokopyl commented Aug 2, 2021

By the way, @Janonard I'm moving the discussion about "should we keep the non-cell port types or not?" to #94, because while I think this discussion is very important, it does not need to be resolved before this PR is implemented. ^^

@JOOpdenhoevel
Copy link
Contributor

Thanks for the work @prokopyl! I think this is a good implementation now until #94 is resolved. The only thing missing from my point of view are docs that explain this feature, but you don't have to do it in this PR. Just mark the PR as ready for review and I will give my review and merge it!

@prokopyl prokopyl marked this pull request as ready for review August 3, 2021 01:31
@prokopyl
Copy link
Member Author

prokopyl commented Aug 3, 2021

@Janonard Done! I actually took the time to document the whereabouts of using the InPlace port types, as well as considerably extend the documentation for the non-InPlace port types to match. 🙂

I also took the liberty of splitting the port types into their own files (but kept the same mod structure of course), with all those docs and types it was getting a little cramped in there. ^^

@PieterPenninckx
Copy link
Contributor

PieterPenninckx commented Aug 7, 2021

Hi @prokopyl , it's nice to see you back! :-)

Thank you for this pull request, it helps me to re-frame my thinking about this topic. I share my initial thoughts with you below and finish with an idea that is inspired by this PR.

First thought: I think an &[Cell<f32>] is not practical for use in combination with SIMD, but I'm not a SIMD expert, so I may be wrong.

Sure, the user can write to the input port with this implementation, but since they have already said that the same memory is also used by an output port (by saying it's an in-place port), this does not make a difference. We simply don't need to prevent the user from writing to the an in-place input port! 😅

I don't think hosts are actually required to also use the memory of an input port for an output port. That would be very impractical e.g. for a plugin that has two input ports and only one output port.

While the lv2 crate is still at 0.6.0, the lv2-core crate actually has the version number 3.0.0 and we would need to bump it to 4.0.0 for this breaking change. I must admit that I've made the mistake of promising stability for lv2-core way to early, but I would now like to keep this promise now.

Well, nobody says you can't publish version 0.1.0 after version 3.0.0 ;-) LV2 is still very young and we may need some more iterations, so I don't think anybody is going to be mad when you switch to a pre-1.0 version number to reflect that.

This PR actually inspired me for another idea, which is to enforce at compile time that the input ports and the output ports cannot be mixed. The idea is a little as follows:

struct InputPorts;

struct OutputPorts;

struct Ports {
    inputs: InputPorts,
    outputs: OutputPorts
}

impl Ports {
    fn inputs(&self) -> &InputPorts {
        &self.inputs
    }
    
    fn outputs(&mut self) -> &mut OutputPorts {
        &mut self.outputs
    }
}

The current version of the borrow checker prevents incorrect use, e.g. by giving an error for the following code:

let mut ports = Ports::new();
let inputs = ports.inputs();
let outputs = ports.outputs();
let a = &inputs;

My restraint with this idea is that it's not entirely clear to me if the borrow checker may be relaxed to allow the above code (since what's the meaning of fn(&mut) -> &?). I'm not sure if we can rely on this behaviour, but we can ask.
Edit: actually, for this to work, inputs can also have the signature fn inputs(&self) -> &InputPorts. The real drawback is that this doesn't support iterating frame-per-frame (which the PR does support).

Copy link
Contributor

@JOOpdenhoevel JOOpdenhoevel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chapter in the book isn't updated to the fact that inPlaceBroken isn't required anymore. But as I've said before: You don't need to do this in this PR. I will create an issue for that. However, please note that the next version will not be released before this is done.

Thank you for your contribution!

@JOOpdenhoevel JOOpdenhoevel merged commit aa58d5c into develop Aug 7, 2021
@JOOpdenhoevel JOOpdenhoevel deleted the in-place-processing branch August 7, 2021 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
✨ Feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No support for Ardour - No support for inPlaceBroken
5 participants