Adding draft extension for host-provided scratch memory #423
Add .DS_Store to .gitignore (MacOS specific) Fixes: free-audio#390
Summarizing the conversation in discussion free-audio#414, this PR adds a more complete description of when the realtime constraint must be met, by which party, and how it interacts with the thread safe tag.
- allow plugin_latency->get to be called during plugin->activate
- require host_latency->changed to be called during plugin->activate
- the plugin interfaces have been separated into 2 independent ones
- the plugin interfaces are optional
- simplification of the design
…-audio#422)
* Add a description of the expectation of request_callback timing
  Without making a requirement, indicate the intent of the timing.
* Add an apostrophe
* Add host can starve feedback from alex
* more review feedback
* notjusthosts
I have some questions with the interface.
The only constraint is that the memory is valid for a plugin during its
Then consider this:
If you have a single scratch buffer for process(), you can't process your voices in parallel, or else you need 32 * 10K. I believe this needs to be clarified in the spec.
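To make the arithmetic concrete, here is a minimal sketch of the size calculation, assuming "10K" means 10,000 bytes per voice and that 32 is the voice count (both figures match the example code posted later in this thread). The helper `scratch_bytes_needed` is hypothetical and is not part of CLAP.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a CLAP function): bytes of scratch needed when
// `concurrent_voices` voices may touch their scratch area at the same time.
inline std::uint64_t scratch_bytes_needed(std::uint64_t bytes_per_voice,
                                          std::uint32_t concurrent_voices) {
    assert(concurrent_voices > 0);
    return bytes_per_voice * concurrent_voices;
}
```

With one voice at a time, `scratch_bytes_needed(10000, 1)` is 10,000 bytes; with all 32 voices in flight, `scratch_bytes_needed(10000, 32)` is 320,000 bytes.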
You can only request/register one buffer per instance with a predefined size during activation. The host signals in the return value of In the process call you just call But there is only one buffer and the size does not change as long as the plugin is active. |
I think Alex's point is that if you use the thread pool extension to schedule jobs, those thread pool jobs will effectively run in parallel, under process(). Can a thread-pool extension job access the scratch memory? If so, is it distinct per thread, or is it a single memory location? My guess is: the thread pool and scratch memory extensions need some careful co-consideration. And the patterns where people use the scratch memory will also require scratch-per-thread-voice, not scratch-per-process-block, in those cases.
Yeah, thanks for bringing this up, I hadn't considered the inter-operation of scratch-memory and the thread-pool extension.

I think the simplest solution would be to have the plugin request the total amount of scratch memory that it needs across all possible threads. Since

    struct My_Plugin
    {
        size_t scratch_mem_per_voice = 10'000;
        size_t num_voices = 32;
        char* scratch_memory_data = nullptr;

        void activate(...) {
            scratch_memory_ext.pre_reserve(host, scratch_mem_per_voice * num_voices);
        }

        void process(...) {
            // Get all the scratch memory here
            scratch_memory_data = scratch_memory_ext.access(host);
            thread_pool_ext.request_exec(host, num_voices);
        }

        void thread_pool_callback(uint32_t task_index) {
            // Get a partition of the scratch memory for this voice to use
            char* this_voice_scratch_mem = scratch_memory_data + scratch_mem_per_voice * task_index;
            // do the actual DSP work here...
        }
    };

However, I can see a few reasons why this might not be an ideal solution... For example, if the host thread pool only has 8 threads, then reserving enough scratch memory for 32 voices to be processed in parallel is a bit wasteful.

Maybe the best solution is to have two scratch memory mechanisms... one scratch buffer that is intended to be accessed during the

    typedef struct clap_host_scratch_memory {
        bool(CLAP_ABI *pre_reserve_process_scratch)(const clap_host_t *host, size_t scratch_size_bytes);
        void*(CLAP_ABI *access_process_scratch)(const clap_host_t *host);
        bool(CLAP_ABI *pre_reserve_thread_pool_scratch)(const clap_host_t *host, size_t scratch_size_per_thread_bytes);
        void*(CLAP_ABI *access_thread_pool_scratch)(const clap_host_t *host, uint32_t task_index);
    } clap_host_scratch_memory_t;

All that said, I don't have much experience with the thread-pool extension, so I'll defer to your more informed opinions :).
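The partitioning idea from the first snippet can be tried out without any host at all. The following is only a sketch under stated assumptions: `voice_scratch` and `fill_voice` are hypothetical helpers, not CLAP functions, and a plain byte buffer stands in for whatever the host would hand back from its access call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of CLAP): returns the slice of one large
// scratch block that belongs to the given thread-pool task.
inline char* voice_scratch(char* base, std::size_t bytes_per_voice,
                           std::uint32_t task_index) {
    return base + bytes_per_voice * task_index;
}

// Stand-in for a thread-pool job: it writes only inside its own slice, so
// jobs with different task indices never touch the same bytes.
inline void fill_voice(char* base, std::size_t bytes_per_voice,
                       std::uint32_t task_index) {
    char* mine = voice_scratch(base, bytes_per_voice, task_index);
    for (std::size_t i = 0; i < bytes_per_voice; ++i)
        mine[i] = static_cast<char>(task_index + 1);
}
```

For example, with a `std::vector<char> block(4 * 16)` and `fill_voice(block.data(), 16, t)` called for `t` in 0..3, each 16-byte slice ends up holding only its own task's marker value. The slices are disjoint as long as the reserved block is at least `bytes_per_voice * num_voices` bytes, which is why the snippet above reserves the full product during activation.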
Here is my proposal: The scratch is a thread local pointer, so if you retrieve it from the process call, you'll get a pointer that you can share with all the jobs. If you retrieve it from the thread pool, you get a pointer that is only for the current job. I think this is the correct direction because it corresponds to how the host will implement this feature: each audio thread will have a single scratch buffer (thread local) whose size is greater than or equal to the max requested size of all plugin instances. The total scratch memory is:
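A rough host-side sketch of this proposal, assuming one thread-local buffer per audio thread sized to the largest request seen so far. Everything here (`ScratchHost`, its members) is hypothetical and only illustrates the idea; it is not the CLAP API. The product in `total_bytes` follows the "nthreads * max_scratch_size" remark made later in this review.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side bookkeeping; none of these names come from CLAP.
struct ScratchHost {
    std::uint64_t max_requested = 0;  // largest request across all instances

    // Called while a plugin instance is being activated (main thread).
    bool pre_reserve(std::uint64_t bytes) {
        max_requested = std::max(max_requested, bytes);
        return true;
    }

    // Called from an audio thread or a thread-pool job. Every thread sees its
    // own buffer. Growing it lazily here is NOT realtime-safe; a real host
    // would size the buffers before processing starts.
    void* access() {
        thread_local std::vector<char> buffer;
        if (buffer.size() < max_requested)
            buffer.resize(static_cast<std::size_t>(max_requested));
        return buffer.data();
    }

    // Total scratch held by the host if every audio thread owns one buffer.
    std::uint64_t total_bytes(std::uint32_t num_audio_threads) const {
        return max_requested * num_audio_threads;
    }
};
```

Under this scheme a plugin that requests 10,000 bytes and another that requests 4,000 share the same 10,000-byte buffer on each thread, because only one plugin runs on a given thread at a time.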
    // host when the plugin is de-activated.
    //
    // [main-thread & being-activated]
    bool(CLAP_ABI *pre_reserve)(const clap_host_t *host, size_t scratch_size_bytes);
Do we ever use size_t in any other extension? -> uint?_t?
Good point, uint32_t would do the job I think.
Why not use uint64_t, which is size_t in most of our systems? The host has the option to return false for values out of bounds.
I don't have a preference for uint32_t vs uint64_t... but Trinitou is right that CLAP doesn't use size_t anywhere else, so I don't think we should use it here.
A scratch size bigger than 4 GB would be problematic I suppose, remember nthreads * max_scratch_size.
Anyway, regardless of the type, many hosts will likely have their own threshold.
uint32_t seems sufficient to me, but I'm happy with uint64_t as well.
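As an illustration of the threshold point, a host could compute the product in 64 bits even when the request itself is a uint32_t, then compare it with its own limit. This is a sketch only: `accept_scratch_request` and the limit parameter are hypothetical names, not part of the extension.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host policy check (not a CLAP function). The multiplication is
// done in 64 bits, so a uint32_t size times a uint32_t thread count can never
// wrap around.
inline bool accept_scratch_request(std::uint32_t scratch_size_bytes,
                                   std::uint32_t num_threads,
                                   std::uint64_t host_limit_bytes) {
    const std::uint64_t total =
        static_cast<std::uint64_t>(scratch_size_bytes) * num_threads;
    return total <= host_limit_bytes;
}
```

With a 1 MiB limit, a 10,000-byte request across 8 threads is accepted, while a request close to 4 GB across 8 threads is rejected against a 4 GiB limit, which matches the concern about nthreads * max_scratch_size.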
This makes sense! I've added some comments intended to clarify this point, but please let me know if there are ways I can improve my explanation :).
It looks pretty good to me now. Any further details can be discussed on next IMO.