As reported in a thread on the MakerBox forum, there can be playback artefacts (clicks at mostly regular intervals) especially with long continuous drone sounds.
Testing on macOS with the latest build of AO (0.22), I can reliably reproduce the problem with a 48 kHz test signal of a 100-300 Hz sine sweep over 30 seconds in Safari and Chrome, where I get clicks approximately every 4 seconds and sometimes in between. In Firefox, I get far fewer problems, but a click does tend to occur at least once around the 8 second mark.
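For anyone wanting to reproduce this, a comparable test signal could be generated along the following lines. This is only a sketch: the function name `sineSweep` is made up for illustration, and it assumes a linear sweep, which may not match the signal used in the test above.

```typescript
// Hypothetical helper: generate a linear sine sweep as mono samples in [-1, 1].
// Assumption: the sweep is linear in frequency; the original test signal may differ.
function sineSweep(
  f0: number,         // start frequency in Hz
  f1: number,         // end frequency in Hz
  seconds: number,    // sweep duration
  sampleRate: number, // samples per second
): Float32Array {
  const n = Math.round(seconds * sampleRate);
  const out = new Float32Array(n);
  const rate = (f1 - f0) / seconds; // frequency change in Hz per second
  for (let i = 0; i < n; i++) {
    const t = i / sampleRate;
    // The phase is the integral of the instantaneous frequency f0 + rate * t.
    out[i] = Math.sin(2 * Math.PI * (f0 * t + 0.5 * rate * t * t));
  }
  return out;
}

// 100-300 Hz over 30 seconds at 48 kHz, as in the test described above.
const sweep = sineSweep(100, 300, 30, 48000);
```

A continuous sweep like this has no silent gaps, so any discontinuity at a segment boundary should be audible as a click.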
I can think of two potential causes:
1. Clicks being introduced by ffmpeg in splitting and encoding the audio
2. Clicks being introduced in the browser where the split segments are recombined
The first issue, ffmpeg splitting/encoding:
We package all audio twice (once for Safari/iOS devices, and once for every other browser). As far as I remember, we've usually seen better quality with the generic version. For the generic version we use ffmpeg's built-in DASH packaging which should correctly encode then split. For the Safari version we use ffmpeg's "segment" output format where I'm not sure in which order the operations are performed.
Below is the encoding command we use (the example being a continuous sine tone for 30 seconds). The only thing on top of ffmpeg is that Audio Orchestrator first splits each track around long silent gaps and throws the gaps away (by running this with different -ss and -t parameters for each non-silent item). It will also replace the manifest generated by ffmpeg with a simpler one that our playback library understands.
I can concatenate the resulting init segment with the m4s chunks and get back a seamless track, but I can't easily do the same with the Safari .m4a segments, I suspect because the timing information in the m4a headers is not correct. This might need further investigation, but the fact that the clicks also appear in non-Safari/iOS browsers makes me think there's something else at play.
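The concatenation check is just a byte-level join of the init segment followed by the media segments in order. A minimal sketch, where `concatSegments` is a hypothetical helper and not part of Audio Orchestrator or bbcat-js:

```typescript
// Hypothetical helper: join an init segment and its media segments, in order,
// into one contiguous buffer that can be written to disk or passed to a decoder.
function concatSegments(init: Uint8Array, segments: Uint8Array[]): Uint8Array {
  const total = segments.reduce((n, s) => n + s.byteLength, init.byteLength);
  const out = new Uint8Array(total);
  out.set(init, 0);
  let offset = init.byteLength;
  for (const segment of segments) {
    out.set(segment, offset);
    offset += segment.byteLength;
  }
  return out;
}
```

In Node, the inputs could come from `fs.readFileSync` on the packaged files, and the result could be written out and auditioned in any player that handles fragmented MP4.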
The second issue, playback in the browser:
Audio Orchestrator is using the WebAudio API to emulate DASH playback, because it is based on an old internal audio toolkit library (bbcat-js) that was written before we had widespread support for Media Source Extensions as a more reliable way of playing back DASH streams. The source code for the DASH source nodes is here: https://github.com/bbc/audio-orchestration/tree/main/packages/bbcat-js/src/dash/dash-source-node
I'm unfortunately not very familiar with this code, but I think it generally works by combining the binary data for adjacent segments and decoding them, then scheduling playback of those buffers on the WebAudio timeline, which is where browser-dependent inaccuracies might be introduced. A few years ago there were some security changes in browsers to limit access to very accurate timers that could be exploited for fingerprinting and side-channel timing attacks, and I wonder if this might have had an impact here as well.
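To illustrate the kind of scheduling involved (this is not taken from bbcat-js, and all names here are hypothetical), the sketch below derives each segment's start time from a running count of decoded frames rather than from wall-clock timers or accumulated floating-point durations. It assumes the decoded frame count of every segment is known.

```typescript
// Hypothetical types and function, for illustration only.
interface ScheduledSegment {
  index: number;
  startTime: number; // seconds on the AudioContext timeline
  duration: number;  // seconds
}

// Compute gapless start times from decoded frame counts, so that each segment
// begins exactly where the previous one ends.
function scheduleSegments(
  frameCounts: number[],    // decoded frames per segment (assumed known)
  sampleRate: number,       // frames per second
  contextStartTime: number, // AudioContext time at which the first segment starts
): ScheduledSegment[] {
  let framesSoFar = 0;
  return frameCounts.map((frames, index) => {
    const entry = {
      index,
      startTime: contextStartTime + framesSoFar / sampleRate,
      duration: frames / sampleRate,
    };
    framesSoFar += frames;
    return entry;
  });
}
```

In the browser, each `startTime` would be passed to `AudioBufferSourceNode.start()`. If the decoded buffers are longer or shorter than the nominal segment duration (for example because of AAC encoder priming or padding samples), scheduling by nominal duration instead would leave a small overlap or gap at every boundary, which could sound like a regular click.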
As an alternative to splitting into short segments for DASH-like playback, we could download the entire audio for each item upfront. Audio Orchestrator will do this for short items (under 10 seconds of audio separated from other items by at least 1 second of silence). Tuning these parameters to take that approach for everything might bypass the splitting and re-assembly issues. However, it might cause other problems, such as a longer download / decode delay before the item can play, which would lead to the beginning of an item being missed if not scheduled long enough in advance.
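The short-item rule described above could be expressed roughly as follows. This is my reading of the rule, not the actual Audio Orchestrator code: the names `AudioItem`, `BufferOptions`, and `canDownloadWhole` are hypothetical, and the defaults simply mirror the 10 second and 1 second figures.

```typescript
// Hypothetical types, for illustration only. Times are in seconds.
interface AudioItem {
  start: number;
  duration: number;
}

interface BufferOptions {
  maxDuration: number; // items shorter than this may be downloaded whole
  minGap: number;      // required silence before and after the item
}

const defaults: BufferOptions = { maxDuration: 10, minGap: 1 };

// Decide whether the item at `index` (in a list sorted by start time) could be
// downloaded in one piece instead of being split into DASH-like segments.
function canDownloadWhole(
  items: AudioItem[],
  index: number,
  opts: BufferOptions = defaults,
): boolean {
  const item = items[index];
  if (item === undefined || item.duration >= opts.maxDuration) return false;
  const prev = items[index - 1];
  const next = items[index + 1];
  const gapBefore = prev === undefined ? Infinity : item.start - (prev.start + prev.duration);
  const gapAfter = next === undefined ? Infinity : next.start - (item.start + item.duration);
  return gapBefore >= opts.minGap && gapAfter >= opts.minGap;
}
```

Passing something like `{ maxDuration: Infinity, minGap: 0 }` would apply the whole-file approach to every item, at the cost of the longer download and decode delay mentioned above.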