10 votes

The SDL3 Audio Subsystem

3 comments

  1. tjf

    SDL developer Ryan C. Gordon (aka Icculus) gives a brief overview of the new audio subsystem in the upcoming SDL3.

    4 votes
  2. Moonchild
    Frankly speaking, although I appreciate that they've added fuller audio graph management, this seems to me like somewhat poorer layering and abstraction compared to what gorilla audio does (both in its initial incarnation and what I've added)—of course, there are a number of things that I think gorilla could do better at or misses, but I think it compares favourably, even in its current state.

    The kind of low-level interface he initially describes sdl2 as having—with no mixing done automatically—is definitely something you want; not even necessarily to avoid the latency added by the mix—since you will have to do some sort of mixing anyway—but in order to have control over and knowledge of the latency characteristics (suppose you make live music and your principal goal is to keep latency consistent rather than minimise it). But sdl2 doesn't actually let you know anything about how much latency you have to contend with. Gorilla has direct accommodations in that direction (note in particular the subtle difference between SyncPipe and SyncShared). One thing it could do better, though, is to be more granular—currently it discretises to 'buffers', which is not super nice.

    Push-style streams are trivial to layer over callback-based pull streams (I haven't done it in gorilla, but it would probably be a 50-line patch), but the issue is that without lightweight threads or coroutines (which allow for more effective abstraction of control), backpressure/queue sizing has to be managed in scattershot fashion; pull-style lets that be centralised (one facet of which gorilla's buffered streams handle nicely and simply).

    (Hence the comment 'if you are not ready for it when it comes and you can't generate enough sound you are gonna get ... some sort of terrible thing' rings a bit hollow to me—it's not that the old interface was deficient in this respect; it's just that it was low level, so you had to implement the queueing yourself, and now they've added higher-level abstractions. Which is definitely valuable!—but a change to the interface doesn't fundamentally solve or change the problems of latency and queueing.)

    One thing that I think is flawed about the current gorilla design is that it handles format conversions automatically at playback time. Better and more modular would be to have an explicit adapter node. SDL's design is even worse, though, shoving conversions directly into the core audio stream abstraction. (Similarly, forking—which is not currently supported in gorilla, but which would again be a 50-line patch and trivially include user-controllable queue parameters—should be explicit, just like mixing.) SDL3 putatively applies jukebox parameters at mix time rather than earlier, just like gorilla, avoiding spurious added latency. (I think the missing link for them wrt the latter issue is probably the lack of a 'handle' abstraction.)

    SDL does seem to have gotten the queueing management right for a fancy sinc FIR resampler. I was going to use libsrc for this, but it lacks the requisite controls, so I am going to have to implement that myself. On the other hand, I have some extra queueing functionality exposed. And my design admits user-controllable resampling methods, which is significant, as a sinc FIR is obligatorily going to need more queueing and add more latency than, say, a linear interpolator (which only needs to queue a single frame and still offers reasonable quality for a lot of applications)—I hear users having knowledge of and control over latency/queueing is a thing :)

    Both gorilla and sdl3 have zero-copy negotiation at the device level, but not at the source level; hence, there can be spurious copies (ignore the mutex shenanigans; it was that way when I got it and I haven't gotten around to fixing it yet).

    I should also mention that, although I've talked a lot about functionality, gorilla remains easy to use for simple applications—this is again an issue of good abstraction and layering. See a simple example of a tone rendered using a high-level api and a mixer, the same tone rendered directly to the device, and an audio player—all are really very simple.

    3 votes
  3. text_garden
    To me this seems a bit like the audio queues in SDL2 with the addition of the logical device abstraction, but I've always opted to use the callback, so I don't know the pains of audio queues. Is there much more to it?

    1 vote