[L2Ork-dev] damage-control scheduler

Thu May 21 01:10:20 EDT 2020

Hi all,

I just noticed that Zoom appears to have some kind of error-recovery
for dropouts. Maybe others have heard stuff like it but I've never
seen/heard anything like it before.

Suppose you give Pd a 1-second math puzzle to compute during audio
computation. It sits there chugging away for 1 second. Then it
advances time in a loop to compute every single block-- from the
moment the rude computation happened to the present moment-- and
finally outputs the present block to the audio subsystem.

In other words, it squashes 1 second of block computations into
almost no real time, even though the scheduler internally updates the
timestamps for each block so that all the math in the patch works out
EXACTLY the same as if the dropout had never happened. Yay for
determinism!

The problem is that for that 1 second of work, only the most recent
block got sent to the audio subsystem. So the initial dropout is
compounded by a Greek tragedy-- software built to deliver realtime
audio does all that work to catch back up to reality, yet almost none
of that work was piped to the one place where it should have gone--
the listener's ears. (Unless of course you're recording it to the
filesystem, in which case it all arrives in the correct order,
uncorrupted. But let's ignore that for now.)
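
To make the catch-up behavior concrete, here is a toy model of it in
Python. It is only a sketch of the behavior as described above, not
Pd's actual scheduler code. The names (catch_up, BLOCK, SR) and the
44100 Hz sample rate are assumptions made for illustration; 64 is Pd's
default block size.

```python
BLOCK = 64      # samples per block (Pd's default block size)
SR = 44100      # sample rate in Hz (assumed for illustration)

def catch_up(logical_block, stall_seconds):
    """Toy model of the catch-up loop.

    Returns (new_logical_block, blocks_computed, blocks_sent).
    """
    # Whole blocks that fell due while the blocking computation ran.
    missed = int(stall_seconds * SR) // BLOCK
    computed = []
    for _ in range(missed):
        logical_block += 1              # logical time advances block by block
        computed.append(logical_block)  # stand-in for running the DSP graph
    # Per the description above, only the newest block reaches the
    # audio subsystem; the rest are computed and then discarded.
    sent = computed[-1:]
    return logical_block, computed, sent

now, computed, sent = catch_up(logical_block=0, stall_seconds=1.0)
print(len(computed), "blocks computed,", len(sent), "sent")
# prints: 689 blocks computed, 1 sent
```

So in this model a 1-second stall costs 689 block computations, and
688 of them never reach the listener.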

As I heard it, Zoom does two things to counter this:

1. In a 1-second dropout, it recovers some or all of the data from the
remote host and stores it in a local buffer (which implies a buffer
storing some amount of data from the past)
2. When the realtime connection resumes, it doesn't immediately jump
to the present moment. Instead, it plays back the recovered data at a
speed greater than 1x until it finally catches up with the present
moment.
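
How long the catch-up in step 2 takes follows from simple arithmetic.
While the backlog plays for t seconds at a given speed, another t
seconds of live audio arrive, so speed * t = backlog + t. The sketch
below just solves that for t. The function name and the example speeds
are my own assumptions; I have no idea what rate Zoom actually uses.

```python
def catch_up_time(backlog_s, speed):
    """Wall-clock seconds of sped-up playback needed to reach the present."""
    if speed <= 1.0:
        raise ValueError("speed must exceed 1x to ever catch up")
    # While playing for t seconds, t more seconds of audio arrive:
    #   speed * t = backlog_s + t   =>   t = backlog_s / (speed - 1)
    return backlog_s / (speed - 1.0)

print(catch_up_time(1.0, 1.25))  # 4.0
print(catch_up_time(1.0, 2.0))   # 1.0
```

So a 1-second dropout played back at 1.25x is absorbed in 4 seconds,
and at 2x in 1 second.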

It's pretty clever. While it of course doesn't prevent dropouts, it
resumes from the content that immediately preceded the dropout. So the
listener receives almost everything from that 1 second, whereas the Pd
user receives almost nothing. Even better, a dangling transient from a
dropout is followed by silence, leaving space for the listener to
remember that sound and connect it with the resumed audio. It's easy
to recombine the two mentally and make sense of them, which cuts down
immensely on the number of times you must ask someone to repeat
something due to dropouts.

The thing is-- that seems like a gargantuan engineering task for a
realtime audio/video app running over an untrusted/unreliable network,
serving multiple nodes.

But in Pd during a dropout, there's not even a buffer necessary to
store data. All the data is just sitting there in the future, ready to
be computed in sequence when the rude blocking computation finally
returns.

So-- could the scheduler be changed to resume from a dropout over a
longer period of time than zero, by interpolating among that 1 second
of lost blocks to output an accelerated version of them until the
engine catches back up with the present?
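
As a rough sketch of what "interpolating among the lost blocks" could
mean, the Python below reads through the backlog faster than 1x using
linear interpolation. This is only one possible approach, not anything
Pd does today, and the function name is hypothetical. Note that plain
resampling like this raises the pitch and has no anti-aliasing filter;
a pitch-preserving time-stretch would be a different (and harder)
technique.

```python
def accelerate(samples, speed):
    """Resample by linear interpolation, reading `speed` input
    samples per output sample (speed > 1 shortens the audio)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    out = []
    pos = 0.0                       # fractional read position in the input
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]   # clamp at the final sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += speed
    return out

# The concatenated backlog blocks would be passed in as one list.
ramp = [float(n) for n in range(9)]
print(accelerate(ramp, 2.0))   # [0.0, 2.0, 4.0, 6.0, 8.0]
print(accelerate(ramp, 1.5))   # [0.0, 1.5, 3.0, 4.5, 6.0, 7.5]
```

The output would then be cut back into blocks and sent to the audio
subsystem at the normal rate until the backlog is used up.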

-Jonathan