[L2Ork-dev] Purr Data: File / Print... PDF

Albert Graef aggraef at gmail.com
Sat Oct 24 06:46:46 EDT 2020


On Sat, Oct 24, 2020 at 3:32 AM Jonathan Wilkes <jon.w.wilkes at gmail.com>
wrote:

> What I'm talking about is the backend-- the audio engine and message
> dispatcher-- taking 2 or 3 minutes to process some data. I believe
> Joseph even hit this bug with the exponential list-walking explosion
> when he was trying to drag a large selection of objects.
>
> In those cases, pd-watchdog just keeps printing out messages until the
> backend finishes whatever brute-force task it was working on.
> Pd-watchdog will never bail in those cases.
>

Right, that's funny. But I'd say the real issue is somewhere in the
engine's idle processing. As long as the *engine* is busy, it just won't
listen to those SIGHUPs, so the watchdog keeps sending them and printing
the "signaling pd" messages along with them.
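For illustration only, the behaviour described above can be modelled as a small loop. This is a toy sketch, not the pd-watchdog source: the function name `watchdog_step`, the return values and the `signal_engine` callback (standing in for sending SIGHUP to the engine) are all made up for this example, and it assumes a Unix-like system where select() works on pipes.

```python
import os
import select


def watchdog_step(read_fd, timeout, signal_engine):
    """One iteration of a toy watchdog loop (hypothetical names).

    Returns "fed" if something arrived on the pipe in time, "signaled" if
    the timeout expired and the engine was poked, "exit" if the pipe closed.
    """
    readable, _, _ = select.select([read_fd], [], [], timeout)
    if not readable:
        # Nothing arrived in time: poke the engine and say so.
        signal_engine()
        print("watchdog: signaling pd...")
        return "signaled"
    if not os.read(read_fd, 100):
        # End of file: the writing side went away, so stop watching.
        return "exit"
    return "fed"


if __name__ == "__main__":
    r, w = os.pipe()
    pokes = []
    os.write(w, b"\n")                     # the engine feeds the watchdog once
    for _ in range(3):
        print(watchdog_step(r, 0.1, lambda: pokes.append("SIGHUP")))
    print(len(pokes), "signals sent")
```

In this model the loop never gives up on its own; it only keeps signaling for as long as its pipe stays silent, which matches the endless "signaling pd" output seen while the engine is busy.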

But if the *GUI* is busy, the engine keeps chugging along doing its usual
idle processing, which apparently involves reading some sockets via the
recv() call. I have no idea where and why exactly that happens, but
apparently that recv() call *will* listen to the SIGHUP, and be upset
enough about it that it crashes the process, because the SIGHUP isn't
handled anywhere. The gdb backtrace seems to indicate that the select()
call in sys_domicrosleep() is the culprit, but I suspect that it's actually
related to the audio backend in some way, because this crash does or
doesn't happen depending on which audio backend is being used. (That
discrepancy might also be due to the part of Jack which runs in the client
taking care of masking those SIGHUPs, but I can't find anything in the
audio backend code of Pd itself which explains the differences between
backends there.)
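As general POSIX background, and not as a reproduction of the Pd crash: a process that leaves SIGHUP at its default disposition is terminated when the signal is delivered, while a process with a handler installed survives it. The sketch below shows this with a child process that blocks in select(). The helper name `exit_status_after_sighup` and the child script are made up for this example, and it assumes a Unix-like system. One caveat: with a handler installed, Python retries the interrupted select() automatically, whereas a C program would see select() fail with EINTR.

```python
import os
import signal
import subprocess
import sys

# Child program: optionally installs a do-nothing SIGHUP handler, reports
# readiness, then blocks in select() for half a second.
CHILD = """
import select, signal, sys
if sys.argv[1] == "handled":
    signal.signal(signal.SIGHUP, lambda signum, frame: None)
else:
    signal.signal(signal.SIGHUP, signal.SIG_DFL)
print("ready", flush=True)
select.select([], [], [], 0.5)
"""


def exit_status_after_sighup(mode):
    """Start the child, send it one SIGHUP, and return its exit status."""
    with subprocess.Popen([sys.executable, "-c", CHILD, mode],
                          stdout=subprocess.PIPE, text=True) as proc:
        proc.stdout.readline()             # wait until the child is set up
        os.kill(proc.pid, signal.SIGHUP)
        return proc.wait()


if __name__ == "__main__":
    print("unhandled:", exit_status_after_sighup("unhandled"))
    print("handled:", exit_status_after_sighup("handled"))
```

The subprocess module reports a child killed by signal N as exit status -N, so the unhandled case yields a negative status and the handled case a normal exit.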

Anyway, I was able to solve this by adding a global method which the GUI
can use to mark sections during which it is busy. While such a section is
active, the engine keeps the watchdog happy by itself, as if running
GUI-less. I added those calls to the new print dialog as well as the
message operation and the make_index() call which constructs the help index
when bringing up the help browser. I *think* that this should cover all
cases where this crash may happen.
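The idea can be sketched as a toy model. This is not the actual patch, which is not shown in this message: the class `Engine`, the method `set_gui_busy` and the event names in the timeline are all hypothetical, and the model only assumes what is described above, namely that the engine feeds the watchdog itself while the GUI is marked busy, as it does when running GUI-less.

```python
class Engine:
    """Toy model of the engine's side of the watchdog protocol."""

    def __init__(self, nogui=False):
        self.nogui = nogui
        self.gui_busy = False
        self.watchdog_writes = 0

    def set_gui_busy(self, flag):
        # Hypothetical stand-in for the global method called by the GUI.
        self.gui_busy = bool(flag)

    def feed_watchdog(self):
        self.watchdog_writes += 1

    def on_gui_ping(self):
        # Normal case: the GUI is responsive and its ping is passed on.
        self.feed_watchdog()

    def idle_tick(self):
        # With no GUI, or a GUI marked busy, the engine feeds the watchdog.
        if self.nogui or self.gui_busy:
            self.feed_watchdog()


def unfed_ticks(engine, timeline):
    """Count ticks during which nothing reached the watchdog."""
    unfed = 0
    for event in timeline:
        if event == "busy on":
            engine.set_gui_busy(True)
            continue
        if event == "busy off":
            engine.set_gui_busy(False)
            continue
        before = engine.watchdog_writes
        if event == "ping":
            engine.on_gui_ping()
        # Any other event (e.g. "blocked") means the GUI sent nothing.
        engine.idle_tick()
        if engine.watchdog_writes == before:
            unfed += 1
    return unfed


if __name__ == "__main__":
    blocked = ["ping", "blocked", "blocked", "blocked", "ping"]
    marked = ["ping", "busy on", "blocked", "blocked", "blocked",
              "busy off", "ping"]
    print(unfed_ticks(Engine(), blocked))
    print(unfed_ticks(Engine(), marked))
```

In the unmarked timeline the watchdog goes unfed for every tick the GUI spends in a blocking callback; bracketing the same callback with the busy markers leaves no unfed ticks.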

I already tested that on Linux and am now testing on Mac and Windows to
make sure that the new mechanism doesn't break anything there, so expect a
merge request soon. (I also left the sys_domicrosleep() backport from
vanilla in there, because I think that it's good to have those fixes as
well, even though they didn't help with the issue at hand.)

Albert


> >
> >> Also, given various use cases with long-running number cruncher
> >> patches, watchdog *cannot* bail in any of those cases without solving
> >> the halting problem.
> >
> >
> > It doesn't try to solve the halting problem. It's a very dumb little
> > routine which just loops waiting for its socket to dry up. ;-)
> >
> > I guess that the watchdog is just fine as long as the gui is still alive
> > and kicking, no matter what the engine does.
>
> Hm... I'm stumped then. Because I know for one of the problems Joseph
> reported, the backend is stuck walking lists in a blocking call that
> will keep chugging away for minutes before letting the gui routine do
> any socket business. I don't know why this wouldn't cause pd-watchdog
> to bail, yet a few seconds of a javascript blocking call would.
>
> >
> > About realtime priorities: Yes, there have been improvements in the API,
> > but Pd doesn't use them AFAICT.
> >
> >> So I'm just not sure what its purpose is.
> >
> >
> > I guess that it's a left-over from the bad old days when we all had
> > single core computers. At that time it was fairly easy to completely lock
> > up the system with a runaway realtime process. Nowadays, this has become
> > much harder, usually you can still kill that process.
>
> Ah, right.
>
> -Jonathan
>
> >
> > Albert
> >
> >>
> >> -Jonathan
> >>
> >> On Fri, Oct 23, 2020 at 5:20 PM Jonathan Wilkes <jon.w.wilkes at gmail.com>
> >> wrote:
> >> >
> >> > On Fri, Oct 23, 2020 at 4:55 PM Albert Graef <aggraef at gmail.com>
> >> > wrote:
> >> > >
> >> > > On Fri, Oct 23, 2020 at 10:03 PM Jonathan Wilkes
> >> > > <jon.w.wilkes at gmail.com> wrote:
> >> > >>
> >> > >> > I see that there are some fairly recent code changes in
> >> > >> > sys_domicrosleep() in vanilla, though. I'll port those and see
> >> > >> > whether that helps.
> >> > >>
> >> > >> I don't think they will help.
> >> > >
> >> > >
> >> > > You're right, backporting those didn't help.
> >> > >
> >> > >> The problem is that if pd-watchdog
> >> > >> doesn't receive an acknowledgement it will shut down the backend.
> >> > >
> >> > >
> >> > > Yep, that's it. If I modify the watchdog so that it never bails out
> >> > > then both the message and the print dialog work alright. Of course
> >> > > that's not the proper solution ;-), but it proves that the watchdog
> >> > > is the culprit here. So we just have to find a way to keep it happy,
> >> > > see below.
> >> >
> >> > I'm curious-- is there any extant user report of watchdog saving them
> >> > from having to hard reboot their machine (which I guess is what it's
> >> > there for)?
> >> >
> >> > The thing is, this is the first case I've heard of where the GUI has
> >> > caused a hangup. And it's a false positive because the blocking call
> >> > was triggered by the user purposely and is temporary.
> >> >
> >> > On the other hand, I've seen my backend lock down the CPU *all the
> >> > time*. Exponential explosion is the name of the game all through
> >> > g_editor.c, and there are various other freezes from external libs and
> >> > bugs. I have *never* seen watchdog bail in that case-- it just keeps
> >> > spitting out "watchdog: signaling pd..." until I ctrl-c out of the
> >> > terminal.
> >> >
> >> > Also-- hasn't Linux realtime priority processing improved
> >> > significantly over the past decade to make it less likely to lock up
> >> > the machine? (I've only heard in general about improvements, so I'm
> >> > not sure about this.)
> >> >
> >> > >
> >> > >> We can change the "message" dialog to something that doesn't block
> >> > >> the js engine. I'm not sure about the print dialog, though.
> >> > >
> >> > >
> >> > > Probably not much that we can do about the print dialog. And there
> >> > > might be other cases where things may go awry. Search index
> >> > > generation comes to mind -- we might just have been lucky there
> >> > > because this is a very quick operation on Linux, but it might give
> >> > > the same issue on slower computers.
> >> > >
> >> > > This really calls for a general solution where the gui informs the
> >> > > engine that it should keep the watchdog happy while it's executing a
> >> > > callback which may run for a while, and then inform it again when
> >> > > that callback is finished. That shouldn't be too hard to do, the
> >> > > engine already does this anyway when running gui-less. I'm working
> >> > > on it.
> >> >
> >> > I'm curious what happens if we get rid of watchdog and run Purr Data
> >> > with realtime priorities on a current distro... :)
> >> >
> >> > -Jonathan
> >> >
> >> > >
> >> > > Albert
> >> > >
> >> > >
> >> > >>
> >> > >> -Jonathan
> >> > >>
> >> > >> >
> >> > >> > Albert
> >> > >> >
> >> > >> >>
> >> > >> >> -Jonathan
> >> > >> >> _______________________________________________
> >> > >> >> L2Ork-dev mailing list
> >> > >> >> L2Ork-dev at disis.music.vt.edu
> >> > >> >> https://disis.music.vt.edu/listinfo/l2ork-dev
> >> > >> >
> >> > >> >
> >> > >> >
> >> > >> > --
> >> > >> > Dr. Albert Gräf
> >> > >> > Computer Music Research Group, JGU Mainz, Germany
> >> > >> > Email: aggraef at gmail.com, web: https://agraef.github.io/



-- 
Dr. Albert Gräf
Computer Music Research Group, JGU Mainz, Germany
Email: aggraef at gmail.com, web: https://agraef.github.io/