[CRIU] crash when restoring with current git master?
Tycho Andersen
tycho.andersen at canonical.com
Fri Apr 24 15:30:55 PDT 2015
On Fri, Apr 24, 2015 at 10:40:39PM +0300, Pavel Emelyanov wrote:
> On 04/24/2015 09:41 PM, Tycho Andersen wrote:
> > On Fri, Apr 24, 2015 at 09:15:46PM +0300, Cyrill Gorcunov wrote:
> >> On Fri, Apr 24, 2015 at 11:53:41AM -0600, Tycho Andersen wrote:
> >>> Hi all,
> >>>
> >>> When doing a c/r of lxc containers (vivid host, vivid container), I'm
> >>> getting a crash sometimes. It looks like something is dying in the
> >>> restorer blob:
> >>>
> >>> http://paste.ubuntu.com/10880188/
> >>>
> >>> Has anyone seen this? Any idea what's causing it?
> >>
> >> I haven't seen anything like this before. But something is definitely going wrong in cgroups, at least:
> >>
> >> | (00.084962) 333: Error (files-reg.c:1055): File sys/fs/cgroup/perf_event/lxc/u2-30/cgroup.procs has bad size 17 (expect 11)
> >>
> >> Did someone write another task there?
> >
> > Ah, whoops. I missed this somehow in the log. That's probably what's
> > happening. Note that criu hangs on this error; the attached patch
> > fixes the hang.
>
> > +
> > +err:
> > + futex_abort_and_wake(&task_entries->nr_in_progress);
> > + return -1;
> > }
>
> But this thing has never been here. Instead, when a child gets an error it
> exits and then the sigchld_handler() runs and does futex_abort_and_wake().
> Why hasn't this logic worked this time?
Good question. While trying to reproduce this with some debugging, I
got another hang, this time with no errors in the log:

http://paste.ubuntu.com/10881828/

I have to go for today, but I will try to reproduce this over the
weekend and get some more info.

Tycho
> -- Pavel