[CRIU] crash when restoring with current git master?

Tycho Andersen tycho.andersen at canonical.com
Fri Apr 24 15:30:55 PDT 2015


On Fri, Apr 24, 2015 at 10:40:39PM +0300, Pavel Emelyanov wrote:
> On 04/24/2015 09:41 PM, Tycho Andersen wrote:
> > On Fri, Apr 24, 2015 at 09:15:46PM +0300, Cyrill Gorcunov wrote:
> >> On Fri, Apr 24, 2015 at 11:53:41AM -0600, Tycho Andersen wrote:
> >>> Hi all,
> >>>
> >>> When doing a c/r of lxc containers (vivid host, vivid container), I'm
> >>> getting a crash sometimes. It looks like something is dying in the
> >>> restorer blob:
> >>>
> >>> http://paste.ubuntu.com/10880188/
> >>>
> >>> Has anyone seen this? Any idea what's causing it?
> >>
> >> I haven't seen this before, but something definitely goes wrong in cgroups at least:
> >>
> >>  | (00.084962)    333: Error (files-reg.c:1055): File sys/fs/cgroup/perf_event/lxc/u2-30/cgroup.procs has bad size 17 (expect 11)
> >>
> >> Did someone write another task there?
> > 
> > Ah, whoops. I missed this somehow in the log. That's probably what's
> > happening. Note that criu hangs on this error; the attached patch
> > fixes the hang.
> 
> > +
> > +err:
> > +	futex_abort_and_wake(&task_entries->nr_in_progress);
> > +	return -1;
> >  }
> 
> But this thing has never been here. Instead, when a child gets an error it
> exits, and then sigchld_handler() runs and does futex_abort_and_wake().
> Why hasn't this logic worked this time?

Good question. While trying to reproduce this with some debugging, I
got another hang, this time with no errors in the log:

http://paste.ubuntu.com/10881828/

I have to go for today, but I will try to reproduce this over the
weekend and get some more info.
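
For context, the error path Pavel describes above (a child hits an error
and exits, the SIGCHLD handler runs and wakes the waiter) can be sketched
roughly as below. This is a minimal illustration under stated assumptions,
not CRIU's code: restore_demo() and the shared nr_in_progress counter are
hypothetical stand-ins, and sigsuspend() is used in place of CRIU's futex
helpers such as futex_abort_and_wake().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the handler when a child dies abnormally (stand-in for the abort). */
static volatile sig_atomic_t aborted;

static void sigchld_handler(int sig)
{
	int status;

	(void)sig;
	/* Reap every exited child; a non-zero exit aborts the wait. */
	while (waitpid(-1, &status, WNOHANG) > 0)
		if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
			aborted = 1;
}

/*
 * Hypothetical demo: fork one "restored task" and wait for it.
 * Returns 0 if the child finished its stage, -1 if the wait was
 * aborted by the SIGCHLD handler, -2 on setup failure.
 */
int restore_demo(int child_fails)
{
	struct sigaction sa = { 0 }, old_sa;
	sigset_t block, old_mask, wait_mask;
	int *nr_in_progress, ret;
	pid_t pid;

	/* Shared counter, loosely modelled on task_entries->nr_in_progress. */
	nr_in_progress = mmap(NULL, sizeof(*nr_in_progress),
			      PROT_READ | PROT_WRITE,
			      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (nr_in_progress == MAP_FAILED)
		return -2;
	*nr_in_progress = 1;
	aborted = 0;

	sa.sa_handler = sigchld_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGCHLD, &sa, &old_sa);

	/* Block SIGCHLD so it can only arrive inside sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &old_mask);
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGCHLD);

	pid = fork();
	if (pid < 0) {
		ret = -2;
		goto out;
	}
	if (pid == 0) {
		if (child_fails)
			_exit(1);	/* error: exit without finishing the stage */
		__atomic_store_n(nr_in_progress, 0, __ATOMIC_SEQ_CST);
		_exit(0);
	}

	/* Wait until the stage completes or the handler aborts the wait. */
	while (!aborted &&
	       __atomic_load_n(nr_in_progress, __ATOMIC_SEQ_CST) > 0)
		sigsuspend(&wait_mask);

	/* The loop ends only on an abort or on stage completion. */
	assert(aborted ||
	       __atomic_load_n(nr_in_progress, __ATOMIC_SEQ_CST) == 0);
	ret = aborted ? -1 : 0;

	/* Reap the child if the handler has not done so already. */
	waitpid(pid, NULL, 0);
out:
	sigaction(SIGCHLD, &old_sa, NULL);
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	munmap(nr_in_progress, sizeof(*nr_in_progress));
	return ret;
}
```

In this sketch restore_demo(1) models the failing child and restore_demo(0)
the normal case. If the handler never ran, or never set the abort flag, the
waiter would sleep forever, which is the kind of hang discussed here.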

Tycho

> -- Pavel

