[CRIU] [PATCH] Dismantle cgyard in non-detached restore mode.

Tycho Andersen tycho.andersen at canonical.com
Wed Mar 18 13:49:21 PDT 2015


On Wed, Mar 18, 2015 at 11:47:28PM +0300, Pavel Emelyanov wrote:
> On 03/18/2015 11:39 PM, Tycho Andersen wrote:
> > Hi Pavel,
> > 
> > On Wed, Mar 18, 2015 at 11:17:29PM +0300, Pavel Emelyanov wrote:
> >> On 03/18/2015 10:01 PM, Tycho Andersen wrote:
> >>> Hi Saied,
> >>>
> >>> On Wed, Mar 18, 2015 at 11:14:31AM -0700, Saied Kazemi wrote:
> >>>> Tycho,
> >>>>
> >>>> Thanks for looking into this.  I knew that we can't fini_cgroup() here but
> >>>> because I am not familiar with the internals of cgroup restoration I just
> >>>> put it there as a quick hack to get criu successfully dump and restore my
> >>>> test container multiple times (didn't care about properties, etc.).
> >>>>
> >>>> For a permanent fix, I have to defer to your judgement on the best way to
> >>>> dismantle the cgyard in the non-detached mode.  I agree with you that it's
> >>>> a good idea to always fini_cgroup() as soon as we've fully restored
> >>>> cgroups, regardless of detached/non-detached mode.
> >>>
> >>> Ok, I think the right thing to do is something like this:
> >>>
> >>> https://github.com/tych0/criu/commit/63e138e7b4b1b183d864fb08923176c9cd9ba545
> >>>
> >>> I vaguely recall some discussion about why this wasn't possible
> >>> before, but I don't remember exactly what it was. 
> >>
> >> I can remember only one reason -- cgroup props should be restored at the very
> >> end not to make restoring processes hit potentially strict limits set for memory,
> >> cpu or anything else.
> > 
> > Oh, duh, of course.
> > 
> >> But other than this restoring them earlier than it's done now seems correct :)
> > 
> > The above patch restores the properties just before unlocking the
> > network. A comment around there says:
> > 
> > * Below this line nothing can fail, because network is unlocked
> > 
> > Which I guess is below this line nothing *should* fail? 
> 
> Yup :)
> 
> > Since I think this could potentially fail (e.g. due to kernel version mismatch
> > or hardware change as we've previously discussed), we have to do it above
> > there. However, this means that the network restore would be limited
> > by the cgroups.
> 
> No, the code in question in the criu context, not tasks', and this context
> isn't (shouldn't) be affected by restored cgroups properties. And by that
> time tasks are already in the RESTORE_SIGCHILD state, only creds restoration
> and timers will be done under cgroups constraints, but these are not
> affected by cgroups.
> 
> > Do you have any thoughts on where this should live?
> 
> I believe that the place in your patch _is_ the correct one. Unless someone
> tries and seed that it's not :)

Ok, cool. I will send it to the list once I've verified that it
actually works beyond compiling.

Tycho

> Thanks,
> Pavel
> 


More information about the CRIU mailing list