[CRIU] [PATCH 2/2] Make lacking cgroup properties non-fatal

Thu Aug 14 07:15:10 PDT 2014

On 08/14/2014 06:01 PM, Serge Hallyn wrote:
> Quoting Pavel Emelyanov (xemul at parallels.com):
>> On 08/14/2014 01:46 AM, Garrison Bellack wrote:
>>> On Wed, Aug 13, 2014 at 1:55 PM, Tycho Andersen <tycho.andersen at canonical.com <mailto:tycho.andersen at canonical.com>> wrote:
>>>
>>>     On Wed, Aug 13, 2014 at 01:32:01PM -0700, Garrison Bellack wrote:
>>>     > On Wed, Aug 13, 2014 at 12:43 PM, Andrew Vagin <avagin at parallels.com <mailto:avagin at parallels.com>> wrote:
>>>     >
>>>     > > On Wed, Aug 13, 2014 at 11:59:33AM -0700, gbellack at google.com <mailto:gbellack at google.com> wrote:
>>>     > > > From: Garrison Bellack <gbellack at google.com <mailto:gbellack at google.com>>
>>>     > > >
>>>     > > > Because different kernel versions have different cgroup properties, criu
>>>     > > > shouldn't crash just because the properties statically listed aren't
>>>     > > exact.
>>>     > > > Instead, during dump, ignore properties the kernel doesn't have and
>>>     > > continue.
>>>     > > > In addition, during restore, in the event of migration to a kernel that
>>>     > > is
>>>     > > > missing a property that was dumped, print an error but continue.
>>>     > >
>>>     > > I am not sure that we can continue in this case. Why do you think that
>>>     > > it's safe? I think we need to ask a user about this. So can we add a new
>>>     > > option to allow continuing in this case?
>>>     >
>>>     >
>>>     > I'm a little confused about your concerns of safety. Are you referring to
>>>     > missing properties on the dump or restore side?
>>>
>>>     I think restore side. I also wondered the same thing -- it seems that
>>>     missing properties on restore should be an error (perhaps as Andrew
>>>     says with an --allow-missing-cg-props flag on the restore side to
>>>     allow you to override this error).
>>>
>>>
>>> Here is the reasoning behind this decision. Let say you are migrating properties A,B,C,D where B doesn't exist.
>>>
>>> This is how it currently works:
>>> A will be restored. B prints an error and causes failure because it is missing. However, because cgroup property restoration is the last thing to happen during restore, and because of designs further up the call chain, the restoration of the process still goes through. C,D are not restored because of previous criu failure.
>>> End result -- Process and property A restored, C and D are not restored even though they exist on the new machine
>>>
>>> With this patch:
>>> A will be restored. B prints an error and continues. C and D are restored.
>>> End result --  Process and property A, C, D all restored.
>>>
>>> I think between these two options we clearly prefer the later behavior.
>>
>> Absolutely. AFAIU Andrew and Tycho meant, that there could be the 3rd behavior -- once B 
>> fails we abort all the restore. And they proposed an option for controlling this.
>>
>> I have a question about this -- what does LXC tool do if it meets a cgroup configuration
>> parameter in container's config, that is missing in the kernel?
> 
> Container start will fail.  The difference is that if there is a cgroup
> limit in the container's configuration file, then it was specifically asked
> for.  Ideally for a criu restart, we want the same behavior (overrideable),
> but we want limits which were not asked for to be ignored.

OK, so we need some fine-grained control over what to do with properties.
And since we've started talking about this. In OpenVZ kernel we have much
more cgroup parameters, than those currently listed in CRIU (e.g. kernel 
memory control). In the future we will likely have some of them upstream,
and definitely will have some completely new. Patching criu every time 
just to add new properties seems quite inconvenient. Especially, taking 
into account that this is just the list of strings.

Can we teach the get_known_properties() to accept lists from the caller
and use similar lists on restore to specify which of them can be ignored?

Thoughts?

> I.e. if I've checkpointed a nested container which has no cpuset specified,
> and the parent container had a cpuset limit specified (say cpu 0), then if I
> restart under another parent container which has a different cpuset (say
> cpu 1), then restart shouldn't fail, bc that limit was not inherent to
> the container I checkpointed.

Thanks,
Pavel