[Devel] Re: [PATCH 4/4] cgroup freezer: --- replacement patch 4/4 (a)
Oren Laadan
orenl at cs.columbia.edu
Thu Jun 4 09:32:33 PDT 2009
Matt Helsley wrote:
> On Wed, Jun 03, 2009 at 08:10:24PM -0400, Oren Laadan wrote:
>> From 6cdb7d9504a19ad88e6da8ad85374267c7acb1b2 Mon Sep 17 00:00:00 2001
>> From: Matt Helsley <matthltc at us.ibm.com>
>> Date: Wed, 3 Jun 2009 02:31:21 -0700
>> Subject: [PATCH] cgroup freezer: Add CHECKPOINTING state to safeguard container checkpoint
>>
>> The CHECKPOINTING state prevents userspace from unfreezing tasks until
>> sys_checkpoint() is finished. When doing container checkpoint userspace
>> will do:
>>
>> echo FROZEN > /cgroups/my_container/freezer.state
>> ...
>> rc = sys_checkpoint( <pid of container root> );
>>
>> To ensure a consistent checkpoint image userspace should not be allowed
>> to thaw the cgroup (echo THAWED > /cgroups/my_container/freezer.state)
>> during checkpoint.
>>
>> "CHECKPOINTING" can only be set on a "FROZEN" cgroup using the checkpoint
>> system call. Once in the "CHECKPOINTING" state, the cgroup may not leave until
>> the checkpoint system call is finished and ready to return. Then the
>> freezer state returns to "FROZEN". Writing any new state to freezer.state while
>> checkpointing will return EBUSY. These semantics ensure that userspace cannot
>> unfreeze the cgroup midway through the checkpoint system call.
>>
>> The cgroup_freezer_begin_checkpoint() and cgroup_freezer_end_checkpoint()
>> make relatively few assumptions about the task that is passed in. However the
>> way they are called in do_checkpoint() assumes that the root of the container
>> is in the same freezer cgroup as all the other tasks that will be
>> checkpointed.
>>
>> Matt Helsley's wrote the original patch.
>>
>> Changlog:
>> [2009-Jun-03] change cgroup_freezer_{begin,end}_checkpoint() to take a
>> struct cgroup_subsys_state pointer
>> add struct cgroup_subsys_state *get_task_cgroup_freezer()
>>
>> Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
>> Cc: Matt Helsley <matthltc at us.ibm.com>
>> Cc: Paul Menage <menage at google.com>
>> Cc: Li Zefan <lizf at cn.fujitsu.com>
>> Cc: Cedric Le Goater <legoater at free.fr>
>>
>> Notes:
>> As a side-effect this prevents the multiple tasks from entering the
>> CHECKPOINTING state simultaneously. All but one will get -EBUSY.
>> ---
>> Documentation/cgroups/freezer-subsystem.txt | 10 ++
>> include/linux/freezer.h | 10 ++
>> kernel/cgroup_freezer.c | 149 +++++++++++++++++++--------
>> 3 files changed, 127 insertions(+), 42 deletions(-)
>>
>> diff --git a/Documentation/cgroups/freezer-subsystem.txt b/Documentation/cgroups/freezer-subsystem.txt
>> index 41f37fe..92b68e6 100644
>> --- a/Documentation/cgroups/freezer-subsystem.txt
>> +++ b/Documentation/cgroups/freezer-subsystem.txt
>> @@ -100,3 +100,13 @@ things happens:
>> and returns EINVAL)
>> 3) The tasks that blocked the cgroup from entering the "FROZEN"
>> state disappear from the cgroup's set of tasks.
>> +
>> +When the cgroup freezer is used to guard container checkpoint operations the
>> +freezer.state may be "CHECKPOINTING". "CHECKPOINTING" can only be set on a
>> +"FROZEN" cgroup using the checkpoint system call. Once in the "CHECKPOINTING"
>> +state, the cgroup may not leave until the checkpoint system call returns the
>> +freezer state to "FROZEN". Writing any new state to freezer.state while
>> +checkpointing will return EBUSY. These semantics ensure that userspace cannot
>> +unfreeze the cgroup midway through the checkpoint system call. Note that,
>> +unlike "FROZEN" and "FREEZING", there is no corresponding "CHECKPOINTED"
>> +state.
>> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
>> index da7e52b..1402911 100644
>> --- a/include/linux/freezer.h
>> +++ b/include/linux/freezer.h
>> @@ -64,12 +64,22 @@ extern bool freeze_task(struct task_struct *p, bool sig_only);
>> extern void cancel_freezing(struct task_struct *p);
>>
>> #ifdef CONFIG_CGROUP_FREEZER
>> +struct cgroup_subsys_state;
>> extern int cgroup_freezing_or_frozen(struct task_struct *task);
>> +extern struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task);
>> +extern int cgroup_freezer_begin_checkpoint(struct cgroup_subsys_state *css);
>> +extern void cgroup_freezer_end_checkpoint(struct cgroup_subsys_state *css);
>> #else /* !CONFIG_CGROUP_FREEZER */
>> static inline int cgroup_freezing_or_frozen(struct task_struct *task)
>> {
>> return 0;
>> }
>> +static inline int cgroup_freezer_begin_checkpoint(struct task_struct *task)
>> +{
>> + return -ENOTSUPP;
>> +}
>> +static inline void cgroup_freezer_end_checkpoint(struct task_struct *task)
>> +{}
>
> With CONFIG_CHECKPOINT depending on CONFIG_FREEZER I don't see why these
> empty definitions are needed. Maybe it's just too late in the evening...
>
>> #endif /* !CONFIG_CGROUP_FREEZER */
>>
>> /*
>> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
>> index 05795b7..4790fb9 100644
>> --- a/kernel/cgroup_freezer.c
>> +++ b/kernel/cgroup_freezer.c
>> @@ -25,6 +25,7 @@ enum freezer_state {
>> CGROUP_THAWED = 0,
>> CGROUP_FREEZING,
>> CGROUP_FROZEN,
>> + CGROUP_CHECKPOINTING,
>> };
>>
>> struct freezer {
>> @@ -47,6 +48,18 @@ static inline struct freezer *task_freezer(struct task_struct *task)
>> struct freezer, css);
>> }
>>
>> +struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task)
>> +{
>> + struct cgroup_subsys_state *css;
>> +
>> + task_lock(task);
>> + css = task_subsys_state(task, freezer_subsys_id);
>> + css_get(css); /* make sure freezer doesn't go away */
>> + task_unlock(task);
>> +
>> + return css;
>> +}
>
> Seems like there should be a better way to do this than grabbing
> this reference. I'd prefer to just introduce:
>
> int in_same_cgroup_freezer(struct task_struct *task1,
> struct task_struct *task2);
>
> In the external checkpoint case the root task is frozen and hence cannot
> change cgroups.
Yeah, you and Serge are right. This reference is not well justified.
So we can revert this change and only add in_same_cgroup_freezer().
I'll change it now in ckpt-v16-dev. Are you going to resubmit the
patchset ? If you leave out the part that adds the calls to begin
and end a checkpoint, it would be relevant regardless of the c/r
patchset in question.
Thanks,
Oren.
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list