[Devel] Re: [PATCH 4/4] cgroup freezer: --- replacement patch 4/4 (a)
Matt Helsley
matthltc at us.ibm.com
Thu Jun 4 03:34:18 PDT 2009
On Wed, Jun 03, 2009 at 08:10:24PM -0400, Oren Laadan wrote:
>
> From 6cdb7d9504a19ad88e6da8ad85374267c7acb1b2 Mon Sep 17 00:00:00 2001
> From: Matt Helsley <matthltc at us.ibm.com>
> Date: Wed, 3 Jun 2009 02:31:21 -0700
> Subject: [PATCH] cgroup freezer: Add CHECKPOINTING state to safeguard container checkpoint
>
> The CHECKPOINTING state prevents userspace from unfreezing tasks until
> sys_checkpoint() is finished. When doing container checkpoint userspace
> will do:
>
> echo FROZEN > /cgroups/my_container/freezer.state
> ...
> rc = sys_checkpoint( <pid of container root> );
>
> To ensure a consistent checkpoint image userspace should not be allowed
> to thaw the cgroup (echo THAWED > /cgroups/my_container/freezer.state)
> during checkpoint.
>
> "CHECKPOINTING" can only be set on a "FROZEN" cgroup using the checkpoint
> system call. Once in the "CHECKPOINTING" state, the cgroup may not leave until
> the checkpoint system call is finished and ready to return. Then the
> freezer state returns to "FROZEN". Writing any new state to freezer.state while
> checkpointing will return EBUSY. These semantics ensure that userspace cannot
> unfreeze the cgroup midway through the checkpoint system call.
>
> The cgroup_freezer_begin_checkpoint() and cgroup_freezer_end_checkpoint()
> make relatively few assumptions about the task that is passed in. However the
> way they are called in do_checkpoint() assumes that the root of the container
> is in the same freezer cgroup as all the other tasks that will be
> checkpointed.
>
> Matt Helsley's wrote the original patch.
>
> Changlog:
> [2009-Jun-03] change cgroup_freezer_{begin,end}_checkpoint() to take a
> struct cgroup_subsys_state pointer
> add struct cgroup_subsys_state *get_task_cgroup_freezer()
>
> Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
> Cc: Matt Helsley <matthltc at us.ibm.com>
> Cc: Paul Menage <menage at google.com>
> Cc: Li Zefan <lizf at cn.fujitsu.com>
> Cc: Cedric Le Goater <legoater at free.fr>
>
> Notes:
> As a side-effect this prevents the multiple tasks from entering the
> CHECKPOINTING state simultaneously. All but one will get -EBUSY.
> ---
> Documentation/cgroups/freezer-subsystem.txt | 10 ++
> include/linux/freezer.h | 10 ++
> kernel/cgroup_freezer.c | 149 +++++++++++++++++++--------
> 3 files changed, 127 insertions(+), 42 deletions(-)
>
> diff --git a/Documentation/cgroups/freezer-subsystem.txt b/Documentation/cgroups/freezer-subsystem.txt
> index 41f37fe..92b68e6 100644
> --- a/Documentation/cgroups/freezer-subsystem.txt
> +++ b/Documentation/cgroups/freezer-subsystem.txt
> @@ -100,3 +100,13 @@ things happens:
> and returns EINVAL)
> 3) The tasks that blocked the cgroup from entering the "FROZEN"
> state disappear from the cgroup's set of tasks.
> +
> +When the cgroup freezer is used to guard container checkpoint operations the
> +freezer.state may be "CHECKPOINTING". "CHECKPOINTING" can only be set on a
> +"FROZEN" cgroup using the checkpoint system call. Once in the "CHECKPOINTING"
> +state, the cgroup may not leave until the checkpoint system call returns the
> +freezer state to "FROZEN". Writing any new state to freezer.state while
> +checkpointing will return EBUSY. These semantics ensure that userspace cannot
> +unfreeze the cgroup midway through the checkpoint system call. Note that,
> +unlike "FROZEN" and "FREEZING", there is no corresponding "CHECKPOINTED"
> +state.
> diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> index da7e52b..1402911 100644
> --- a/include/linux/freezer.h
> +++ b/include/linux/freezer.h
> @@ -64,12 +64,22 @@ extern bool freeze_task(struct task_struct *p, bool sig_only);
> extern void cancel_freezing(struct task_struct *p);
>
> #ifdef CONFIG_CGROUP_FREEZER
> +struct cgroup_subsys_state;
> extern int cgroup_freezing_or_frozen(struct task_struct *task);
> +extern struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task);
> +extern int cgroup_freezer_begin_checkpoint(struct cgroup_subsys_state *css);
> +extern void cgroup_freezer_end_checkpoint(struct cgroup_subsys_state *css);
> #else /* !CONFIG_CGROUP_FREEZER */
> static inline int cgroup_freezing_or_frozen(struct task_struct *task)
> {
> return 0;
> }
> +static inline int cgroup_freezer_begin_checkpoint(struct task_struct *task)
> +{
> + return -ENOTSUPP;
> +}
> +static inline void cgroup_freezer_end_checkpoint(struct task_struct *task)
> +{}
With CONFIG_CHECKPOINT depending on CONFIG_FREEZER I don't see why these
empty definitions are needed. Maybe it's just too late in the evening...
> #endif /* !CONFIG_CGROUP_FREEZER */
>
> /*
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index 05795b7..4790fb9 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -25,6 +25,7 @@ enum freezer_state {
> CGROUP_THAWED = 0,
> CGROUP_FREEZING,
> CGROUP_FROZEN,
> + CGROUP_CHECKPOINTING,
> };
>
> struct freezer {
> @@ -47,6 +48,18 @@ static inline struct freezer *task_freezer(struct task_struct *task)
> struct freezer, css);
> }
>
> +struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task)
> +{
> + struct cgroup_subsys_state *css;
> +
> + task_lock(task);
> + css = task_subsys_state(task, freezer_subsys_id);
> + css_get(css); /* make sure freezer doesn't go away */
> + task_unlock(task);
> +
> + return css;
> +}
Seems like there should be a better way to do this than grabbing
this reference. I'd prefer to just introduce:
int in_same_cgroup_freezer(struct task_struct *task1,
struct task_struct *task2);
In the external checkpoint case the root task is frozen and hence cannot
change cgroups.
Regardless of that reference, I think the interaction between
sys_checkpoint() and the cgroup freezer as we have it right now is broken
in the self-checkpoint case. It doesn't work because the freezer allows a
task to freeze itself. So if it does simply:
echo FROZEN > /my/cgroup/freezer.state
then it will never reach sys_checkpoint. If it moves itself to a different
cgroup to avoid this then we need a new way to determine which cgroup to
move from FROZEN to CHECKPOINTING.
Alternatively, we can populate each group with a new freezer file besides
freezer.state. Depending on what seems best it could record which task(s)
to not freeze (ugh). Or it could be a flags file indicating that if the
task writing FROZEN to freezer.state belongs to this cgroup then it
shouldn't be frozen.
I think removing the ability to freeze oneself may not be a desirable
change to other users of the freezer interface. It also wouldn't be
backward-compatible.
Cheers,
-Matt Helsley
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
More information about the Devel
mailing list