[Devel] Re: [PATCH 5/6] c/r: correctly restore pgid
Oren Laadan
orenl at librato.com
Tue Sep 8 06:33:04 PDT 2009
Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl at librato.com):
>> The main challenge with restoring the pgid of tasks is that the
>> original "owner" (the process with that pid) might have exited
>> already. I call these "ghost" pgids. 'mktree' does create these
>> processes, but they then exit without participating in the restart.
>>
>> To solve this, this patch introduces a RESTART_GHOST flag, used for
>> "ghost" owners that are created only to pass their pgid to other
>> tasks. ('mktree' now makes them call restart(2) instead of exiting).
>>
>> When a "ghost" task calls restart(2), it will be placed on a wait
>> queue until the restart completes and then exit. This guarantees that
>> the pgid that it owns remains available for all (regular) restarting
>> tasks for when they need it.
>>
>> Regular tasks perform the restart as before, except that they also
>> now restore their old pgrp, which is guaranteed to exist.
>>
>> Changelog [v1]:
>> - Verify that pgid owner is a thread-group-leader.
>> - Handle the case of pgid/sid == 0 using root's parent pid-ns
>>
>> Signed-off-by: Oren Laadan <orenl at cs.columbia.edu>
>> ---
>> checkpoint/process.c | 106 ++++++++++++++++++++++++-
>> checkpoint/restart.c | 158 ++++++++++++++++++++++++++------------
>> checkpoint/sys.c | 3 +-
>> include/linux/checkpoint.h | 11 ++-
>> include/linux/checkpoint_hdr.h | 3 +
>> include/linux/checkpoint_types.h | 6 +-
>> 6 files changed, 230 insertions(+), 57 deletions(-)
>>
>> diff --git a/checkpoint/process.c b/checkpoint/process.c
>> index 40b2580..5d6bdb9 100644
>> --- a/checkpoint/process.c
>> +++ b/checkpoint/process.c
>> @@ -23,6 +23,57 @@
>> #include <linux/syscalls.h>
>>
>>
>> +pid_t ckpt_pid_nr(struct ckpt_ctx *ctx, struct pid *pid)
>> +{
>> + return pid ? pid_nr_ns(pid, ctx->root_nsproxy->pid_ns) : CKPT_PID_NULL;
>> +}
>> +
>> +/* must be called with tasklist_lock or rcu_read_lock() held */
>> +struct pid *_ckpt_find_pgrp(struct ckpt_ctx *ctx, pid_t pgid)
>> +{
>> + struct task_struct *p;
>> + struct pid *pgrp;
>> +
>> + if (pgid == 0) {
>> + /*
>> + * At checkpoint the pgid owner lived in an ancestor
>> + * pid-ns. The best we can do (sanely and safely) is
>> + * to examine the parent of this restart's root: if in
>> + * a distinct pid-ns, use its pgrp; otherwise fail.
>> + */
>> + p = ctx->root_task->real_parent;
>> + if (p->nsproxy->pid_ns == current->nsproxy->pid_ns)
>> + return NULL;
>> + pgrp = task_pgrp(p);
>> + } else {
>> + /*
>> + * Find the owner process of this pgid (it must exist
>> + * if pgrp exists). It must be a thread group leader.
>> + */
>> + pgrp = find_vpid(pgid);
>> + p = pid_task(pgrp, PIDTYPE_PID);
>> + if (!p || !thread_group_leader(p))
>> + return NULL;
>> + /*
>> + * The pgrp must "belong" to our restart tree (compare
>> + * p->checkpoint_ctx to ours). This prevents malicious
>> + * input from (guessing and) using unrelated pgrps. If
>> + * the owner is dead, then it doesn't have a context,
>> + * so instead compare against its (real) parent's.
>> + */
>> + if (p->exit_state == EXIT_ZOMBIE)
>> + p = p->real_parent;
>> + if (p->checkpoint_ctx != ctx)
>> + return NULL;
>> + }
>> +
>> + if (task_session(current) != task_session(p))
>> + return NULL;
>> +
>> + return pgrp;
>> +}
>> +
>> +
>> #ifdef CONFIG_FUTEX
>> static void save_task_robust_futex_list(struct ckpt_hdr_task *h,
>> struct task_struct *t)
>> @@ -94,8 +145,8 @@ static int checkpoint_task_struct(struct ckpt_ctx *ctx, struct task_struct *t)
>> h->exit_signal = t->exit_signal;
>> h->pdeath_signal = t->pdeath_signal;
>>
>> - h->set_child_tid = t->set_child_tid;
>> - h->clear_child_tid = t->clear_child_tid;
>> + h->set_child_tid = (unsigned long) t->set_child_tid;
>
> note that set_child_tid is an int (signed), not a long. Same on
> x86, but not on other arches. Shouldn't lose info so could be worse.
{set,clear}_child_tid are both pointers to user space: each one holds
a userspace address, so we save it as 'unsigned long'.
{clear,set}_child_tid are defined in include/linux/sched.h ... how can
they differ between archs?
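
To illustrate the point about pointer width, here is a minimal userspace
sketch (not part of the patch). It assumes a Linux ABI (ILP32 or LP64),
where 'unsigned long' is as wide as a data pointer; the helper names
save_user_addr() and restore_user_addr() are hypothetical and only stand
in for the checkpoint and restart sides.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * Assumption: Linux ILP32/LP64, where sizeof(unsigned long) equals
 * sizeof(void *). The width of the saved value depends on the pointer,
 * not on the 'int' that the pointer refers to.
 */

/* Checkpoint side: record the user-space address as an integer. */
static unsigned long save_user_addr(int *uaddr)
{
	return (unsigned long)(uintptr_t)uaddr;
}

/* Restart side: turn the saved integer back into a pointer. */
static int *restore_user_addr(unsigned long saved)
{
	return (int *)(uintptr_t)saved;
}
```

Under that assumption the round trip is lossless: restoring a saved
address yields the original pointer, whatever the size of 'int' is.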
>
> On the whole,
>
> Acked-by: Serge Hallyn <serue at us.ibm.com>
Thanks. I have a few fixes for the code piled up, and now c/r of 'screen'
with a couple of shells is working :)
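
The "ghost" scheme from the patch description can be mimicked in plain
userspace, as a rough sketch only. It is not the kernel code: a pipe
stands in for the wait queue inside restart(2), and ghost_pgid_demo() is
a hypothetical name. The ghost process owns a pgid and blocks until the
"restart" is done, so that a regular task can join that pgid in the
meantime.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustration only. Returns 0 if a regular task could join the
 * process group owned by the ghost, -1 otherwise.
 */
static int ghost_pgid_demo(void)
{
	int hold[2];	/* pipe standing in for the restart wait queue */
	pid_t ghost, member;
	int status;
	char c;

	if (pipe(hold) < 0)
		return -1;

	ghost = fork();
	if (ghost < 0)
		return -1;
	if (ghost == 0) {
		/* ghost: own a pgid equal to our pid, then just wait */
		close(hold[1]);
		setpgid(0, 0);
		while (read(hold[0], &c, 1) > 0)
			;	/* returns 0 (EOF) once the restart is done */
		_exit(0);
	}
	close(hold[0]);
	/* also set it from the parent, to avoid racing with the child */
	setpgid(ghost, ghost);

	member = fork();
	if (member < 0) {
		close(hold[1]);
		waitpid(ghost, NULL, 0);
		return -1;
	}
	if (member == 0) {
		/* regular task: restore the old pgrp, which must exist */
		close(hold[1]);
		if (setpgid(0, ghost) < 0)
			_exit(1);
		_exit(getpgrp() == ghost ? 0 : 2);
	}

	waitpid(member, &status, 0);
	close(hold[1]);			/* "restart completed": release ghost */
	waitpid(ghost, NULL, 0);	/* the ghost exits without doing more */

	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The ordering is the point of the sketch: the ghost stays alive until all
regular tasks have picked up their pgid, and only then exits.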
Oren.