[Devel] Re: [PATCH linux-cr] nsproxy: record ambient namespaces

Serge E. Hallyn serue at us.ibm.com
Tue Mar 2 10:42:53 PST 2010


Quoting Oren Laadan (orenl at cs.columbia.edu):
> 
> Applied.
> 
> Serge E. Hallyn wrote:
> >The nsproxy restore path recognizes that an objref of 0 for
> >ipc or uts ns means don't unshare it.  But the checkpoint side
> >forgot to write down 0 when the ipc or uts ns isn't unshared!
> >
> >Fix that.
> >
> >To test, run a program with a private pidns but shared utsns
> >which does
> >
> >	sleep(5);
> >	sethostname("serge", 6);
> >
> >checkpoint it, reset your hostname (if you let the program
> >complete), then restart the program: without this patch, it
> >will not reset your hostname.  It should, and with this patch
> >it will.
> >
> >Signed-off-by: Serge E. Hallyn <serue at us.ibm.com>
> >---
> > kernel/nsproxy.c |   19 +++++++++++++------
> > 1 files changed, 13 insertions(+), 6 deletions(-)
> >
> >diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> >index 0da0d83..dcb502c 100644
> >--- a/kernel/nsproxy.c
> >+++ b/kernel/nsproxy.c
> >@@ -280,13 +280,20 @@ static int do_checkpoint_ns(struct ckpt_ctx *ctx, struct nsproxy *nsproxy)
> > 	if (!h)
> > 		return -ENOMEM;
> >-	ret = checkpoint_obj(ctx, nsproxy->uts_ns, CKPT_OBJ_UTS_NS);
> >-	if (ret <= 0)
> >-		goto out;
> >+	ret = 0;
> >+	if (nsproxy->uts_ns != ctx->root_nsproxy->uts_ns) {
> >+		ret = checkpoint_obj(ctx, nsproxy->uts_ns, CKPT_OBJ_UTS_NS);
> >+		if (ret <= 0)
> >+			goto out;
> >+	}
> > 	h->uts_objref = ret;
> >-	ret = checkpoint_obj(ctx, nsproxy->ipc_ns, CKPT_OBJ_IPC_NS);
> >-	if (ret < 0)
> >-		goto out;
> >+
> >+	ret = 0;
> >+	if (nsproxy->ipc_ns != ctx->root_nsproxy->ipc_ns) {
> >+		ret = checkpoint_obj(ctx, nsproxy->ipc_ns, CKPT_OBJ_IPC_NS);
> >+		if (ret < 0)
> >+			goto out;
> >+	}
> > 	h->ipc_objref = ret;
> > 	/* FIXME: for now, only marked visited to pacify leaks */

All right, this patch was not right.  What we should be checking
is whether nsproxy->uts_ns != ctx->root_task->parent->nsproxy->uts_ns.
But I don't want to just send the patch to do that until we discuss
whether that is the right thing to do.

Let me give a precise definition:  I call an 'ambient namespace' a
namespace which was not unshared when the container was created.
Unfortunately there isn't really a reliable way to tell whether that
was the case.  Checking container_init->parent only gives the right
answer if the container init has not been reparented.
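
To make the reparenting problem concrete, here is a rough userspace
sketch of the parent-based test.  The structs are stand-ins that only
mimic the few kernel fields involved, and uts_ns_is_ambient() is a
hypothetical helper name, not something in the tree:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct uts_namespace { int id; };
struct nsproxy { struct uts_namespace *uts_ns; };
struct task { struct task *parent; struct nsproxy *nsproxy; };
struct ckpt_ctx { struct task *root_task; };

/*
 * Hypothetical helper: treat the uts ns as ambient when it is the same
 * one used by the *current* parent of the container init.  The answer
 * follows whatever root_task->parent points to at checkpoint time.
 */
static bool uts_ns_is_ambient(const struct ckpt_ctx *ctx,
			      const struct nsproxy *nsproxy)
{
	assert(ctx && ctx->root_task && ctx->root_task->parent);
	return nsproxy->uts_ns == ctx->root_task->parent->nsproxy->uts_ns;
}
```

If the container init's parent exits and the init is reparented to a
task living in a different utsns, the same nsproxy flips from ambient
to not ambient, even though nothing about the container changed.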

So as I see it we can do three things:

1. always unshare any namespace which was not empty at checkpoint.
So if the container was not unshared from host, and we checkpoint
members of that namespace, then at restart we will restart in an
unshared namespace and recreate the objects.  That basically means
undo the patch I originally sent.

That means that if the restarted task does 'hostname' it may end
up not affecting the host's hostname, even if it was originally
started on the host without separate utsns.  Maybe that's what we
want?

2. use the simple 'nsproxy->uts_ns != ctx->root_task->parent->nsproxy->uts_ns'
test.  I think that would be pretty reliable.

3. for each namespace in ctx->root_nsproxy, check whether there are
any leaks, and, if so, mark it in the checkpoint image header so that
we can give restart a hint that it might not want to unshare those.
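
A rough userspace sketch of what option 3 might look like, assuming a
"leak" means the namespace has more users than the checkpoint visited.
The NS_HINT_* bits, struct ns_usage, and both function names are made
up for illustration; none of them exist in the current image format:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hint bits that a checkpoint image header could carry. */
enum { NS_HINT_UTS = 1 << 0, NS_HINT_IPC = 1 << 1 };

/* Hypothetical per-namespace counters gathered at checkpoint time. */
struct ns_usage {
	unsigned int total_refs;	/* all users of the namespace */
	unsigned int visited_refs;	/* users seen by the checkpoint */
};

/* Assumption: the ns leaks if someone outside the checkpoint uses it. */
static int ns_leaks(const struct ns_usage *u)
{
	assert(u->visited_refs <= u->total_refs);
	return u->total_refs > u->visited_refs;
}

/* Build the hint mask: a set bit means "restart may not want to unshare". */
static uint32_t ambient_ns_hints(const struct ns_usage *uts,
				 const struct ns_usage *ipc)
{
	uint32_t hints = 0;

	if (ns_leaks(uts))
		hints |= NS_HINT_UTS;
	if (ns_leaks(ipc))
		hints |= NS_HINT_IPC;
	return hints;
}
```

Restart would then treat the mask as advice only: a set bit suggests
the namespace was shared with something outside the container, so
unsharing it may not be what the user wants.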

-serge
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers