[Devel] Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.

Eric W. Biederman ebiederm at xmission.com
Thu Mar 4 13:45:23 PST 2010


ebiederm at xmission.com (Eric W. Biederman) writes:

> "Serge E. Hallyn" <serue at us.ibm.com> writes:
>
>> Quoting Eric W. Biederman (ebiederm at xmission.com):
>>> Sukadev Bhattiprolu <sukadev at linux.vnet.ibm.com> writes:
>>> 
>>> > Eric W. Biederman [ebiederm at xmission.com] wrote:
>>> > | 
>>> > | I think replacing a struct pid for another struct pid allocated in
>>> > | descendant pid_namespace (but has all of the same struct upid values
>>> > | as the first struct pid) is a disastrous idea.  It destroys the
>>> >
>>> > True. Sorry, I did not mean we would need a new 'struct pid' for an
>>> > existing process. I think we talked earlier of finding a way of attaching
>>> > additional pid numbers to the same struct pid.
>>> 
>>> I just played with this and if you make the semantics of unshare(CLONE_NEWPID)
>>> to be that you become the idle task aka pid 0, and not the init task pid 1 the
>>> implementation is trivial.
>>
>> Heh, and then (browsing through your copy_process() patch hunks) the next
>> forked task becomes the child reaper for the new pidns?  <shrug>  why not
>> I guess.
>>
>> Now if that child reaper then gets killed, will the idle task get killed too?
>
> No.
>
>> And if not, then the idle task can just re-populate the new pidns with
>> new idle tasks...
>
> After zap_pid_namespace interesting...
>
>> If this brought us a step closer to entering an existing pidns that would
>> be one thing, but is there actually any advantage to being able to
>> unshare a new pidns?  Oh, I guess there is - PAM can then use it at
>> login, which might be neat.
>
> I have to say that the semantics of my patch are unworkable for
> unshare.  Unless I am mistaken for PAM to use it requires that the
> current process fully change and become what it needs to be.
> Requiring an extra fork to fully complete the process is a problem.
>
> Scratch one bright idea.

Maybe not.  I just looked and in the vast majority of cases the login
process goes like this.

{
	setup stuff, including pam
	child = fork();
	if (!child) {
		setuid();
		exec /bin/bash
	}
	waitpid(child);

	pam and other cleanup
}
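
As a rough C sketch of the flow above (this is not the actual code from
openssh or util-linux-ng; run_session() and its arguments are
hypothetical names, and the pam setup and cleanup steps are reduced to
comments):

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the session program as `uid` in a child and
 * return its exit status, or -1 on error or abnormal termination.
 */
int run_session(uid_t uid, const char *path, char *const argv[])
{
	pid_t child;
	int status;

	assert(path != NULL && argv != NULL);

	/* setup stuff, including pam, would go here */

	child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		/* Drop to the target user, then become the session program. */
		if (setuid(uid) != 0)
			_exit(126);
		execv(path, argv);
		_exit(127);	/* only reached if execv() failed */
	}

	/* Wait for exactly this child; retry if a signal interrupts us. */
	while (waitpid(child, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}

	/* pam and other cleanup would go here */

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The point to notice is that the parent stays around for the whole
session and only ever expects to hear about the one child it forked.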

So an unshare of the pid namespace that doesn't really take effect
until we fork may actually be usable from pam, and in fact is probably
the preferred implementation.  It looks like neither openssh nor login
from util-linux-ng will cope properly with getting any pid back from
wait() except the pid of their child.  It looks like they will both
terminate.  Which means if you log in to a new pid namespace (where the
unsharing process becomes pid 1) and use nohup, everything will get
killed and you will be logged out.
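
For comparison, a wait loop that would tolerate other pids coming back
from wait() might look roughly like the sketch below.  This is only an
illustration of the idea, not what openssh or login do;
wait_for_session() and demo_stray_then_session() are hypothetical
names, and the demo fakes the "unexpected" pid with a second direct
child rather than a process reparented to pid 1.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until `child` exits, reaping and ignoring
 * any other pid that wait() hands back along the way.
 * Returns the child's exit status, or -1 on error or abnormal exit.
 */
int wait_for_session(pid_t child)
{
	int status;
	pid_t pid;

	for (;;) {
		pid = wait(&status);
		if (pid < 0) {
			if (errno == EINTR)
				continue;
			return -1;	/* e.g. ECHILD: nothing left to wait for */
		}
		if (pid == child)
			break;
		/* Some other process was reaped; not ours, keep waiting. */
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Demo: start a short-lived stray child and a longer-lived "session"
 * child, then wait for the session only.
 */
int demo_stray_then_session(void)
{
	pid_t stray, session;

	stray = fork();
	if (stray < 0)
		return -1;
	if (stray == 0)
		_exit(3);	/* exits almost immediately */

	session = fork();
	if (session < 0)
		return -1;
	if (session == 0) {
		sleep(1);	/* outlive the stray child */
		_exit(5);
	}
	return wait_for_session(session);
}
```

Here demo_stray_then_session() returns the session child's status even
though the stray child is normally reaped first.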

Eric