[Devel] Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.

Eric W. Biederman ebiederm at xmission.com
Thu Feb 25 13:49:48 PST 2010


Daniel Lezcano <daniel.lezcano at free.fr> writes:

> Eric W. Biederman wrote:
>> Introduce two new system calls:
>> int nsfd(pid_t pid, unsigned long nstype);
>> int setns(unsigned long nstype, int fd);
>>
>> These two new system calls address three specific problems that can
>> make namespaces hard to work with.
>> - Namespaces require a dedicated process to pin them in memory.
>> - It is not possible to use a namespace unless you are the
>>   child of the original creator.
>> - Namespaces don't have names that userspace can use to talk
>>   about them.
>>
>> The nsfd() system call returns a file descriptor that can
>> be used to talk about a specific namespace, and to keep
>> the specified namespace alive.
>>
>> The fd returned by nsfd() can be bind mounted as:
>> mount --bind /proc/self/fd/N /some/filesystem/path
>> to keep the namespace alive indefinitely as long as
>> it is mounted.
>>
>> open works on the fd returned by nsfd() so another
>> process can get a hold of it and do interesting things.
>>
>> Overall that allows for persistent naming of namespaces
>> according to userspace policy.
>>
>> setns() allows changing the namespace of the current process
>> to a namespace that originates with nsfd().
>>
>> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
>> ---
>>   
>
> Is it planned to support all the namespaces for 'nsfd' ?
> I mean will it be possible to specify an Or'ed combination of nstype to grab a
> reference for several namespaces at a time of the targeted process ?
>
> for example : nsfd(1234, NSTYPE_NET | NSTYPE_IPC | NSTYPE_MNT)

No, the plan is only one namespace at a time.

It would not be much of a change to support multiple namespaces,
but I don't think I want to go there.  Bitmaps filling up are
ugly and I don't see what would be gained.

It does make sense to support all of the namespaces we can support
with unshare, but with nstype as an enumeration, not as a bitmap.
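
To make the enumeration point concrete, here is a minimal sketch. The
constant names and values are hypothetical (the thread does not define
them), and the helpers only stand in for the argument check a call like
nsfd() might perform. With sequential values each call names exactly
one namespace, and OR-ing two values together yields nothing useful.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names and values; the thread does not define them. */
enum nstype {
    NSTYPE_NET = 1,
    NSTYPE_IPC,
    NSTYPE_MNT,
    NSTYPE_UTS,
    NSTYPE_MAX  /* one past the last valid value */
};

/* With an enumeration, validation is a simple range check. */
bool nstype_valid(unsigned long nstype)
{
    return nstype >= NSTYPE_NET && nstype < NSTYPE_MAX;
}

/* Stand-in for a syscall argument check: 0 if valid, -1 otherwise. */
int check_nstype(unsigned long nstype)
{
    return nstype_valid(nstype) ? 0 : -1;
}

/* Map a valid type to a printable name; the caller must validate first. */
const char *nstype_name(enum nstype t)
{
    assert(nstype_valid(t));
    switch (t) {
    case NSTYPE_NET: return "net";
    case NSTYPE_IPC: return "ipc";
    case NSTYPE_MNT: return "mnt";
    case NSTYPE_UTS: return "uts";
    default:         return "unknown";
    }
}
```

Note that with these values NSTYPE_NET | NSTYPE_IPC happens to equal
NSTYPE_MNT, so a caller that ORs types together silently asks for a
different namespace. New namespace types are added by extending the
enumeration, with no bitmap to run out of bits.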

This is slightly better than the earlier version that used a netlink
socket as the reference, because I can give the fd the semantics of a
deleted file and drop the reference on the namespace only when that
file goes away.  It is also better in that this interface can support
all of the namespaces without adding yet another syscall.
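
The lifetime rule described above can be modelled in userspace as
follows. This is only a sketch under assumptions: the struct and
function names are invented for illustration and are not taken from
the patch or from kernel code. The file object holds one reference on
the namespace, and that reference is dropped only when the last user
of the file (an open fd or a bind mount) goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a namespace with a reference count. */
struct ns_model {
    int refs;    /* references pinning the namespace */
    bool alive;  /* false once the last reference is dropped */
};

/* Hypothetical model of the file behind the fd from nsfd(). */
struct nsfile_model {
    struct ns_model *ns;  /* the single reference held by the file */
    int users;            /* open fds plus bind mounts of this file */
};

/* Drop one namespace reference; the namespace dies at zero. */
void ns_put(struct ns_model *ns)
{
    assert(ns->refs > 0);
    if (--ns->refs == 0)
        ns->alive = false;
}

/* Create the file: it takes one reference on the namespace. */
void nsfile_open(struct nsfile_model *f, struct ns_model *ns)
{
    ns->refs++;
    f->ns = ns;
    f->users = 1;
}

/* Another user of the same file, e.g. a bind mount or a second open. */
void nsfile_get(struct nsfile_model *f)
{
    assert(f->users > 0);
    f->users++;
}

/* A user goes away; only the last one releases the namespace. */
void nsfile_put(struct nsfile_model *f)
{
    assert(f->users > 0);
    if (--f->users == 0) {
        ns_put(f->ns);
        f->ns = NULL;
    }
}
```

In this model a namespace whose creating process has exited stays
alive while either an fd or a bind mount still refers to the file, and
it is released once both are gone.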

Eric
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers



