[Devel] Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.
Daniel Lezcano
daniel.lezcano at free.fr
Thu Feb 25 14:13:00 PST 2010
Eric W. Biederman wrote:
> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>
>
>> Eric W. Biederman wrote:
>>
>>> Introduce two new system calls:
>>> int nsfd(pid_t pid, unsigned long nstype);
>>> int setns(unsigned long nstype, int fd);
>>>
>>> These two new system calls address three specific problems that can
>>> make namespaces hard to work with.
>>> - Namespaces require a dedicated process to pin them in memory.
>>> - It is not possible to use a namespace unless you are the
>>> child of the original creator.
>>> - Namespaces don't have names that userspace can use to talk
>>> about them.
>>>
>>> The nsfd() system call returns a file descriptor that can
>>> be used to talk about a specific namespace, and to keep
>>> the specified namespace alive.
>>>
>>> The fd returned by nsfd() can be bind mounted as:
>>> mount --bind /proc/self/fd/N /some/filesystem/path
>>> to keep the namespace alive indefinitely as long as
>>> it is mounted.
>>>
>>> open works on the fd returned by nsfd() so another
>>> process can get a hold of it and do interesting things.
>>>
>>> Overall that allows for persistent naming of namespaces
>>> according to userspace policy.
>>>
>>> setns() allows changing the namespace of the current process
>>> to a namespace that originates with nsfd().
>>>
>>> Signed-off-by: Eric W. Biederman <ebiederm at xmission.com>
>>> ---
>>>
>>>
>> Is it planned to support all the namespaces for 'nsfd'?
>> I mean, will it be possible to specify an OR'ed combination of nstype values to
>> grab references to several of the target process's namespaces at once?
>>
>> For example: nsfd(1234, NSTYPE_NET | NSTYPE_IPC | NSTYPE_MNT)
>>
>
> No, the plan is only one namespace at a time.
>
> It would not be much of a change to support multiple namespaces,
> but I don't think I want to go there. Bitmaps filling up are
> ugly and I don't see what would be gained.
>
The idea I had in mind when I asked this question was whether we could "move" a
process into a container, i.e. a set of namespaces :)
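
To illustrate, "moving" a process into a set of namespaces can be done with one call per namespace and no bitmap. The sketch below does not use the nsfd()/setns() prototypes from this RFC. It assumes the interface that later went into mainline Linux, where a namespace is referenced by opening /proc/<pid>/ns/<name> and joined with setns(fd, nstype). The helper names ns_path and join_namespaces and the MAX_NS cap are hypothetical, chosen only for this example.

```c
#define _GNU_SOURCE           /* needed for setns() in <sched.h> */
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_NS 8              /* arbitrary cap for this sketch */

/* Build "/proc/<pid>/ns/<name>"; return its length, or -1 if it does not fit. */
int ns_path(char *buf, size_t len, pid_t pid, const char *name)
{
    int n = snprintf(buf, len, "/proc/%ld/ns/%s", (long)pid, name);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Join the listed namespaces of process `pid`, one namespace per setns() call.
 * All descriptors are opened before any setns(), because entering a mount
 * namespace can change what /proc resolves to.
 * Returns 0 on success, -1 on the first failure.
 */
int join_namespaces(pid_t pid, const char *const *names, size_t count)
{
    int fds[MAX_NS];
    size_t opened = 0;
    int rc = 0;

    assert(count <= MAX_NS);  /* caller bug if exceeded */

    /* Phase 1: take a reference to every requested namespace. */
    for (; opened < count; opened++) {
        char path[64];
        if (ns_path(path, sizeof path, pid, names[opened]) < 0) {
            rc = -1;
            break;
        }
        fds[opened] = open(path, O_RDONLY | O_CLOEXEC);
        if (fds[opened] < 0) {
            rc = -1;
            break;
        }
    }

    /* Phase 2: switch namespaces one at a time, only if all opens worked. */
    for (size_t i = 0; rc == 0 && i < count; i++) {
        if (setns(fds[i], 0) < 0)   /* 0 = accept any namespace type */
            rc = -1;
    }

    /* Drop the references; the joined namespaces stay in effect. */
    for (size_t i = 0; i < opened; i++)
        close(fds[i]);
    return rc;
}
```

Calling join_namespaces(1234, names, 3) with names set to { "net", "ipc", "mnt" } would correspond to the three-namespace example quoted above. Joining these namespaces normally requires CAP_SYS_ADMIN, and if one setns() fails partway through, the caller is left in a mix of old and new namespaces, which is a limitation of this sketch.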
> It does make sense to support all of the namespaces we can support
> with unshare, but with nstype as an enumeration not as a bitmap.
>
I suppose when you say "to support all of the namespaces we can support
with *unshare*", you exclude the pid namespace, which can only be created
with clone, right? Do you think we can extend the concept to all the
namespaces, including the pid namespace?
> This is slightly better than the earlier version that used a netlink
> socket as the reference as I can give it the semantics of a deleted
> file and only when that file goes away drop the reference on the
> namespace. It is also better in that this interface can support all
> of the namespaces, without adding yet another syscall.
>
I like the idea :)