[Devel] Re: [RFC][PATCH] Improve NFS use of network and mount namespaces
Matt Helsley
matthltc at us.ibm.com
Tue May 12 18:05:45 PDT 2009
On Tue, May 12, 2009 at 05:01:58PM -0700, Eric W. Biederman wrote:
> Matt Helsley <matthltc at us.ibm.com> writes:
>
> > Sun RPC currently opens sockets from the initial network namespace making it
> > impossible to restrict which NFS servers a container may interact with.
> >
> > For example, the NFS server at 10.0.0.3 reachable from the initial namespace
> > will always be used even if an entirely different server with the address
> > 10.0.0.3 is reachable from a container's network namespace. Hence network
> > namespaces cannot be used to restrict the network access of a container as long
> > as the RPC code opens sockets using the initial network namespace. This is
> > in stark contrast to other protocols like HTTP where the sockets are created in
> > their proper namespaces because kernel threads are not used to open sockets for
> > client network IO.
> >
> > We may plausibly end up with namespaces created by:
> > I) The administrator may mount 10.0.0.3:/export_foo from init's
> > container, clone the mount namespace, and unmount from the original
> > mount namespace.
> >
> > II) The administrator may start a task which clones the mount namespace
> > before mounting 10.0.0.3:/export_foo.
> >
> > Proposed Solution:
> >
> > The network namespace of the task that did the mount best defines which server
> > the "administrator", whether in a container or not, expects to work with.
> > When the mount is done inside a container then that is the network namespace
> > to use. When the mount is done prior to creating the container then that's the
> > namespace that should be used.
> >
> > This allows system administrators to isolate network traffic generated by NFS
> > clients by mounting after creating a container. If partial isolation is desired
> > then the administrator may mount before creating a container with a new network
> > namespace. In each case the RPC packets would originate from a consistent
> > namespace.
> >
> > One way to ensure consistent namespace usage would be to hold a reference to
> > the original network namespace as long as the mount exists. This naturally
> > suggests storing the network namespace reference in the NFS superblock.
> > However, it may be better to store it with the RPC transport itself since
> > it is directly responsible for (re)opening the sockets.
> >
> > This patch adds a reference to the network namespace to the RPC
> > transport. When the NFS export is mounted the network namespace of
> > the current task establishes which namespace to reference. That
> > reference is stored in the RPC transport and used to open sockets
> > whenever a new socket is required.
>
> Matt. This may be the basis of something and the problem is real.
> However it is clear you have missed a lot of details.
Well, crap. While I didn't ignore the other RPC services I noticed when
reading the NFS/RPC code, the responses from Chuck, you, and Trond make it
clear I misunderstood how the RPC code works with the services that
support NFS.
I figured that since RPC was the core of these services it would be a
good place to start trying to address the problem. It looked like the
RPC transport was a good place to deal with all of these services since
it's responsible for (re)opening the sockets needed to perform RPC IO.
But apparently the transport is not shared the way I thought it was. :/
> So could you first address this problem in nfs_get_sb by
> denying the mount if we are not in the initial network namespace.
>
> I.e.
>
> if (current->nsproxy->net_ns != &init_net)
> return -EINVAL;
>
> That should be a lot simpler to get right and at least give reliable
> and predictable semantics.
Yes, that seems like a reasonable preventive measure for now.
-Matt
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers