[Devel] Re: strict isolation of net interfaces
Eric W. Biederman
ebiederm at xmission.com
Fri Jun 30 11:09:44 PDT 2006
Daniel Lezcano <dlezcano at fr.ibm.com> writes:
> Serge E. Hallyn wrote:
>> Quoting Cedric Le Goater (clg at fr.ibm.com):
>>
>>>we could work on virtualizing the net interfaces in the host, map them to
>>>eth0 or something in the guest and let the guest handle upper network layers ?
>>>
>>>lo0 would just be exposed relying on skbuff tagging to discriminate traffic
>>>between guests.
>> This seems to me the preferable way. We create a full virtual net
>> device for each new container, and fully virtualize the device
>> namespace.
>
Answers with respect to how I see layer 2 isolation,
with network devices and sockets as well as the associated routing
information given per namespace.
> I have a few questions about all the network isolation stuff:
>
> * What level of isolation is wanted for the network ? network devices ?
> IPv4/IPv6 ? TCP/UDP ?
>
> * How is handled the incoming packets from the network ? I mean what will be
> mecanism to dispatch the packet to the right virtual device ?
Wrong question. A better question is to ask how do you know which namespace
a packet is in.
Answer: By looking at which device or socket it just came from.
How do you get a packet into a non-default namespace?
Either you move a real network interface into that namespace.
Or you use a tunnel device that shows up as two network interfaces in
two different namespaces.
Then you route, or bridge packets between the two. Trivial.
> * How to handle the SO_BINDTODEVICE socket option ?
Just like we do now.
> * Has the virtual device a different MAC address ?
All network devices are abstractions of the hardware so they are all
sort of virtual. My implementation of a tunnel device has a mac
address so I can use it with ethernet bridging but that isn't a hard
requirement. And yes the mac address is different because you can't
do layer 2 switching if everyone has the same mac address.
But there is no special ``virtual'' device.
> How to manage it with the real MAC address on the system ?
Manage?
> How to manage ARP, ICMP, multicasting and IP ?
Like you always do. It would be a terrible implementation if
we had to change that logic. There is a little bit of that
where we need to detect which network namespace we are going to because
the answers can differ but that is pretty straight forward.
> It seems for me, IMHO that will require a lot of translation and browsing
> table. It will probably add a very significant overhead.
Then look at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-ns.git#proof-of-concept
or the OpenVZ implementation.
It isn't serious overhead.
> * How to handle NFS access mounted outside of the container ?
The socket should remember it's network namespace.
It works fine.
> * How to handle ICMP_REDIRECT ?
Just like we always do?
Eric
More information about the Devel
mailing list