[Devel] [PATCH RH7 1/2] net/vxlan: enable support and autoload in a container

Pavel Tikhomirov ptikhomirov at virtuozzo.com
Thu Oct 27 08:47:10 PDT 2016


I managed to create a reproducer for the mentioned problem. It fails as
expected on 4.7.7-200.fc24.x86_64, so the ifindex problem is indeed
present in mainstream too.

bridge_gatway_cidr='10.0.0.1/24'
container1_ip_cidr='10.0.0.3/24'
container1_mac_addr='02:42:0a:00:00:03'
container2_ip='10.0.0.2'
container2_mac_addr='02:42:0a:00:00:02'
# Some actual address from your host's local network
container2_host_ip='10.94.72.162'
vxlan_id=42

set -x

ip netns add ct-net
ip netns add vx-net
ip netns exec vx-net ip link add dev br1 type bridge

# vxlan1 created in host netns with port 4789 and moved to vx-net
ip link add dev vxlan-tmp-1 type vxlan id $vxlan_id l2miss l3miss proxy learning dstport 4789
ip link set vxlan-tmp-1 netns vx-net
ip netns exec vx-net ip link set dev vxlan-tmp-1 name vxlan1

ip netns exec vx-net brctl addif br1 vxlan1

# veth1:eth1 pair connects vx-net and ct-net
ip link add dev vetha1 mtu 1450 type veth peer name vetha2 mtu 1450
ip link set dev vetha1 netns vx-net
ip netns exec vx-net ip link set dev vetha1 name veth1
ip netns exec vx-net brctl addif br1 veth1
ip netns exec vx-net ip addr add dev br1 $bridge_gatway_cidr
ip netns exec vx-net ip link set vxlan1 up
ip netns exec vx-net ip link set veth1 up
ip netns exec vx-net ip link set br1 up

ip link set dev vetha2 netns ct-net
ip netns exec ct-net ip link set dev vetha2 name eth1 address $container1_mac_addr
ip netns exec ct-net ip addr add dev eth1 $container1_ip_cidr
ip netns exec ct-net ip link set dev eth1 up

ip netns exec vx-net ip neighbor add $container2_ip lladdr $container2_mac_addr dev vxlan1 nud permanent
# With "via vxlan1" tcpdump will see no packets; after removing
# "via vxlan1" it will see VXLAN-encapsulated ICMP echo requests.
ip netns exec vx-net bridge fdb add $container2_mac_addr dev vxlan1 self dst $container2_host_ip vni $vxlan_id port 4789 via vxlan1

ip netns exec ct-net ping $container2_ip &
# enp0s31f6 is the host's outgoing interface here; adjust for your system
tcpdump -i enp0s31f6 dst $container2_host_ip

ip netns del vx-net
ip netns del ct-net
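
The tcpdump line in the script names the host interface directly. A
minimal sketch for deriving it from the routing table instead is given
below; route_dev is a hypothetical helper name, not part of the original
script, and it only parses the text that "ip route get" prints.

```shell
#!/bin/sh
# Sketch: find the outgoing interface in `ip route get` output.
# route_dev is a hypothetical helper; it reads that output on stdin
# and prints the word following the first "dev" keyword.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Intended use in the reproducer (not executed here, needs iproute2):
#   host_dev=$(ip route get "$container2_host_ip" | route_dev)
#   tcpdump -i "$host_dev" dst "$container2_host_ip"
```

This assumes the route to the remote host leaves through a single
interface; with policy routing or multipath routes the first "dev" may
not be the one actually used.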

On 10/26/2016 06:14 PM, Pavel Tikhomirov wrote:
> vxlan is safe in CT as:
>
> 1) The UDP multicast socket that connects to the outer world sits in
> the creation net namespace, and this socket can only get packets
> forwarded/routed in the creation ns.
>
> 2) The vxlan device is owned by a second netns (which could be the
> same as the first) like any other network device, so all packets
> that come to it are from that same ns.
>
> 3) Vxlan logic works through vxlan_net placed on the creation netns;
> vxlan_fdb and vxlan_rdst are per vxlan device. Thus entries cannot
> intersect with entries from the host and other CTs.
>
> * One problem I can see now is adding an fdb entry with an ifindex
> (index of the device to route packets from the UDP socket through)
> after the vxlan device is moved to the second namespace: in
> vxlan_fdb_parse we use the second namespace to check the ifindex by
> device lookup, but in
> vxlan_xmit_one->ip_route_output_key->...->__ip_route_output_key
> we use the first (creation) namespace to look up the device, and
> that lookup will probably fail. So all fdb configuration should be
> done before moving the device to the ns. The same holds in
> mainstream AFAICS.
>
> https://jira.sw.ru/browse/PSBM-53629
>
> Signed-off-by: Pavel Tikhomirov <ptikhomirov at virtuozzo.com>
>
> ---
>  drivers/net/vxlan.c | 1 +
>  kernel/kmod.c       | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index fd2516d..8e89665 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2367,6 +2367,7 @@ static void vxlan_setup(struct net_device *dev)
>
>  	dev->vlan_features = dev->features;
>  	dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> +	dev->features |= NETIF_F_VIRTUAL;
>  	dev->hw_features |= NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_RXCSUM;
>  	dev->hw_features |= NETIF_F_GSO_SOFTWARE;
>  	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index e0ef148..63748d4 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -421,6 +421,7 @@ static const char * const ve0_allowed_mod[] = {
>  	"ip_set_list:set",
>
>  	"rtnl-link-dummy",
> +	"rtnl-link-vxlan",
>  };
>
>  /*
>
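
The quoted note says that fdb configuration should be done before the
vxlan device is moved to another namespace. An untested sketch of that
ordering follows. The names host_dev, DRY_RUN, run and fdb_before_move
are assumptions made for illustration and are not in the reproducer;
whether the entry then forwards traffic as intended after the move is
not verified here.

```shell
#!/bin/sh
# Sketch only: add the fdb entry while the device is still in its
# creation (host) netns, then move it. Values are copied from the
# reproducer; host_dev, DRY_RUN and run() are illustrative assumptions.
vxlan_id=42
container2_mac_addr='02:42:0a:00:00:02'
container2_host_ip='10.94.72.162'
host_dev="${host_dev:-enp0s31f6}"  # interface name from the tcpdump line
DRY_RUN="${DRY_RUN:-1}"            # default: only print the commands

run() {
    # Print the command; execute it only when DRY_RUN=0 (needs root).
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

fdb_before_move() {
    run ip link add dev vxlan-tmp-1 type vxlan id "$vxlan_id" l2miss l3miss proxy learning dstport 4789
    # Per the quoted description, the "via" ifindex is checked in the
    # device's current netns, which at this point is the creation netns
    # that the transmit path also uses for its route lookup.
    run bridge fdb add "$container2_mac_addr" dev vxlan-tmp-1 self dst "$container2_host_ip" vni "$vxlan_id" port 4789 via "$host_dev"
    # Assumes the vx-net namespace already exists.
    run ip link set vxlan-tmp-1 netns vx-net
}
```

With the default DRY_RUN=1, calling fdb_before_move only prints the
three commands in order; setting DRY_RUN=0 runs them and requires root.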

-- 
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.

