[Users] InfiniBand support in OpenVZ containers

knawnd at gmail.com
Sat Jul 2 02:04:58 PDT 2016


Hello all!

I've decided to try IP over IB (IPoIB). There are two servers (blade08 and blade09), each with an eth0
and an ib0 NIC. The problem is that two CTs deployed on different servers can't reach each other
(checked with ping).
CT 152, deployed on host blade08 (whose ib0 address is 10.1.36.18), can ping the remote host blade09
but can't ping any CT deployed there. CT 120, deployed on host blade09 (whose ib0 address is
10.1.36.19), can't ping either the remote host blade08 or the CT deployed on it (CT 152). The two
servers can ping each other over the ib0 interface (checked with 'ping -I ib0 <remote server IP>').
iptables is turned off on both servers and in all CTs.
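
For completeness, the failing test boils down to the following; the tcpdump part is only a diagnostic
I plan to run next, not something I already have output from:

# on blade08: ping CT 120 (10.1.36.50, hosted on blade09) from inside CT 152
vzctl exec 152 ping -c 3 10.1.36.50

# meanwhile on blade09: watch whether the ICMP (or ARP) requests arrive on ib0 at all
tcpdump -nni ib0 icmp or arp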

The configuration is the following.
[root@blade09 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr **************************
           inet addr:192.93.36.239  Bcast:192.93.36.255  Mask:255.255.255.0
           inet6 addr: **********************/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:364749 errors:0 dropped:0 overruns:0 frame:0
           TX packets:146504 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:204560009 (195.0 MiB)  TX bytes:148608765 (141.7 MiB)
           Memory:c7420000-c743ffff

ib0       Link encap:InfiniBand  HWaddr *****************************************
           inet addr:10.1.36.19  Bcast:10.1.36.255  Mask:255.255.255.0
           inet6 addr: **********************/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:15567403 errors:0 dropped:0 overruns:0 frame:0
           TX packets:34644634 errors:0 dropped:2098 overruns:0 carrier:0
           collisions:0 txqueuelen:256
           RX bytes:315873847681 (294.1 GiB)  TX bytes:360607196041 (335.8 GiB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:65536  Metric:1
           RX packets:12 errors:0 dropped:0 overruns:0 frame:0
           TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:1004 (1004.0 b)  TX bytes:1004 (1004.0 b)

venet0    Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
           inet6 addr: fe80::1/128 Scope:Link
           UP BROADCAST POINTOPOINT RUNNING NOARP  MTU:1500  Metric:1
           RX packets:46172203 errors:0 dropped:0 overruns:0 frame:0
           TX packets:46172173 errors:0 dropped:3 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:49753326008 (46.3 GiB)  TX bytes:49753323488 (46.3 GiB)

The second server (blade08) has the same configuration but different IPs: eth0 192.93.36.238,
ib0 10.1.36.18.

On each server I deployed CTs:
[root@blade08 ~]# vzlist -a
       CTID      NPROC STATUS    IP_ADDR         HOSTNAME
        152         11 running   10.1.36.52      ct152

[root@blade09 ~]# vzlist -a
       CTID      NPROC STATUS    IP_ADDR         HOSTNAME
        120         11 running   10.1.36.50      ct150
        121         11 running   10.1.36.51      ct151
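
(In case it matters: these are plain venet addresses assigned with vzctl, along the lines of the
example below, with the IPs as shown in the vzlist output above.)

# example for CT 152 on blade08; the other CTs were set up the same way
vzctl set 152 --ipadd 10.1.36.52 --save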

[root@blade09 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.36.51      0.0.0.0         255.255.255.255 UH    0      0        0 venet0
10.1.36.50      0.0.0.0         255.255.255.255 UH    0      0        0 venet0
10.1.36.0       0.0.0.0         255.255.255.0   U     0      0        0 ib0
192.93.36.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
0.0.0.0         192.93.36.1     0.0.0.0         UG    0      0        0 eth0

[root@blade08 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.36.52      0.0.0.0         255.255.255.255 UH    0      0        0 venet0
10.1.36.0       0.0.0.0         255.255.255.0   U     0      0        0 ib0
192.93.36.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
0.0.0.0         192.93.36.1     0.0.0.0         UG    0      0        0 eth0
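
If I read the vzctl scripts correctly, venet addresses are published to the neighbours via proxy ARP
entries on the source device (ib0 here, per VE_ROUTE_SRC_DEV below), so the next thing I intend to
check is whether those entries exist and whether the peer can resolve the CT IPs over ib0, roughly:

# on blade08: is the local CT address 10.1.36.52 published as a proxy entry on ib0?
ip neigh show proxy dev ib0

# on blade09: does the remote CT address resolve over ib0 at all?
ip neigh flush dev ib0
ping -c 1 -I ib0 10.1.36.52
ip neigh show dev ib0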


[root@blade08 ~]# grep -i forward /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1

[root@blade09 ~]# grep -i forward /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1
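
(That is only the on-disk configuration; the running values can be double-checked on both nodes with
something like the following.)

# confirm forwarding is active right now, globally and on the IB interface
sysctl net.ipv4.ip_forward net.ipv4.conf.all.forwarding net.ipv4.conf.ib0.forwarding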

[root@blade08 ~]# grep ^VE_ROUTE_SRC_DEV /etc/vz/vz.conf
VE_ROUTE_SRC_DEV="ib0"

[root@blade09 ~]# grep ^VE_ROUTE_SRC_DEV /etc/vz/vz.conf
VE_ROUTE_SRC_DEV="ib0"


$ rpm -qa|egrep "vzctl|vzkernel"
vzkernel-2.6.32-042stab116.2.x86_64
vzctl-core-4.9.4-1.x86_64
vzctl-4.9.4-1.x86_64

Both servers (blade08 and blade09) and all CTs run CentOS 6.8 x86_64.

I would appreciate any hint on that issue.

Best regards,
Nikolay.


knawnd at gmail.com wrote on 01.07.2016 10:36:
> Hello all!
>
> I am trying to evaluate the possibility of dynamically deploying HPC clusters in the cloud using
> OpenVZ containers and InfiniBand (IB). I couldn't find a proper way to do that on the web. What I
> would like to achieve is the same functionality one gets with an Ethernet NIC, but with an IB
> device, i.e. share the same IB device for the venet network interfaces of the CTs. So I wonder
> whether it is possible at all to use IB for this kind of purpose, and if so, what steps need to
> be performed.
>
> Best regards, Nikolay.

