[Users] infiniband support in openvz containers
knawnd at gmail.com
Sat Jul 2 02:04:58 PDT 2016
Hello all!
I've decided to try IP over IB (IPoIB). There are two servers (blade08 and blade09), each with
eth0 and ib0 NICs. The problem is that two CTs deployed on different servers can't see each
other (checked with ping).
CT 152, deployed on host blade08 (whose ib0 IP address is 10.1.36.18), can ping the remote host
but can't ping any CT deployed on that remote host.
CT 120, deployed on host blade09 (whose ib0 IP address is 10.1.36.19), can ping neither the
remote host blade08 nor any CT deployed on it (CT 152). The two servers can ping each other via
the ib0 interface (checked with 'ping -I ib0 <remote server IP>'). iptables is turned off on
both servers and in all CTs.
The configuration is as follows.
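For clarity, the working and failing paths described above can be summarized with the following
illustrative commands (a sketch reconstructed from the description, not the exact session; IPs as
in the configs below):

```shell
# On blade09 (ib0 = 10.1.36.19): host-to-host over IB works...
ping -c 3 -I ib0 10.1.36.18           # replies: host blade08 is reachable
# ...but the remote container does not answer:
ping -c 3 10.1.36.52                  # no reply: CT 152 on blade08
# From inside CT 120 the remote container is unreachable as well:
vzctl exec 120 ping -c 3 10.1.36.52   # no reply
```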
[root at blade09 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr **************************
inet addr:192.93.36.239 Bcast:192.93.36.255 Mask:255.255.255.0
inet6 addr: **********************/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:364749 errors:0 dropped:0 overruns:0 frame:0
TX packets:146504 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:204560009 (195.0 MiB) TX bytes:148608765 (141.7 MiB)
Memory:c7420000-c743ffff
ib0 Link encap:InfiniBand HWaddr *****************************************
inet addr:10.1.36.19 Bcast:10.1.36.255 Mask:255.255.255.0
inet6 addr: **********************/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:15567403 errors:0 dropped:0 overruns:0 frame:0
TX packets:34644634 errors:0 dropped:2098 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:315873847681 (294.1 GiB) TX bytes:360607196041 (335.8 GiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1004 (1004.0 b) TX bytes:1004 (1004.0 b)
venet0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet6 addr: fe80::1/128 Scope:Link
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1
RX packets:46172203 errors:0 dropped:0 overruns:0 frame:0
TX packets:46172173 errors:0 dropped:3 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:49753326008 (46.3 GiB) TX bytes:49753323488 (46.3 GiB)
The same configuration is used on the second server (blade08), but with different IPs: eth0 -
192.93.36.238, ib0 - 10.1.36.18.
On each server I deployed CTs:
[root at blade08 ~]# vzlist -a
CTID NPROC STATUS IP_ADDR HOSTNAME
152 11 running 10.1.36.52 ct152
[root at blade09 ~]# vzlist -a
CTID NPROC STATUS IP_ADDR HOSTNAME
120 11 running 10.1.36.50 ct150
121 11 running 10.1.36.51 ct151
[root at blade09 ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.1.36.51 0.0.0.0 255.255.255.255 UH 0 0 0 venet0
10.1.36.50 0.0.0.0 255.255.255.255 UH 0 0 0 venet0
10.1.36.0 0.0.0.0 255.255.255.0 U 0 0 0 ib0
192.93.36.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 eth0
0.0.0.0 192.93.36.1 0.0.0.0 UG 0 0 0 eth0
[root at blade08 ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.1.36.52 0.0.0.0 255.255.255.255 UH 0 0 0 venet0
10.1.36.0 0.0.0.0 255.255.255.0 U 0 0 0 ib0
192.93.36.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
169.254.0.0 0.0.0.0 255.255.0.0 U 1002 0 0 eth0
0.0.0.0 192.93.36.1 0.0.0.0 UG 0 0 0 eth0
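One thing that may be worth checking in a venet-over-IPoIB setup (my speculation, not something
confirmed above): with venet, the host has to answer neighbour requests for each container IP on
the device the /32 routes point out of, and IPoIB's link layer differs from Ethernet. A quick
check, assuming the interface names above:

```shell
# Is ARP/neighbour proxying enabled globally and on ib0? (the ib0 entry
# only exists on the IB hosts, hence the fallback message)
for dev in all ib0; do
    f=/proc/sys/net/ipv4/conf/$dev/proxy_arp
    if [ -r "$f" ]; then
        echo "$dev proxy_arp=$(cat "$f")"
    else
        echo "$dev: not present on this machine"
    fi
done
# Per-IP proxy entries that vzctl may have added:
ip neigh show proxy
```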
[root at blade08 ~]# grep -i forward /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1
[root at blade09 ~]# grep -i forward /etc/sysctl.conf
# Controls IP packet forwarding
net.ipv4.ip_forward = 1
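Note that /etc/sysctl.conf only matters at boot or `sysctl -p` time; the running value can be
confirmed directly (standard Linux, nothing OpenVZ-specific):

```shell
# The running value must be 1 for the host to forward between venet0 and ib0:
sysctl -n net.ipv4.ip_forward
# Equivalent read straight from procfs:
cat /proc/sys/net/ipv4/ip_forward
```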
[root at blade08 ~]# grep ^VE_ROUTE_SRC_DEV /etc/vz/vz.conf
VE_ROUTE_SRC_DEV="ib0"
[root at blade09 ~]# grep ^VE_ROUTE_SRC_DEV /etc/vz/vz.conf
VE_ROUTE_SRC_DEV="ib0"
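As I understand it, VE_ROUTE_SRC_DEV tells vzctl which device's address to use as the source hint
on the per-CT host routes. On blade08 the resulting route for CT 152 should then look roughly
like this (an illustrative sketch, requires root; not a command from the post):

```shell
# Sketch of the route vzctl installs for CT 152 when VE_ROUTE_SRC_DEV="ib0";
# the src hint makes the host originate traffic to the CT from its ib0 address.
ip route replace 10.1.36.52 dev venet0 src 10.1.36.18
ip route get 10.1.36.52   # should show "dev venet0 src 10.1.36.18"
```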
$ rpm -qa|egrep "vzctl|vzkernel"
vzkernel-2.6.32-042stab116.2.x86_64
vzctl-core-4.9.4-1.x86_64
vzctl-4.9.4-1.x86_64
All servers (blade08 and blade09) and all CTs run CentOS 6.8 x86_64.
I would appreciate any hint on that issue.
Best regards,
Nikolay.
knawnd at gmail.com wrote on 01.07.2016 10:36:
> Hello all!
>
> I am trying to evaluate the possibility of dynamically deploying HPC clusters in the cloud using
> OpenVZ containers and InfiniBand (IB). I couldn't find a proper way to do that on the web. What
> I would like to achieve is the same functionality one can get with an Ethernet NIC, but with an IB
> device, i.e. share the same IB device for the CTs' venet network interfaces. So I wonder whether
> it's possible at all to use IB for such purposes, and if it is, what steps need to be performed.
>
> Best regards, Nikolay.