<div dir="ltr">On Wed, Jan 30, 2013 at 11:55 AM, P J <span dir="ltr"><<a href="mailto:pauljflists@gmail.com" target="_blank">pauljflists@gmail.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div dir="ltr">Hey Guys,<div><br></div><div>I've spent days and countless hours trying to figure out what is going on here and I'm totally out of ideas. I've tried posting on the OpenVZ forum, but for some reason the post has not been approved, so now I'm reaching out to you all.</div>
<div><br></div><div>I've even emailed OpenVZ for paid commercial support, I'm happy to pay for support for this issue as we have been using OpenVZ for many years problem free on many servers.</div><div>
<br></div><div>Here is the issue:</div><div><br></div><div>On one of our CentOS 5.9 x86_64 servers, last weekend I upgraded the kernel from 2.6.18-194.8.1.el5.028stab070.4 to the latest</div><div>2.6.18-308.8.2.el5.028stab101.1 (we use ksplice so I don't have to reboot the HN too often).</div>
<div><br></div><div><div>Installed VZ packages:</div><div><br></div><div><div>vzctl-core-4.1.2-1</div><div>ovzkernel-2.6.18-308.8.2.el5.028stab101.1</div><div>vzctl-4.1.2-1</div><div>ovzkernel-2.6.18-194.8.1.el5.028stab070.4</div>
<div>ovzkernel-devel-2.6.18-308.8.2.el5.028stab101.1</div><div>vzquota-3.1-1</div><div>ovzkernel-2.6.18-308.el5.028stab099.3</div><div>ovzkernel-2.6.18-238.9.1.el5.028stab089.1</div></div></div><div><br></div><div>
Upon booting the new 2.6.18-308.8.2.el5.028stab101.1 kernel, I started seeing strange kernel errors when starting the containers:</div><div><br></div><div><br></div><div>"Jan 26 20:46:06 vz02 kernel: CT: 103: started</div>
<div>Jan 26 20:46:08 vz02 kernel: CPT ERR: ffff81031e46a000,103 :NLMERR: -22</div><div>Jan 26 20:46:08 vz02 last message repeated 8 times</div><div>Jan 26 20:46:12 vz02 kernel: CT: 104: started</div><div>Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :open_listening_socket: sock_create_kern: -97</div>
<div>Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: open_listening_socket: -97</div><div>Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: -97</div><div>Jan 26 20:46:14 vz02 kernel: CT: 104: stopped</div>
<div>Jan 26 20:46:15 vz02 kernel: CT: 104: started</div><div>Jan 26 20:46:17 vz02 kernel: CT: 105: started</div><div>Jan 26 20:46:19 vz02 kernel: CPT ERR: ffff81032156a000,105 :NLMERR: -22</div><div>Jan 26 20:46:19 vz02 last message repeated 8 times</div>
<div>Jan 26 20:46:19 vz02 kernel: CPT ERR: ffff81032156a000,105 :open_listening_socket: sock_create_kern: -97</div><div>Jan 26 20:46:19 vz02 kernel: CPT ERR: ffff81032156a000,105 :rst_sockets: open_listening_socket: -97</div>
<div>Jan 26 20:46:19 vz02 kernel: CPT ERR: ffff81032156a000,105 :rst_sockets: -97</div><div>"<br></div><div><br></div><div>--snip-- this goes on and on...<br></div><div><br>The containers did start up, but networking was not working. I could not ping them, they could not ping out.</div>
<div>Their interfaces were up, most of them use venet.</div><div><br></div><div>So I did a vzctl restart on each container, they threw out the same error messages, but networking started working again.</div>
<div><br></div><div>What makes even less sense is, we have other identical servers, same hardware, same version of CentOS, same VZ kernel - no issues. What am I missing here?</div><div><br></div><div>
Here is a copy of our vz.conf, nothing fancy:</div><div><br></div><div>"</div><div><div>VIRTUOZZO=yes</div><div>LOCKDIR=/vz/lock</div><div>DUMPDIR=/vz/dump</div><div>VE0CPUUNITS=1000</div><div><br></div>
<div>## Logging parameters</div><div>LOGGING=yes</div><div>LOGFILE=/var/log/vzctl.log</div><div>LOG_LEVEL=0</div><div>VERBOSE=0</div><div><br></div><div>## Disk quota parameters</div><div>DISK_QUOTA=yes</div><div>VZFASTBOOT=no</div>
<div><br></div><div># Disable module loading. If set, vz initscript do not load any modules.</div><div>#MODULES_DISABLED=yes</div><div><br></div><div># The name of the device whose IP address will be used as source IP for CT.</div>
<div># By default automatically assigned.</div><div>#VE_ROUTE_SRC_DEV="eth0"</div><div><br></div><div># Controls which interfaces to send ARP requests and modify APR tables on.</div><div>NEIGHBOUR_DEVS=all</div>
<div>ERROR_ON_ARPFAIL="no"</div><div><br></div><div><br></div><div>## Template parameters</div><div>TEMPLATE=/vz/template</div><div><br></div><div>## Defaults for containers</div><div>VE_ROOT=/vz/root/$VEID</div>
<div>VE_PRIVATE=/vz/private/$VEID</div><div>CONFIGFILE="vps.basic"</div><div>DEF_OSTEMPLATE="fedora-core-4"</div><div><br></div><div>## Load vzwdog module</div><div>VZWDOG="no"</div><div><br>
</div><div>## IPv4 iptables kernel modules</div><div>#IPTABLES="ipt_REJECT ipt_tos ipt_limit ipt_multiport iptable_filter iptable_mangle ipt_TCPMSS ipt_tcpmss ipt_ttl ipt_length ipt_state iptable_nat "</div><div>
<br></div><div>IPTABLES="iptable_filter iptable_mangle ipt_limit ipt_multiport ipt_tos ipt_TOS ipt_REJECT ipt_TCPMSS ipt_tcpmss ipt_ttl ipt_LOG ipt_length ip_conntrack ip_conntrack_ftp ip_conntrack_irc ipt_conntrack ipt_state ipt_helper iptable_nat ip_nat_ftp ip_nat_irc ipt_REDIRECT ipt_MASQUERADE"</div>
<div><br></div><div>## Enable IPv6</div><div>IPV6="no"</div><div><br></div><div>## IPv6 ip6tables kernel modules</div><div>IP6TABLES="ip6_tables ip6table_filter ip6table_mangle ip6t_REJECT"</div></div>
<div>"</div><div><br></div><div>Here is an example container config:</div><div><br></div><div>"</div><div><div>ONBOOT="yes"</div><div><br></div><div>NUMPROC="5102:5102"</div>
<div>AVNUMPROC="2551:2551"</div><div>NUMTCPSOCK="5102:5102"</div><div>NUMOTHERSOCK="5102:5102"</div><div>VMGUARPAGES="262144:9223372036854775807"</div><div><br></div><div># Secondary parameters</div>
<div>KMEMSIZE="209012940:229914234"</div><div>TCPSNDBUF="48773188:69670980"</div><div>TCPRCVBUF="48773188:69670980"</div><div>OTHERSOCKBUF="24386594:45284386"</div><div>DGRAMRCVBUF="24386594:24386594"</div>
<div>OOMGUARPAGES="151485:9223372036854775807"</div><div>PRIVVMPAGES="256000:262140"</div><div><br></div><div># Auxiliary parameters</div><div>LOCKEDPAGES="10205:10205"</div><div>SHMPAGES="90891:90891"</div>
<div>PHYSPAGES="0:9223372036854775807"</div><div>NUMFILE="81632:81632"</div><div>NUMFLOCK="1000:1100"</div><div>NUMPTY="510:510"</div><div>NUMSIGINFO="1024:1024"</div><div>
DCACHESIZE="45650516:47020032"</div><div><br></div><div>NUMIPTENT="3072:3072"</div><div># Disk Resource Limits</div><div>DISKINODES="4560000:4800000"</div><div>DISKSPACE="39845888:41943040"</div>
<div><br></div><div># Quota Resource Limits</div><div>QUOTATIME="0"</div><div>QUOTAUGIDLIMIT="3000"</div><div><br></div><div># CPU Resource Limits</div><div>CPUUNITS="1000"</div><div>#RATE="eth0:1:6000"</div>
<div><br></div><div># IPTables config</div><div>IPTABLES="ipt_REJECT ipt_tos ipt_limit ipt_multiport iptable_filter iptable_mangle ipt_TCPMSS ipt_tcpmss ipt_ttl ipt_length ip_conntrack ip_conntrack_ftp ipt_LOG ipt_conntrack ipt_helper ipt_state iptable_nat ip_nat_ftp ipt_TOS ipt_REDIRECT"</div>
<div><br></div><div># Default Devices</div><div>#DEVICES="c:10:229:rw c:10:200:rw "</div><div><br></div><div>IP_ADDRESS="1.2.3.4"</div><div>HOSTNAME="<a href="http://nscache01.xxx.com" target="_blank">nscache01.xxx.com</a>"</div>
<div>VE_ROOT="/vz/root/$VEID"</div><div>VE_PRIVATE="/vz/private/$VEID"</div><div>OSTEMPLATE="centos-5-x86_64"</div><div>ORIGIN_SAMPLE="512mb"</div><div>NAMESERVER="1.2.3.4"</div>
<div>NAME="nscache01-xxx"</div><div>"</div></div><div><br></div><div>Any ideas? Configuration issue? Kernel bug?</div><div><br></div><div>I also noticed a container that used to run OpenVPN no longer works, so even though networking is now "working" something is still going on...</div>
<div><br></div><div>The one thing I did do before upgrading the kernel is I had to remove ploop as there were some dependency issues.</div><div>We don't use any ploop based containers so I don't believe this should affect us? We did the same on another server and had no issue...</div>
<div><br></div><div>If anyone who provides paid support (OpenVZ developers or anyone else) wants to respond to me directly, I'm of course happy to pay; I don't expect someone to spend hours troubleshooting something for free.</div>
<div><br></div><div>Or, of course if any other list readers have any ideas please let me know! :)</div><div><br>Thanks in advance for your help.</div><div><br></div><div>-PJF</div></div>
</blockquote></div><br></div><div class="gmail_extra" style>I posted on OpenVZ's Bugzilla last week and was hoping someone on the list could suggest where I should be looking.</div><div class="gmail_extra" style>
<br></div><div class="gmail_extra" style>Nobody has any ideas?</div><div class="gmail_extra" style><br></div><div class="gmail_extra" style>Most importantly:</div><div class="gmail_extra" style><br></div><div class="gmail_extra" style>
<pre class="" id="comment_text_0" style="white-space:pre-wrap;width:50em;color:rgb(0,0,0)">Jan 26 20:46:08 vz02 kernel: CPT ERR: ffff81031e46a000,103 :NLMERR: -22
Jan 26 20:46:08 vz02 last message repeated 8 times
Jan 26 20:46:12 vz02 kernel: CT: 104: started
Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :open_listening_socket: sock_create_kern: -97
Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: open_listening_socket: -97
Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: -97</pre><pre class="" id="comment_text_0" style="white-space:pre-wrap;width:50em;color:rgb(0,0,0)"><span style="font-family:arial">What are these errors?</span><br>
</pre><pre class="" id="comment_text_0" style="white-space:pre-wrap;width:50em;color:rgb(0,0,0)">Thanks in advance.</pre><pre class="" id="comment_text_0" style="white-space:pre-wrap;width:50em;color:rgb(0,0,0)">-PJF</pre>
</div></div>
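<div dir="ltr">For reference, the trailing numbers in the CPT ERR lines above look like negated Linux errno values, which is the usual convention for kernel return codes. Below is a minimal Python sketch, assuming that convention holds here, that pulls the container ID and error code out of each line and maps the two codes seen in this log (22 and 97) to their Linux names. The function name decode_cpt_errors and the small lookup table are illustrative only and are not part of OpenVZ.</div>

```python
import re

# Linux errno values for the two codes seen in the log excerpt
# (assumption: the kernel messages print negated errno values).
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    97: ("EAFNOSUPPORT", "Address family not supported by protocol"),
}

# Matches lines such as:
#   "... kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: -97"
CPT_ERR = re.compile(r"CPT ERR: [0-9a-f]+,(\d+) :(.+): (-\d+)$")


def decode_cpt_errors(lines):
    """Return (container ID, message, errno name) for each CPT ERR line."""
    results = []
    for line in lines:
        match = CPT_ERR.search(line.strip())
        if match is None:
            # Skips lines like "CT: 104: started" or "last message repeated".
            continue
        ctid, message, code = match.groups()
        name, _description = LINUX_ERRNO.get(-int(code), ("unknown", ""))
        results.append((int(ctid), message, name))
    return results


if __name__ == "__main__":
    sample = [
        "Jan 26 20:46:08 vz02 kernel: CPT ERR: ffff81031e46a000,103 :NLMERR: -22",
        "Jan 26 20:46:12 vz02 kernel: CT: 104: started",
        "Jan 26 20:46:14 vz02 kernel: CPT ERR: ffff81031e46a000,104 :rst_sockets: -97",
    ]
    for ctid, message, name in decode_cpt_errors(sample):
        print(ctid, message, name)
```

<div dir="ltr">Under that assumption, -22 would read as EINVAL (invalid argument) and -97 as EAFNOSUPPORT (address family not supported by protocol).</div>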