[Users] Need help with hanging servers

Scott Dowdle dowdle at montanalinux.org
Tue Jul 6 13:43:17 EDT 2010


Greetings,


----- Original Message -----
> Been lurking on the list for a bit before I posted. We are relatively
> new and light OpenVZ users. We have three physical boxes that use
> OpenVZ. One is the server that is home to our developers' environment.
> Each developer has his own container. We have the occasional container
> stop responding due to too many resources used, but the entire server
> is fine. That is almost always the devs' fault.
> 
> The other two installs we have are in production. They are sort of
> miscellaneous installation boxes. Things like cacti, nagios, misc web
> apps (web mail, etc.) as well as having containers for custom outgoing
> SMTP servers and running Gearman workers written in PHP on a dedicated
> container.
> 
> The management of OpenVZ is great. We love it. We just have one problem.
> On no regular schedule, the two production servers will hang. And it is
> a weird hang. They still respond to ping. And TCP connections answer
> (connect) but don't respond. So, our monitoring hangs for a while
> waiting on an answer. Likewise our load balancers don't see them as down
> for a while after they are not responding. It is just weird. I am hoping
> that is some clue for someone. There is nothing in syslog on the host
> server or any containers. There is nothing on the console. It sounds
> like a resource issue. We have tried moving containers around, leaving
> some off for a while, and other stuff to find the offending container.
> But, nothing has worked. One or the other locks up every 5-6 days. Not
> on a schedule like it is a particular cron job causing the problem.
> 
> I am sure it is something we have done. We have allocated something
> wrong most likely and just need to be slapped one good time and told NO!
> But, I don't know where to look. I will jump into the IRC channel too in
> case someone is willing to help me and wants some real time data.
> 
> Thanks in advance for any help.
> 
> System information below. If there is more information that may help
> solve this problem, let me know what to look for.
> 
> # uname -a
> Linux atl-vz1 2.6.18-028stab056 #1 SMP Tue Jun 30 07:50:32 EDT 2009
> x86_64 Intel(R) Xeon(R) CPU E5420 @ 2.50GHz GenuineIntel GNU/Linux
> 
> * sys-kernel/openvz-sources
> Latest version installed: 2.6.27.5.3
> 
> System Information
> Manufacturer: Dell Inc.
> Product Name: PowerEdge 2950
> 
> # free
>              total       used       free     shared    buffers     cached
> Mem:      32872312   26336688    6535624          0         12   20952484
> -/+ buffers/cache:    5384192   27488120
> Swap:      8388656          0    8388656
> 
> # vzlist -o ctid,kmemsize,kmemsize.l -s kmemsize
>       CTID   KMEMSIZE KMEMSIZE.L
>        119    2025130  115710537
>        116    2649072  231421075
>        118    3145806   28927633
>        111    3518587  115710537
>        112    8613133   57855268
>        121    8779664   57855268
>        120   10341711  115710537
>        122   10931070  231421075
>        117   11024345  231421075
>        113   22290970  231421075

You didn't mention whether any of the containers show non-zero failcnt values in /proc/user_beancounters. Do they?
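
In case it helps, here is a minimal sketch of one way to list them. It assumes the usual /proc/user_beancounters layout (a "Version:" line, a header line, then one row per resource with held, maxheld, barrier, limit and failcnt columns, where the first row of each container is prefixed with "CTID:"). The function name nonzero_failcnts is made up for illustration; it is not part of any OpenVZ tool.

```python
import os

BEANCOUNTERS = "/proc/user_beancounters"  # readable by root on the hardware node


def nonzero_failcnts(text):
    """Return (ctid, resource, failcnt) for every row whose failcnt is > 0.

    Assumes the layout described above; rows that do not have the
    expected six columns are skipped rather than guessed at.
    """
    hits = []
    ctid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # The first row of each container starts with "<ctid>:".
        if fields[0].endswith(":"):
            ctid = int(fields[0][:-1])
            fields = fields[1:]
        # Expect: resource held maxheld barrier limit failcnt
        if len(fields) != 6:
            continue
        resource, failcnt = fields[0], int(fields[5])
        if failcnt > 0:
            hits.append((ctid, resource, failcnt))
    return hits


if __name__ == "__main__" and os.path.exists(BEANCOUNTERS):
    with open(BEANCOUNTERS) as f:
        for ctid, resource, failcnt in nonzero_failcnts(f.read()):
            print(ctid, resource, failcnt)
```

Run it as root on the host node. Any line it prints means that container hit a barrier or limit at some point since it was started. Since failcnt is a running total, comparing the numbers before and after a hang would be more telling than a single snapshot.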

You probably already know this, but it doesn't hurt to mention: 2.6.27.x is not a "stable" OpenVZ kernel branch.

TYL,
-- 
Scott Dowdle
704 Church Street
Belgrade, MT 59714
(406)388-0827 [home]
(406)994-3931 [work]

