[Users] occasional high loadavg without any noticeable cpu/memory/io load

Tue May 22 05:06:45 EDT 2012

Actually, I wrote a small shell script that loops through the list of active
containers and prints the contents of each container's /proc/loadavg. It
started out as a more elaborate script intended to provide some of the
functionality of vzstat, a script I used to use with Virtuozzo.

You can download both scripts from
https://www.ourhelpdesk.net/downloads/z.tgz
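
For anyone who only wants the per-container load listing, the loop is
roughly the following. This is a minimal sketch, not the script from the
archive: it assumes the stock OpenVZ tools vzlist and vzctl are on the PATH
and that it runs as root on the hardware node. The function name
container_loadavgs is made up for illustration.

```shell
#!/bin/sh
# Sketch: print each running container's /proc/loadavg, prefixed by its CTID.
# Assumption: vzlist and vzctl (standard OpenVZ tools) are installed.

container_loadavgs() {
    count=0
    # vzlist -H -o ctid: running containers only, no header, CTID column only
    for ctid in $(vzlist -H -o ctid); do
        # Read the load average as seen from inside the container;
        # skip the container if the exec fails.
        load=$(vzctl exec "$ctid" cat /proc/loadavg) || continue
        printf '%s: %s\n' "$ctid" "$load"
        count=$((count + 1))
    done
    printf '%s active\n' "$count"
}

# Run automatically only where the OpenVZ tools are actually available.
if command -v vzlist >/dev/null 2>&1; then
    container_loadavgs
fi
```

Each output line is the CTID followed by the usual /proc/loadavg fields:
the 1, 5 and 15 minute load averages, running/total tasks, and the last PID.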

On Tue, May 22, 2012 at 3:15 PM, Steffan <general at ziggo.nl> wrote:

> Sorry, I don't have the answer for you.
>
> But can you tell me what command you used to see all the loads on your node?
>
> Thanks, Steffan
>
> *From:* users-bounces at openvz.org [mailto:users-bounces at openvz.org]
> *On Behalf Of* Rene Dokbua
> *Sent:* Monday, 21 May 2012 20:07
> *To:* users at openvz.org
> *Subject:* [Users] occasional high loadavg without any noticeable
> cpu/memory/io load
>
> Hello,
>
> I occasionally get this extreme load on one of our VPS servers. It is
> quite large: 4 full E31230 cores and 4 GB RAM, hosting ca. 400 websites
> plus parked/addon/subdomains.
>
> The hardware node has 12 active VPS servers, and most of the time things
> are chugging along just fine, something like this:
>
> 1401: 0.00 0.00 0.00 1/23 4561
> 1402: 0.02 0.05 0.05 1/57 16991
> 1404: 0.01 0.02 0.00 1/73 18863
> 1406: 0.07 0.13 0.06 1/39 31189
> 1407: 0.86 1.03 1.14 1/113 31460
> 1408: 0.17 0.17 0.18 1/79 32579
> 1409: 0.00 0.00 0.02 1/77 21784
> 1410: 0.01 0.02 0.00 1/60 7454
> 1413: 0.00 0.00 0.00 1/46 18579
> 1414: 0.00 0.00 0.00 1/41 23812
> 1415: 0.00 0.00 0.00 1/45 9831
> 1416: 0.05 0.02 0.00 1/59 11332
> 12 active
>
> The problem VPS is 1407. As you can see below, it uses only a little CPU
> and memory.
>
> top - 17:34:12 up 32 days, 12:21,  0 users,  load average: 0.78, 0.95, 1.09
> Tasks: 102 total,   4 running,  90 sleeping,   0 stopped,   8 zombie
> Cpu(s): 16.3%us,  2.9%sy,  0.4%ni, 78.5%id,  1.8%wa,  0.0%hi,  0.0%si,  0.1%st
> Mem:   4194304k total,  2550572k used,  1643732k free,        0k buffers
> Swap:  8388608k total,   105344k used,  8283264k free,  1793828k cached
>
> Also, iostat and vmstat show no particular I/O or swap activity.
>
> Now for the problem. Every once in a while the loadavg of this particular
> VPS shoots up to crazy values, 30 or more, and it becomes completely
> sluggish. The odd thing is that the load goes up for this VPS and starts
> spilling into other VPS servers on the same hardware node, but there is
> still no particular cpu/memory/io usage going on that I can see, and no
> particular network activity. In this example the load has fallen back to
> around 10, but it was much higher earlier.
>
>  16:19:44 up 32 days, 11:19,  3 users,  load average: 12.87, 19.11, 18.87
>
> 1401: 0.01 0.03 0.00 1/23 2876
> 1402: 0.00 0.11 0.13 1/57 15334
> 1404: 0.02 0.20 0.16 1/77 14918
> 1406: 0.01 0.13 0.10 1/39 29595
> 1407: 10.95 15.71 15.05 1/128 13950
> 1408: 0.36 0.52 0.57 1/81 27167
> 1409: 0.09 0.26 0.43 1/78 17851
> 1410: 0.09 0.17 0.18 1/61 4344
> 1413: 0.00 0.03 0.00 1/46 16539
> 1414: 0.01 0.01 0.00 1/41 22372
> 1415: 0.00 0.01 0.00 1/45 8404
> 1416: 0.05 0.10 0.11 1/58 9292
> 12 active
>
> top - 16:20:02 up 32 days, 11:07,  0 users,  load average: 9.14, 14.97, 14.82
> Tasks: 135 total,   1 running, 122 sleeping,   0 stopped,  12 zombie
> Cpu(s): 16.3%us,  2.9%sy,  0.4%ni, 78.5%id,  1.8%wa,  0.0%hi,  0.0%si,  0.1%st
> Mem:   4194304k total,  1173844k used,  3020460k free,        0k buffers
> Swap:  8388608k total,   115576k used,  8273032k free,   725144k cached
>
> Notice how the CPU is mostly idle and only about 1/4 of the available
> memory is being used.
>
> http://wiki.openvz.org/Ploop/Why explains: "One such property that
> deserves a special item in this list is file system journal. While journal
> is a good thing to have, because it helps to maintain file system integrity
> and improve reboot times (by eliminating fsck in many cases), it is also a
> bottleneck for containers. If one container will fill up in-memory journal
> (with lots of small operations leading to file metadata updates, e.g. file
> truncates), all the other containers I/O will block waiting for the journal
> to be written to disk. In some extreme cases we saw up to 15 seconds of
> such blockage." The problem I noticed lasts much longer than 15 seconds,
> though: typically 15-30 minutes, after which the load goes back to where
> it should be.
>
> Any suggestions on where I could look for the cause of this? It's not like
> it happens every day, maybe once or twice per month, but it's enough to
> cause customers to complain.
>
> Regards,
> Rene
>
> _______________________________________________
> Users mailing list
> Users at openvz.org
> https://openvz.org/mailman/listinfo/users
>
>