[Users] occasional high loadavg without any noticeable
cpu/memory/io load
Kir Kolyshkin
kir at openvz.org
Wed May 30 11:09:20 EDT 2012
On 05/22/2012 01:06 PM, Rene C. wrote:
>
> Actually I made a small shell script that loops through the list of
> active containers and outputs the content of each container's
> /proc/loadavg. It started out as a somewhat more elaborate script
> intended to provide some of the functionality of vzstat, a script
> that I used to use with Virtuozzo.
>
> You can download both scripts from
> https://www.ourhelpdesk.net/downloads/z.tgz
vzlist has a laverage field that might be of use, e.g.:
vzlist -o ctid,laverage
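
For anyone who wants the per-container /proc/loadavg view without
downloading the archive, a loop like the one described above can be
sketched as follows. This is a minimal sketch, not the script from
z.tgz: the function names print_load and show_loads are made up for
illustration, and it assumes vzlist and vzctl are in PATH on the
hardware node and that it is run as root.

```shell
#!/bin/sh
# Sketch only: print "CTID: <contents of /proc/loadavg>" for each
# running container, followed by a count of containers seen.

# Format one output line from a container ID and its loadavg string.
print_load() {
    printf '%s: %s\n' "$1" "$2"
}

# Loop over running containers (vzlist lists only running ones by
# default). Returns quietly if vzlist is not installed.
show_loads() {
    command -v vzlist >/dev/null 2>&1 || return 0
    count=0
    for ctid in $(vzlist -H -o ctid); do
        print_load "$ctid" "$(vzctl exec "$ctid" cat /proc/loadavg)"
        count=$((count + 1))
    done
    echo "$count active"
}

show_loads
```

Running it under watch(1) gives a rough live view of which container
the load is coming from.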
>
>
>
> On Tue, May 22, 2012 at 3:15 PM, Steffan <general at ziggo.nl
> <mailto:general at ziggo.nl>> wrote:
>
> Sorry, I don't have the answer for you.
>
> But can you tell me what command you used to see all loads on your
> node?
>
> Thanks, Steffan
>
> *From:* users-bounces at openvz.org
> [mailto:users-bounces at openvz.org] *On behalf of* Rene Dokbua
> *Sent:* Monday, 21 May 2012 20:07
> *To:* users at openvz.org
> *Subject:* [Users] occasional high loadavg without any
> noticeable cpu/memory/io load
>
> Hello,
>
> I occasionally get this extreme load on one of our VPS servers. It
> is quite large, 4 full E31230 cores, 4 GB RAM and hosting ca. 400
> websites + parked/addon/subdomains.
>
> The hardware node has 12 active VPS servers and most of the time
> things are chugging along just fine, something like this.
>
> 1401: 0.00 0.00 0.00 1/23 4561
>
> 1402: 0.02 0.05 0.05 1/57 16991
>
> 1404: 0.01 0.02 0.00 1/73 18863
>
> 1406: 0.07 0.13 0.06 1/39 31189
>
> 1407: 0.86 1.03 1.14 1/113 31460
>
> 1408: 0.17 0.17 0.18 1/79 32579
>
> 1409: 0.00 0.00 0.02 1/77 21784
>
> 1410: 0.01 0.02 0.00 1/60 7454
>
> 1413: 0.00 0.00 0.00 1/46 18579
>
> 1414: 0.00 0.00 0.00 1/41 23812
>
> 1415: 0.00 0.00 0.00 1/45 9831
>
> 1416: 0.05 0.02 0.00 1/59 11332
>
> 12 active
>
> The problem VPS is 1407. As you can see below it only uses a bit
> of the cpu and memory.
>
> top - 17:34:12 up 32 days, 12:21, 0 users, load average: 0.78,
> 0.95, 1.09
>
> Tasks: 102 total, 4 running, 90 sleeping, 0 stopped, 8 zombie
>
> Cpu(s): 16.3%us, 2.9%sy, 0.4%ni, 78.5%id, 1.8%wa, 0.0%hi,
> 0.0%si, 0.1%st
>
> Mem: 4194304k total, 2550572k used, 1643732k free, 0k
> buffers
>
> Swap: 8388608k total, 105344k used, 8283264k free, 1793828k
> cached
>
> Also, iostat and vmstat show no particular io or swap activity.
>
> Now for the problem. Every once in a while the loadavg of this
> particular VPS shoots up to crazy values, 30 or more, and it
> becomes completely sluggish. The odd thing is that load goes up for
> the VPS server and starts spilling into other VPS servers on the same
> hardware node, but there is still no particular cpu/memory/io
> usage going on that I can see, and no particular network activity.
> In this example load has fallen back to around 10, but it was much
> higher earlier.
>
> 16:19:44 up 32 days, 11:19, 3 users, load average: 12.87,
> 19.11, 18.87
>
> 1401: 0.01 0.03 0.00 1/23 2876
>
> 1402: 0.00 0.11 0.13 1/57 15334
>
> 1404: 0.02 0.20 0.16 1/77 14918
>
> 1406: 0.01 0.13 0.10 1/39 29595
>
> 1407: 10.95 15.71 15.05 1/128 13950
>
> 1408: 0.36 0.52 0.57 1/81 27167
>
> 1409: 0.09 0.26 0.43 1/78 17851
>
> 1410: 0.09 0.17 0.18 1/61 4344
>
> 1413: 0.00 0.03 0.00 1/46 16539
>
> 1414: 0.01 0.01 0.00 1/41 22372
>
> 1415: 0.00 0.01 0.00 1/45 8404
>
> 1416: 0.05 0.10 0.11 1/58 9292
>
> 12 active
>
> top - 16:20:02 up 32 days, 11:07, 0 users, load average: 9.14,
> 14.97, 14.82
>
> Tasks: 135 total, 1 running, 122 sleeping, 0 stopped, 12 zombie
>
> Cpu(s): 16.3%us, 2.9%sy, 0.4%ni, 78.5%id, 1.8%wa, 0.0%hi,
> 0.0%si, 0.1%st
>
> Mem: 4194304k total, 1173844k used, 3020460k free, 0k
> buffers
>
> Swap: 8388608k total, 115576k used, 8273032k free, 725144k cached
>
> Notice how cpu is plenty idle, and only 1/4 of the available
> memory is being used.
>
> http://wiki.openvz.org/Ploop/Why explains "One such property that
> deserves a special item in this list is file system journal. While
> journal is a good thing to have, because it helps to maintain file
> system integrity and improve reboot times (by eliminating fsck in
> many cases), it is also a bottleneck for containers. If one
> container will fill up in-memory journal (with lots of small
> operations leading to file metadata updates, e.g. file truncates),
> all the other containers I/O will block waiting for the journal to
> be written to disk. In some extreme cases we saw up to 15 seconds
> of such blockage." The problem I noticed lasts much longer than
> 15 seconds, though: typically 15-30 minutes, after which load goes
> back to where it should be.
>
> Any suggestions on where I could look for the cause of this? It's
> not like it happens every day, maybe once or twice per month, but
> it's enough to cause customers to complain.
>
> Regards,
> Rene