[Users] Host IO delay high - Beancounters failure

Vasily Averin vvs at virtuozzo.com
Wed Dec 9 22:58:45 PST 2015


Welcome, Herr Spanka.

In such cases it is important to understand what exactly
is blocking other processes outside the affected container.
Your issue does not look memory-related:
privvmpages messages and fail counters should not cause the described problem.

Probably your container had some other activity too
and consumed all of the disk I/O or the whole network bandwidth.
The container could also have consumed all CPU resources;
CPU is not limited by beancounters.
Do you have any such statistics on your node? Can you check?
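If you do not already collect such statistics, a minimal sketch of how to sample them during a stall follows. It assumes a POSIX shell on the host node with vzctl installed (vzlist ships with it) and sysstat available for iostat; CTID 371 is taken from the beancounter dump quoted below.

```shell
#!/bin/sh
# Sample per-container load and host-wide disk utilisation during a stall.
# CTID 371 is the container from the original report (an assumption here).
CTID=371

# Per-container process count and load average (vzlist ships with vzctl).
if command -v vzlist >/dev/null 2>&1; then
    vzlist -o ctid,numproc,laverage "$CTID"
fi

# Host-wide disk utilisation: watch the %util and await columns.
if command -v iostat >/dev/null 2>&1; then
    iostat -x 1 5
fi

# Per-container CPU time (OpenVZ kernels expose it here).
if [ -r /proc/vz/vestat ]; then
    cat /proc/vz/vestat
fi
```

Running this while the host is stalled should show which container is holding the disk busy or burning CPU.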

On native Virtuozzo we have traffic shaping, which allows limiting a container's outgoing traffic.
You can also limit disk I/O for the affected container; ploop allows this.
Container CPU usage can be limited too -- please check it.
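For reference, all three limits can be applied with vzctl. The values below are purely illustrative, not recommendations, and the traffic-shaping rate is only honoured when shaping is enabled on the node:

```shell
#!/bin/sh
# Illustrative vzctl limits for the affected container (example values only).
CTID=371

if command -v vzctl >/dev/null 2>&1; then
    # Cap CPU at 100% of one core.
    vzctl set "$CTID" --cpulimit 100 --save
    # Throttle disk I/O bandwidth to 10 MB/s (needs vzctl with I/O limit support).
    vzctl set "$CTID" --iolimit 10M --save
    # Lower the container's I/O priority (0 = lowest, 7 = highest).
    vzctl set "$CTID" --ioprio 3 --save
    # Shape outgoing traffic: device:class:rate in Kbits/s.
    vzctl set "$CTID" --rate eth0:1:10240 --ratebound yes --save
fi
```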

Unfortunately we do not know what the 2.6.32-166 Proxmox kernel is; it is probably based on one of our old kernels.
In general I would advise you to use the latest version of our OpenVZ kernel, 2.6.32-042stab113.10:
https://openvz.org/Download/kernel/rhel6-testing/042stab113.10

Thank you,
	Vasily Averin

PS. If you observe the problem next time, please try to get a list of blocked processes using the Magic SysRq key (Alt+SysRq+W).
You can press it on the local console (if you have direct access to the affected server)
or trigger it via "echo w > /proc/sysrq-trigger" if you have a working shell.
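A short sketch of the second method, assuming root on the host node; it also enables the SysRq interface first, since some distributions restrict it by default:

```shell
#!/bin/sh
# Dump blocked (D-state) tasks via SysRq-W and read the trace back (needs root).
TRIGGER=/proc/sysrq-trigger

if [ -w "$TRIGGER" ]; then
    # Make sure the SysRq interface is fully enabled (1 = all functions).
    echo 1 > /proc/sys/kernel/sysrq
    # 'w' dumps all tasks in uninterruptible (blocked) state.
    echo w > "$TRIGGER"
    # The task list and stack traces land in the kernel log.
    dmesg | tail -n 100
fi
```

The stack traces show what each blocked process is waiting on, which is exactly what we need to see here.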

On 09.12.2015 17:31, Henry Spanka wrote:
> Hey OpenVZ users,
> 
> I’m currently encountering a weird issue but don’t know how to fix it.
> Some containers on our node freak out sometimes. That’s not the issue; it’s customer-related.
> However, they take the node almost down. The host node has an I/O delay of 30-50% at that time and needs about 5 minutes to
> become stable again. SSH login is (almost) impossible, and the shell does not run properly.
> 
> Dmesg reports the following:
> __ratelimit: 1892 callbacks suppressed
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> 
> After taking a look at the bean counters of that container I got the following:
> ----------------------------------------------------------------
> CT 371       | HELD Bar% Lim%| MAXH Bar% Lim%| BAR | LIM | FAIL
> -------------+---------------+---------------+-----+-----+------
>      kmemsize|45.8M   -    - | 223M   -    - |   - |   - |    -
>   lockedpages|   -    -    - |  32K   -    - |   4G|   4G|    -
>   privvmpages|3.25G  27%  27%|  12G 100% 100%|  12G|  12G|  303K
>      shmpages| 114M   -    - | 147M   -    - |   - |   - |    -
>       numproc| 127    -    - | 318    -    - |   - |   - |    -
>     physpages|1.08G   -   26%|   4G   -  100%|   - |   4G|    -
>   vmguarpages|   -    -    - |   -    -    - |   8G|   - |    -
> oomguarpages|1008M  24%   - |2.45G  61%   - |   4G|   - |    -
>    numtcpsock|  31    -    - | 173    -    - |   - |   - |    -
>      numflock|  21    -    - |  48    -    - |   - |   - |    -
>        numpty|   -    -    - |   1    -    - |   - |   - |    -
>    numsiginfo|   -    -    - | 102    -    - |   - |   - |    -
>     tcpsndbuf|1.22M   -    - |16.6M   -    - |   - |   - |    -
>     tcprcvbuf| 496K   -    - | 2.7M   -    - |   - |   - |    -
> othersockbuf|81.3K   -    - |5.72M   -    - |   - |   - |    -
>   dgramrcvbuf|   -    -    - | 117K   -    - |   - |   - |    -
> numothersock|  66    -    - | 284    -    - |   - |   - |    -
>    dcachesize|24.8M   -    - | 178M   -    - |   - |   - |    -
>       numfile|2.69K   -    - |3.13K   -    - |   - |   - |    -
>     numiptent|  62    -    - |  62    -    - |   - |   - |    -
>     swappages| 367M   -    9%| 986M   -   24%|   - |   4G|    -
> 
> A failure count of 300,000 on privvmpages is not normal. However, I’m using vSwap and the RAM is limited to 12G.
> The node has 30GB of 64GB free, so that’s not the issue.
> 
> Does anyone have a clue why the host is almost going down? A single container shouldn’t affect the host’s performance.
> 
> Currently running pve-kernel-2.6.32-43-pve (2.6.32-166) with vzctl 4.9-4.
> 
> Containers are using the ploop layout and reside on LVM (the root LVM partition).
> 
> Thank you for your time.
> -----------------------------------------------------------------------------------------
> 
> If you have any further questions, please let us know.
> 
> Mit freundlichen Grüßen / With best regards
> Henry Spanka | myVirtualserver Development Team
