[Users] Can't create directory /sys/fs/cgroup/memory/machine.slice/ Cannot allocate memory

Kir Kolyshkin kir at openvz.org
Fri Feb 12 04:05:18 MSK 2021


On Thu, Feb 11, 2021 at 12:59 AM Сергей Мамонов <mrqwer88 at gmail.com> wrote:
>
> And after migrating all containers to another node, it still shows 63745 cgroups -
>
> cat /proc/cgroups
> #subsys_name hierarchy num_cgroups enabled
> cpuset 7 2 1
> cpu 10 2 1
> cpuacct 10 2 1
> memory 2 63745 1

Looks like a leak (or a bug in memory accounting which prevents
cgroups from being released).
You can check the number of memory cgroups with something like

find /sys/fs/cgroup/memory -type d | wc -l

If you see a large number, go explore those cgroups (check
cgroup.procs, memory.usage_in_bytes).
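
For example, something like this (an untested sketch; paths assume the
standard cgroup-v1 memory hierarchy) lists cgroups that have no tasks
attached but still hold charged memory, which is the usual sign of
leaked/offline memory cgroups:

find /sys/fs/cgroup/memory -type d | while read -r d; do
    # no tasks attached, but memory is still charged to the cgroup
    procs=$(cat "$d/cgroup.procs" 2>/dev/null)
    usage=$(cat "$d/memory.usage_in_bytes" 2>/dev/null)
    if [ -z "$procs" ] && [ "${usage:-0}" -gt 0 ]; then
        echo "$usage $d"
    fi
done | sort -rn | head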

> devices 11 2 1
> freezer 17 2 1
> net_cls 12 2 1
> blkio 1 4 1
> perf_event 13 2 1
> hugetlb 14 2 1
> pids 3 68 1
> ve 6 1 1
> beancounter 4 3 1
> net_prio 12 2 1
>
> On Wed, 10 Feb 2021 at 18:47, Сергей Мамонов <mrqwer88 at gmail.com> wrote:
>>
>> And this is definitely it -
>> grep -E "memory|num_cgroups" /proc/cgroups
>> #subsys_name hierarchy num_cgroups enabled
>> memory 2 65534 1
>>
>> After migrating some of the containers to another node, num_cgroups went down to 65365, and the stopped container could be started without the
>> `Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882: Cannot allocate memory` error.
>>
>> But I still don't understand why num_cgroups for memory is so big.
>>
>> That is roughly ~460 per container, instead of 60 or fewer per container on other nodes (with the same kernel version too).
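>>
>> For comparison, a rough way to count memory cgroups per container
>> (assuming they all live under machine.slice, as on this node):
>>
>> for d in /sys/fs/cgroup/memory/machine.slice/*/; do
>>     echo "$(find "$d" -type d | wc -l) $d"
>> done | sort -rn | head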
>>
>> On Wed, 10 Feb 2021 at 17:48, Сергей Мамонов <mrqwer88 at gmail.com> wrote:
>>>
>>> Hello!
>>>
>>> Looks like we reproduced this problem too.
>>>
>>> kernel - 3.10.0-1127.18.2.vz7.163.46
>>>
>>> Same error -
>>> Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882: Cannot allocate memory
>>>
>>> The same OK output for
>>> /sys/fs/cgroup/memory/*limit_in_bytes
>>> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>>>
>>> The node has a lot of free memory (per NUMA node too).
>>>
>>> Only one thing looks really strange -
>>> grep -E "memory|num_cgroups" /proc/cgroups
>>> #subsys_name hierarchy num_cgroups enabled
>>> memory 2 65534 1
>>>
>>> num_cgroups is huge only on this node
>>>
>>> cat /proc/cgroups
>>> #subsys_name hierarchy num_cgroups enabled
>>> cpuset 7 144 1
>>> cpu 10 263 1
>>> cpuacct 10 263 1
>>> memory 2 65534 1
>>> devices 11 1787 1
>>> freezer 17 144 1
>>> net_cls 12 144 1
>>> blkio 1 257 1
>>> perf_event 13 144 1
>>> hugetlb 14 144 1
>>> pids 3 2955 1
>>> ve 6 143 1
>>> beancounter 4 143 1
>>> net_prio 12 144 1
>>>
>>> On Thu, 28 Jan 2021 at 14:22, Konstantin Khorenko <khorenko at virtuozzo.com> wrote:
>>>>
>>>> Maybe you hit a memory shortage in a particular NUMA node only, for example.
>>>>
>>>> # numactl --hardware
>>>> # numastat -m
>>>>
>>>>
>>>> Or go the hard way and trace the kernel to see where exactly we get -ENOMEM:
>>>>
>>>> trace the kernel function cgroup_mkdir() using /sys/kernel/debug/tracing/
>>>> with the function_graph tracer.
>>>>
>>>>
>>>> https://lwn.net/Articles/370423/
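>>>>
>>>> A rough sketch of that ftrace setup (assuming debugfs is mounted at
>>>> /sys/kernel/debug and cgroup_mkdir is listed in available_filter_functions
>>>> on this kernel):
>>>>
>>>> cd /sys/kernel/debug/tracing
>>>> echo function_graph > current_tracer
>>>> echo cgroup_mkdir > set_graph_function
>>>> echo 1 > tracing_on
>>>> # reproduce the failing "vzctl start" here, then:
>>>> echo 0 > tracing_on
>>>> less trace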
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Konstantin Khorenko,
>>>> Virtuozzo Linux Kernel Team
>>>>
>>>> On 01/28/2021 12:43 PM, Joe Dougherty wrote:
>>>>
>>>> I checked that; it doesn't appear to be the case.
>>>>
>>>> # pwd
>>>> /sys/fs/cgroup/memory
>>>> # cat *limit_in_bytes
>>>> 9223372036854771712
>>>> 9223372036854767616
>>>> 2251799813685247
>>>> 2251799813685247
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> # cat *failcnt
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>>
>>>> # pwd
>>>> /sys/fs/cgroup/memory/machine.slice
>>>> # cat *limit_in_bytes
>>>> 9223372036854771712
>>>> 9223372036854767616
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> 9223372036854771712
>>>> # cat *failcnt
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>> 0
>>>>
>>>>
>>>>
>>>> On Thu, Jan 28, 2021 at 2:47 AM Konstantin Khorenko <khorenko at virtuozzo.com> wrote:
>>>>>
>>>>> Hi Joe,
>>>>>
>>>>> I'd suggest checking the memory limits for the root and "machine.slice" memory cgroups:
>>>>>
>>>>> /sys/fs/cgroup/memory/*limit_in_bytes
>>>>> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>>>>>
>>>>> All of them should be unlimited.
>>>>>
>>>>> If not, find out what is limiting them.
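>>>>>
>>>>> For example (just a guess at a likely culprit), if systemd set a limit
>>>>> on the slice, something like this should show it:
>>>>>
>>>>> systemctl show machine.slice -p MemoryLimit
>>>>> grep . /sys/fs/cgroup/memory/memory.limit_in_bytes /sys/fs/cgroup/memory/machine.slice/memory.limit_in_bytes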
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> Konstantin Khorenko,
>>>>> Virtuozzo Linux Kernel Team
>>>>>
>>>>> On 01/27/2021 10:28 PM, Joe Dougherty wrote:
>>>>>
>>>>> I'm running into an issue on only one of my OpenVZ 7 nodes: it is unable to create a directory under /sys/fs/cgroup/memory/machine.slice due to "Cannot allocate memory"
>>>>> whenever I try to start a new container or restart an existing one. I've been trying to research this, but I'm unable to find any concrete info on what could cause it.
>>>>> It appears to be memory related, because sometimes issuing "echo 1 > /proc/sys/vm/drop_caches" lets me start a container (this only works sometimes), yet my RAM usage
>>>>> is extremely low with no swapping (swappiness is even set to 0 for testing). Thank you in advance for your help.
>>>>>
>>>>>
>>>>> Example:
>>>>> # vzctl start 9499
>>>>> Starting Container ...
>>>>> Mount image: /vz/private/9499/root.hdd
>>>>> Container is mounted
>>>>> Can't create directory /sys/fs/cgroup/memory/machine.slice/9499: Cannot allocate memory
>>>>> Unmount image: /vz/private/9499/root.hdd (190)
>>>>> Container is unmounted
>>>>> Failed to start the Container
>>>>>
>>>>>
>>>>> Node Info:
>>>>> Uptime:      10 days
>>>>> OS:          Virtuozzo 7.0.15
>>>>> Kernel:      3.10.0-1127.18.2.vz7.163.46 GNU/Linux
>>>>> System Load: 3.1
>>>>> /vz Usage:   56% of 37T
>>>>> Swap Usage:  0%
>>>>> RAM Free:    84% of 94.2GB
>>>>>
>>>>> # free -m
>>>>>               total        used        free      shared  buff/cache   available
>>>>> Mem:          96502       14259       49940         413       32303       80990
>>>>> Swap:         32767          93       32674
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Joe Dougherty
>>>> Chief Operating Officer
>>>> Secure Dragon LLC
>>>> www.SecureDragon.net
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Sergei Mamonov
>>
>>
>>
>> --
>> Best Regards,
>> Sergei Mamonov
>
>
>
> --
> Best Regards,
> Sergei Mamonov


