[Users] Can't create directory /sys/fs/cgroup/memory/machine.slice/ Cannot allocate memory

Сергей Мамонов mrqwer88 at gmail.com
Wed Feb 10 18:47:17 MSK 2021


And this is definitely the cause -
grep -E "memory|num_cgroups" /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
memory 2 65534 1

After migrating some of the containers to another node, num_cgroups went down
to 65365, and the stopped container could then be started without the
`Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882:
Cannot allocate memory` error.
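For what it's worth, the counter can be pulled out with a one-liner like the
sketch below (the heredoc-style sample just mirrors the grep output above so
it runs anywhere; on a live node feed awk /proc/cgroups directly):

```shell
# Extract num_cgroups for the memory controller.
# The sample string mimics /proc/cgroups; on a real node use:
#   awk '$1 == "memory" { print $3 }' /proc/cgroups
sample='#subsys_name hierarchy num_cgroups enabled
memory 2 65534 1'
printf '%s\n' "$sample" | awk '$1 == "memory" { print $3 }'
# -> 65534
```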

But I still don't understand why num_cgroups for memory is so big.

It is ~460 per container here, versus 60 or fewer per container on other
nodes (with the same kernel version).
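To see which container the cgroups pile up under, counting directories per
slice should work. This is a sketch that builds a throwaway mock tree so it
runs anywhere; on the affected node, point BASE at
/sys/fs/cgroup/memory/machine.slice and drop the mkdir/rm lines (the IDs
1000133882 and 9499 are just placeholders taken from this thread):

```shell
# Count memory-cgroup directories under each per-container slice,
# leakiest container first.
BASE=$(mktemp -d)   # stand-in for /sys/fs/cgroup/memory/machine.slice
mkdir -p "$BASE/1000133882/a/b" "$BASE/1000133882/c" "$BASE/9499"
for ct in "$BASE"/*/; do
  n=$(find "$ct" -type d | wc -l | tr -d ' ')   # includes the slice dir itself
  printf '%s %s\n' "$n" "$ct"
done | sort -rn
rm -rf "$BASE"
```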

On Wed, 10 Feb 2021 at 17:48, Сергей Мамонов <mrqwer88 at gmail.com> wrote:

> Hello!
>
> Looks like we reproduced this problem too.
>
> kernel - 3.10.0-1127.18.2.vz7.163.46
>
> Same error -
> Can't create directory /sys/fs/cgroup/memory/machine.slice/1000133882:
> Cannot allocate memory
>
> Same ok output for
> /sys/fs/cgroup/memory/*limit_in_bytes
> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>
> There is a lot of free memory on the node (per NUMA node too).
>
> Only that looks really strange -
> grep -E "memory|num_cgroups" /proc/cgroups
> #subsys_name hierarchy num_cgroups enabled
> memory 2 65534 1
>
> num_cgroups is huge only on this node:
>
> cat /proc/cgroups
> #subsys_name hierarchy num_cgroups enabled
> cpuset 7 144 1
> cpu 10 263 1
> cpuacct 10 263 1
> memory 2 65534 1
> devices 11 1787 1
> freezer 17 144 1
> net_cls 12 144 1
> blkio 1 257 1
> perf_event 13 144 1
> hugetlb 14 144 1
> pids 3 2955 1
> ve 6 143 1
> beancounter 4 143 1
> net_prio 12 144 1
>
> On Thu, 28 Jan 2021 at 14:22, Konstantin Khorenko <khorenko at virtuozzo.com>
> wrote:
>
>> Maybe you hit a memory shortage in a particular NUMA node only, for
>> example.
>>
>> # numactl --hardware
>> # numastat -m
>>
>>
>> Or go the hard way and trace the kernel to find where exactly we get -ENOMEM:
>>
>> Trace the kernel function cgroup_mkdir() via /sys/kernel/debug/tracing/
>> with the function_graph tracer.
>>
>>
>> https://lwn.net/Articles/370423/
>>
>> --
>> Best regards,
>>
>> Konstantin Khorenko,
>> Virtuozzo Linux Kernel Team
>>
>> On 01/28/2021 12:43 PM, Joe Dougherty wrote:
>>
>> I checked that; it doesn't appear to be the case.
>>
>> # pwd
>> /sys/fs/cgroup/memory
>> # cat *limit_in_bytes
>> 9223372036854771712
>> 9223372036854767616
>> 2251799813685247
>> 2251799813685247
>> 9223372036854771712
>> 9223372036854771712
>> 9223372036854771712
>> # cat *failcnt
>> 0
>> 0
>> 0
>> 0
>> 0
>>
>> # pwd
>> /sys/fs/cgroup/memory/machine.slice
>> # cat *limit_in_bytes
>> 9223372036854771712
>> 9223372036854767616
>> 9223372036854771712
>> 9223372036854771712
>> 9223372036854771712
>> 9223372036854771712
>> 9223372036854771712
>> # cat *failcnt
>> 0
>> 0
>> 0
>> 0
>> 0
>>
>>
>>
>> On Thu, Jan 28, 2021 at 2:47 AM Konstantin Khorenko <
>> khorenko at virtuozzo.com> wrote:
>>
>>> Hi Joe,
>>>
>>> I'd suggest checking the memory limits for the root and "machine.slice"
>>> memory cgroups:
>>>
>>> /sys/fs/cgroup/memory/*limit_in_bytes
>>> /sys/fs/cgroup/memory/machine.slice/*limit_in_bytes
>>>
>>> All of them should be unlimited.
>>>
>>> If not, find out what is limiting them.
>>>
>>> --
>>> Best regards,
>>>
>>> Konstantin Khorenko,
>>> Virtuozzo Linux Kernel Team
>>>
>>> On 01/27/2021 10:28 PM, Joe Dougherty wrote:
>>>
>>> I'm running into an issue on only 1 of my OpenVZ 7 nodes where it's
>>> unable to create a directory under /sys/fs/cgroup/memory/machine.slice due to
>>> "Cannot allocate memory" whenever I try to start a new container or restart
>>> an existing one. I've been trying to research this but I'm unable to find
>>> any concrete info on what could cause this. It appears to be memory related
>>> because sometimes if I issue "echo 1 > /proc/sys/vm/drop_caches" it allows me
>>> to start a container (this only works sometimes), but my RAM usage is
>>> extremely low with no swapping (swappiness is even set to 0 for testing).
>>> Thank you in advance for your help.
>>>
>>>
>>> Example:
>>> # vzctl start 9499
>>> Starting Container ...
>>> Mount image: /vz/private/9499/root.hdd
>>> Container is mounted
>>> Can't create directory /sys/fs/cgroup/memory/machine.slice/9499: Cannot
>>> allocate memory
>>> Unmount image: /vz/private/9499/root.hdd (190)
>>> Container is unmounted
>>> Failed to start the Container
>>>
>>>
>>> Node Info:
>>> Uptime:      10 days
>>> OS:          Virtuozzo 7.0.15
>>> Kernel:      3.10.0-1127.18.2.vz7.163.46 GNU/Linux
>>> System Load: 3.1
>>> /vz Usage:   56% of 37T
>>> Swap Usage:  0%
>>> RAM Free:    84% of 94.2GB
>>>
>>> # free -m
>>>               total        used        free      shared  buff/cache   available
>>> Mem:          96502       14259       49940         413       32303       80990
>>> Swap:         32767          93       32674
>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at openvz.org
>>> https://lists.openvz.org/mailman/listinfo/users
>>>
>>
>>
>> --
>> -Joe Dougherty
>> Chief Operating Officer
>> Secure Dragon LLC
>> www.SecureDragon.net
>>
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at openvz.org
>> https://lists.openvz.org/mailman/listinfo/users
>>
>
>
> --
> Best Regards,
> Sergei Mamonov
>


-- 
Best Regards,
Sergei Mamonov