[Devel] [PATCH RHEL7 COMMIT] netlink: Make all in-cg memory be kmem accounted

Konstantin Khorenko khorenko at virtuozzo.com
Fri Jun 5 12:56:05 PDT 2015


The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.10
------>
commit f82248c72ab367c1bf37844a5457e8255237e9b6
Author: Pavel Emelyanov <xemul at parallels.com>
Date:   Fri Jun 5 23:56:05 2015 +0400

    netlink: Make all in-cg memory be kmem accounted
    
    So, this one is tricky. Right now most (all but one place) of the
    memory allocations in netlink code happen in process context and
    are done via kmalloc/slub. Thus they are auto-accounted into kmem.
    
    The single exceptional place is in netlink_alloc_large_skb, where
    large outgoing packets are allocated with vmalloc. The good news
    is that the only use case for it right now seems to be the
    newest netfilter user space code, which tries to load HUGE netfilter
    tables into the kernel via the netlink API.
    
    Since this is very unlikely to be the case for our containers, we can
    just disable this new feature (it appeared in 3.10 with c05cdb1b86)
    for everyone but the host.
    
    One more pain point here is mapped sockets. They are also relatively new,
    and pages that belong to the kernel but are mapped into a process VM are
    not tracked. This is like memory that is vmsplice-d into a pipe and then
    unmapped: it also goes unaccounted but still occupies space. Both issues
    are worth revisiting.
    
    https://jira.sw.ru/browse/PSBM-33584
    
    Signed-off-by: Pavel Emelyanov <xemul at parallels.com>
---
 net/netlink/af_netlink.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 94d635f..734a68a 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1561,7 +1561,13 @@ static struct sk_buff *netlink_alloc_large_skb(unsigned int size,
 	struct sk_buff *skb;
 	void *data;
 
-	if (size <= NLMSG_GOODSIZE || broadcast)
+	if (size <= NLMSG_GOODSIZE || broadcast ||
+			/*
+			 * Once we have vmalloc_kmem() that would account
+			 * allocated pages into memcg, this check can be
+			 * removed.
+			 */
+			!ve_is_super(get_exec_env()))
 		return alloc_skb(size, GFP_KERNEL);
 
 	size = SKB_DATA_ALIGN(size) +
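
The decision made by the patched condition can be modelled in user space as a small pure function. This is only a sketch, not kernel code: SKETCH_GOODSIZE is a hypothetical stand-in for NLMSG_GOODSIZE (whose real value depends on page size and skb overhead), the in_host flag stands in for ve_is_super(get_exec_env()), and the enum and function names are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical threshold standing in for NLMSG_GOODSIZE. */
#define SKETCH_GOODSIZE 3776u

/* Which allocator would back the skb data in this model. */
enum skb_backing {
	SKB_BACKING_SLAB,	/* alloc_skb(): slab-backed, kmem accounted */
	SKB_BACKING_VMALLOC	/* vmalloc(): not accounted into memcg */
};

/*
 * Mirror of the patched check: the vmalloc path is taken only for
 * large, unicast messages sent from the host. Anything sent from
 * inside a container falls back to the accounted slab path.
 */
static enum skb_backing pick_skb_backing(unsigned int size, bool broadcast,
					 bool in_host)
{
	if (size <= SKETCH_GOODSIZE || broadcast || !in_host)
		return SKB_BACKING_SLAB;
	return SKB_BACKING_VMALLOC;
}
```

Under this model a large unicast message from a container always gets the slab path, so a huge request there can simply fail to allocate instead of consuming unaccounted vmalloc memory.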


