[Devel] [PATCH RHEL7 COMMIT] ms/memcg: only account kmem allocations marked as __GFP_ACCOUNT
Konstantin Khorenko
khorenko at virtuozzo.com
Mon May 2 07:06:01 PDT 2016
The commit is pushed to "branch-rh7-3.10.0-327.10.1.vz7.12.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.10.1.vz7.12.16
------>
commit 668dbf3c539364009cc44f1e32023dae35df8ccb
Author: Vladimir Davydov <vdavydov at virtuozzo.com>
Date: Mon May 2 18:06:01 2016 +0400
ms/memcg: only account kmem allocations marked as __GFP_ACCOUNT
Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.
So this patch switches kmem accounting to the white-policy: now only
those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
memcg. Currently, no kmem allocations are marked like this. The
following patches will mark several kmem allocations that are known to
be easily triggered from userspace and therefore should be accounted to
memcg.
Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
Acked-by: Johannes Weiner <hannes at cmpxchg.org>
Acked-by: Michal Hocko <mhocko at suse.com>
Cc: Tejun Heo <tj at kernel.org>
Cc: Greg Thelen <gthelen at google.com>
Cc: Christoph Lameter <cl at linux.com>
Cc: Pekka Enberg <penberg at kernel.org>
Cc: David Rientjes <rientjes at google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim at lge.com>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
(cherry picked from commit a9bb7e620efdfd29b6d1c238041173e411670996)
Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
Conflicts:
include/linux/gfp.h
include/linux/memcontrol.h
---
include/linux/gfp.h | 3 +++
include/linux/memcontrol.h | 4 ++++
mm/page_alloc.c | 3 ++-
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 6c982d5..b454948 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -31,6 +31,7 @@ struct vm_area_struct;
#define ___GFP_HARDWALL 0x20000u
#define ___GFP_THISNODE 0x40000u
#define ___GFP_RECLAIMABLE 0x80000u
+#define ___GFP_ACCOUNT 0x100000u
#define ___GFP_NOTRACK 0x200000u
#define ___GFP_NO_KSWAPD 0x400000u
#define ___GFP_OTHER_NODE 0x800000u
@@ -86,6 +87,7 @@ struct vm_area_struct;
#define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
#define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
#define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
+#define __GFP_ACCOUNT ((__force gfp_t)___GFP_ACCOUNT) /* Account to kmemcg */
#define __GFP_NOTRACK ((__force gfp_t)___GFP_NOTRACK) /* Don't track with kmemcheck */
#define __GFP_NO_KSWAPD ((__force gfp_t)___GFP_NO_KSWAPD)
@@ -108,6 +110,7 @@ struct vm_area_struct;
#define GFP_NOIO (__GFP_WAIT)
#define GFP_NOFS (__GFP_WAIT | __GFP_IO)
#define GFP_KERNEL (__GFP_WAIT | __GFP_IO | __GFP_FS)
+#define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)
#define GFP_TEMPORARY (__GFP_WAIT | __GFP_IO | __GFP_FS | \
__GFP_RECLAIMABLE)
#define GFP_USER (__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 26c2078..c21f78b 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -572,6 +572,8 @@ memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **memcg, int order)
{
if (!memcg_kmem_enabled())
return true;
+ if (!(gfp & __GFP_ACCOUNT))
+ return true;
/*
* __GFP_NOFAIL allocations will move on even if charging is not
@@ -636,6 +638,8 @@ memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
{
if (!memcg_kmem_enabled())
return cachep;
+ if (!(gfp & __GFP_ACCOUNT))
+ return cachep;
if (gfp & __GFP_NOFAIL)
return cachep;
if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9a7ac88..c090025 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2870,7 +2870,8 @@ EXPORT_SYMBOL(free_pages);
/*
* alloc_kmem_pages charges newly allocated pages to the kmem resource counter
- * of the current memory cgroup.
+ * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
+ * equivalent to alloc_pages.
*
* It should be used when the caller would like to use kmalloc, but since the
* allocation is large, it has to fall back to the page allocator.
More information about the Devel
mailing list