[Devel] [PATCH RHEL7 COMMIT] ms/mm: memcontrol: only mark charged pages with PageKmemcg

Konstantin Khorenko khorenko at virtuozzo.com
Mon Jan 16 08:27:17 PST 2017


The commit is pushed to "branch-rh7-3.10.0-514.vz7.27.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.vz7.27.10
------>
commit cd4b4b8807ac3545017b3153ced5e81b27aa9346
Author: Vladimir Davydov <vdavydov at virtuozzo.com>
Date:   Mon Jan 16 20:27:17 2017 +0400

    ms/mm: memcontrol: only mark charged pages with PageKmemcg
    
    To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
    which sets page->_mapcount to -512.  Currently, we set/clear PageKmemcg
    in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
    with __GFP_ACCOUNT, including those that aren't actually charged to any
    cgroup, i.e. allocated from the root cgroup context.  To avoid overhead
    in case cgroups are not used, we only do that if memcg_kmem_enabled() is
    true.  The latter is set iff there are kmem-enabled memory cgroups
    (online or offline).  The root cgroup is not considered kmem-enabled.
    
    As a result, if a page is allocated with __GFP_ACCOUNT for the root
    cgroup when there are kmem-enabled memory cgroups and is freed after all
    kmem-enabled memory cgroups were removed, e.g.
    
      # no memory cgroups have been created yet, create one
      mkdir /sys/fs/cgroup/memory/test
      # run something allocating pages with __GFP_ACCOUNT, e.g.
      # a program using pipe
      dmesg | tail
      # remove the memory cgroup
      rmdir /sys/fs/cgroup/memory/test
    
    we'll get a bad page state bug complaining about page->_mapcount != -1:
    
      BUG: Bad page state in process swapper/0  pfn:1fd945c
      page:ffffea007f651700 count:0 mapcount:-511 mapping:          (null) index:0x0
      flags: 0x1000000000000000()
    
    To avoid that, let's mark with PageKmemcg only those pages that are
    actually charged to and hence pin a non-root memory cgroup.
    
    Fixes: 4949148ad433 ("mm: charge/uncharge kmemcg from generic page allocator paths")
    Reported-and-tested-by: Eric Dumazet <eric.dumazet at gmail.com>
    Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
    
    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
    
    https://jira.sw.ru/browse/PSBM-51558
    (cherry picked from commit c4159a75b64c0e67caededf4d7372c1b58a5f42a)
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
---
 mm/memcontrol.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0183a9c..dc83f4e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7001,8 +7001,10 @@ static void uncharge_list(struct list_head *page_list)
 			else
 				nr_file += nr_pages;
 			pgpgout++;
-		} else
+		} else {
 			nr_kmem += 1 << compound_order(page);
+			__ClearPageKmemcg(page);
+		}
 
 		if (pc->flags & PCG_MEM)
 			nr_mem += nr_pages;
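
The marking logic described in the commit message can be sketched as a small userspace model. This is only an illustration, not kernel code: `struct model_page`, the `old_*`/`new_*` helpers, and the `kmem_enabled`/`in_root` parameters are hypothetical names invented here. The only values taken from the message are the -512 marker and the -1 that the free path expects. The model also shows why the report prints mapcount:-511, since the dumped mapcount is `_mapcount + 1`.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the commit message; everything else is a simplified model. */
#define MAPCOUNT_FREE   (-1)    /* what the free path expects to find */
#define MAPCOUNT_KMEMCG (-512)  /* PageKmemcg marker stored in _mapcount */

/* Hypothetical stand-in for struct page, reduced to what the model needs. */
struct model_page {
    int  _mapcount;
    bool charged;   /* true if the page is charged to a non-root cgroup */
};

/* Old behaviour: mark every accounted page while the global switch is on. */
void old_alloc(struct model_page *p, bool kmem_enabled, bool in_root)
{
    p->_mapcount = MAPCOUNT_FREE;
    p->charged = kmem_enabled && !in_root;
    if (kmem_enabled)
        p->_mapcount = MAPCOUNT_KMEMCG;
}

/* Old behaviour: clear the marker only if the switch is still on at free time. */
void old_free(struct model_page *p, bool kmem_enabled)
{
    if (kmem_enabled)
        p->_mapcount = MAPCOUNT_FREE;
    p->charged = false;
}

/* Fixed behaviour: mark only pages that are actually charged. */
void new_alloc(struct model_page *p, bool kmem_enabled, bool in_root)
{
    p->_mapcount = MAPCOUNT_FREE;
    p->charged = kmem_enabled && !in_root;
    if (p->charged)
        p->_mapcount = MAPCOUNT_KMEMCG;
}

/*
 * Fixed behaviour: the uncharge path clears the marker. The model ignores
 * the global switch here, because a charged page pins its cgroup.
 */
void new_free(struct model_page *p)
{
    if (p->charged) {
        assert(p->_mapcount == MAPCOUNT_KMEMCG); /* charged implies marked */
        p->_mapcount = MAPCOUNT_FREE;
        p->charged = false;
    }
}

/* The free-time check that triggers "Bad page state". */
bool bad_page(const struct model_page *p)
{
    return p->_mapcount != MAPCOUNT_FREE;
}

/* The page dump reports _mapcount + 1, so -512 shows up as -511. */
int reported_mapcount(const struct model_page *p)
{
    return p->_mapcount + 1;
}
```

In this model, `old_alloc(&p, true, true)` followed by `old_free(&p, false)` reproduces the scenario from the message: a root-cgroup page allocated while a kmem-enabled cgroup exists and freed after it is gone keeps the marker, so `bad_page(&p)` is true. With the `new_*` pair the same page is never marked, and a page charged to a non-root cgroup is cleaned up on uncharge.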

