[Devel] [PATCH RHEL7 COMMIT] mm: page_idle: look up page anon_vma carefully when checking references

Konstantin Khorenko khorenko at virtuozzo.com
Tue Dec 8 06:15:57 PST 2015


The commit is pushed to "branch-rh7-3.10.0-229.7.2.vz7.9.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-229.7.2.vz7.9.15
------>
commit b653eb7dd123fb0315efc08808128bc1b6fa7b93
Author: Vladimir Davydov <vdavydov at virtuozzo.com>
Date:   Tue Dec 8 18:15:56 2015 +0400

    mm: page_idle: look up page anon_vma carefully when checking references
    
    Patchset description:
    
    rmap_walk() as present in RH7 requires the caller to either hold mmap_sem
    or pin the page's anon_vma. page_idle_clear_pte_refs does neither. As a
    result, it might end up trying to lock/unlock an anon_vma which has already
    been freed and possibly reallocated. This won't do any good.
    
    Let's pull the new version of rmap_walk() from upstream, which allows the
    caller to specify a custom anon_vma lock function, and use it in the
    page_idle code to avoid this issue. This also puts page_idle in sync with
    upstream.
    
    I hope this will fix:
    
    https://jira.sw.ru/browse/PSBM-42015
    
    Joonsoo Kim (3):
      mm/rmap: factor lock function out of rmap_walk_anon()
      mm/rmap: make rmap_walk to get the rmap_walk_control argument
      mm/rmap: extend rmap_walk_xxx() to cope with different cases
    
    Vladimir Davydov (1):
      mm: page_idle: look up page anon_vma carefully when checking references
    
    ============================
    This patch description:
    
    Since we don't hold mmap_sem when checking references, we can't just
    read anon_vma from page->mapping and call down_read on its rwsem,
    because the page can be unmapped in the meantime and the anon_vma can be
    freed, so that we would end up messing with an already freed and perhaps
    reallocated anon_vma. This can result in memory corruption and
    deadlocks.
    
    To fix this issue, this patch makes use of the rmap_walk infrastructure
    pulled in by the previous patches, which makes it possible to use
    page_lock_anon_vma_read for locking the page's anon_vma. This function
    handles the race described above carefully, so make
    page_idle_clear_pte_refs use it, as is done upstream.
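    
    As an illustration only (this is not part of the patch), the
    "lock, then revalidate" idea behind page_lock_anon_vma_read can be
    sketched in userspace C. All identifiers below (anon_obj, page_like,
    lock_mapping_carefully) are made up for the sketch; the real kernel
    code additionally pins the anon_vma with RCU and a refcount so it
    cannot be freed before the lock is taken, which is omitted here:

```c
#include <pthread.h>
#include <stddef.h>

struct anon_obj {
	pthread_mutex_t lock;
};

struct page_like {
	struct anon_obj *mapping;	/* may be cleared concurrently */
};

/*
 * Hypothetical helper: lock the object a page points at, re-reading
 * the mapping pointer after the lock is acquired and backing off if
 * it changed while we were sleeping on the lock. Skipping this
 * recheck is exactly the use-after-free the description warns about.
 */
struct anon_obj *lock_mapping_carefully(struct page_like *page)
{
	struct anon_obj *obj = page->mapping;

	if (!obj)
		return NULL;	/* already unmapped */

	pthread_mutex_lock(&obj->lock);
	if (page->mapping != obj) {
		/* page was unmapped in the meantime: don't touch obj */
		pthread_mutex_unlock(&obj->lock);
		return NULL;
	}
	return obj;
}
```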
    
    https://jira.sw.ru/browse/PSBM-42015
    
    Fixes: 577afeb24634 ("ms/mm: introduce idle page tracking")
    Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
---
 mm/page_idle.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/page_idle.c b/mm/page_idle.c
index c09a5a2..6470ba0 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -88,15 +88,28 @@ static int page_idle_clear_pte_refs_one(struct page *page,
 
 static void page_idle_clear_pte_refs(struct page *page)
 {
+	/*
+	 * Since rwc.arg is unused, rwc is effectively immutable, so we
+	 * can make it static const to save some cycles and stack.
+	 */
+	static const struct rmap_walk_control rwc = {
+		.rmap_one = page_idle_clear_pte_refs_one,
+		.anon_lock = page_lock_anon_vma_read,
+	};
+	bool need_lock;
+
 	if (!page_mapped(page) ||
 	    !page_rmapping(page))
 		return;
 
-	if (!trylock_page(page))
+	need_lock = !PageAnon(page) || PageKsm(page);
+	if (need_lock && !trylock_page(page))
 		return;
 
-	rmap_walk(page, page_idle_clear_pte_refs_one, NULL);
-	unlock_page(page);
+	rmap_walk(page, (struct rmap_walk_control *)&rwc);
+
+	if (need_lock)
+		unlock_page(page);
 }
 
 static ssize_t page_idle_bitmap_read(struct file *file, struct kobject *kobj,
