[Devel] [PATCH RHEL7 v2] ms/vmpressure: make sure there are no events queued after memcg is offlined

Konstantin Khorenko khorenko at virtuozzo.com
Wed Apr 17 13:23:59 MSK 2019


From: Michal Hocko <mhocko at suse.cz>

vmpressure is called synchronously from reclaim where the target_memcg
is guaranteed to be alive but the eventfd is signaled from the work
queue context.  This means that memcg (along with vmpressure structure
which is embedded into it) might go away while the work item is pending
which would result in a use-after-free bug.

There are two possible ways to fix this.  Either vmpressure pins the
memcg before it schedules vmpr->work and unpins it in
vmpressure_work_fn, or the work item is explicitly flushed from the
css_offline context (as suggested by Tejun).
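
For comparison, the first option (pinning) can be pictured with a small
userspace sketch.  This is not kernel code and not what the patch does:
struct group, group_get(), group_put(), group_schedule_work() and
group_work_fn() are hypothetical names standing in for the memcg, its
reference counting, and vmpressure_work_fn.  The idea is that whoever
queues the work takes a reference, and the work function drops it, so
the object cannot be freed while work is pending.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memcg with an embedded vmpressure. */
struct group {
	atomic_int refs;       /* lifetime reference count */
	int events_delivered;  /* state touched by the deferred work */
};

/* Allocate a group holding one reference for its owner. */
static struct group *group_new(void)
{
	struct group *g = calloc(1, sizeof(*g));

	if (g)
		atomic_init(&g->refs, 1);
	return g;
}

/* Pin the group. */
static void group_get(struct group *g)
{
	atomic_fetch_add(&g->refs, 1);
}

/* Unpin the group; returns 1 if this call freed it, 0 otherwise. */
static int group_put(struct group *g)
{
	if (atomic_fetch_sub(&g->refs, 1) == 1) {
		free(g);
		return 1;
	}
	return 0;
}

/* Scheduling side: pin before the work is queued. */
static void group_schedule_work(struct group *g)
{
	group_get(g);
	/* a real implementation would queue the work item here */
}

/* Work function: use the group, then drop the pin taken above. */
static int group_work_fn(struct group *g)
{
	g->events_delivered++;
	return group_put(g);
}
```

With this scheme the owner may drop its reference while work is still
pending; the final group_put() inside the work function is the one that
frees the object.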

This patch implements the latter and introduces vmpressure_cleanup,
which flushes the vmpressure work queue item.  It hooks into
mem_cgroup_css_offline after the memcg itself is cleaned up.
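
The flush-before-free ordering can be sketched in userspace as follows.
This is only an analogy under stated assumptions: a pthread stands in
for the work queue item, pthread_join() plays the role of flush_work(),
and struct pressure with its pressure_*() helpers are hypothetical
names that do not exist in the kernel.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct vmpressure. */
struct pressure {
	pthread_t worker;      /* plays the role of vmpr->work */
	bool scheduled;        /* true once deferred work has been queued */
	int events_delivered;  /* written only by the deferred work */
};

/* Deferred work: dereferences the structure, like vmpressure_work_fn. */
static void *pressure_work_fn(void *arg)
{
	struct pressure *p = arg;

	p->events_delivered++;
	return NULL;
}

/* Rough analogue of schedule_work(): run the work asynchronously. */
static int pressure_schedule(struct pressure *p)
{
	int err = pthread_create(&p->worker, NULL, pressure_work_fn, p);

	if (err == 0)
		p->scheduled = true;
	return err;
}

/* Rough analogue of vmpressure_cleanup(): wait for pending work. */
static void pressure_cleanup(struct pressure *p)
{
	if (p->scheduled) {
		pthread_join(p->worker, NULL);
		p->scheduled = false;
	}
}

/* Owner teardown: flush first, free second.  Returns events delivered. */
static int pressure_destroy(struct pressure *p)
{
	int delivered;

	pressure_cleanup(p);
	delivered = p->events_delivered;
	free(p);
	return delivered;
}
```

Calling free() without pressure_cleanup() first would let the worker
touch freed memory, which is the bug class this patch closes.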

[akpm at linux-foundation.org: coding-style fixes]
Signed-off-by: Michal Hocko <mhocko at suse.cz>
Reported-by: Tejun Heo <tj at kernel.org>
Cc: Anton Vorontsov <anton.vorontsov at linaro.org>
Cc: Johannes Weiner <hannes at cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu at jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
Cc: Li Zefan <lizefan at huawei.com>
Acked-by: Tejun Heo <tj at kernel.org>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>

(cherry picked from commit 33cb876e947b9ddda8dca3fb99234b743a597ef9)
Backport notes: vmpressure_cleanup() has been moved from
mem_cgroup_css_offline() to mem_cgroup_css_free() because

 - current mainline already calls it there after the global
   cleanup/rework
 - it is generally safer: it also covers the case where the work is
   scheduled after the cgroup is offlined (even if that is caused by
   another bug).

https://jira.sw.ru/browse/PSBM-93884

Signed-off-by: Konstantin Khorenko <khorenko at virtuozzo.com>
Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
---
 Makefile                   |  2 +-
 include/linux/vmpressure.h |  1 +
 mm/memcontrol.c            |  1 +
 mm/vmpressure.c            | 16 ++++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index ac336685b60e..50479a9d756b 100644
--- a/Makefile
+++ b/Makefile
@@ -14,7 +14,7 @@ RHEL_DRM_VERSION = 4
 RHEL_DRM_PATCHLEVEL = 17
 RHEL_DRM_SUBLEVEL = 19
 # VZVERSION = ovz.94.15
-VZVERSION = ovz.custom
+VZVERSION = ovz.finist
 
 ifeq ($(VZVERSION), ovz.custom)
   GIT_DIR := .git
diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 76be077340ea..a9021c358a9c 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -30,6 +30,7 @@ extern void vmpressure(gfp_t gfp, struct mem_cgroup *memcg,
 extern void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio);
 
 extern void vmpressure_init(struct vmpressure *vmpr);
+extern void vmpressure_cleanup(struct vmpressure *vmpr);
 extern struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg);
 extern struct cgroup_subsys_state *vmpressure_to_css(struct vmpressure *vmpr);
 extern struct vmpressure *css_to_vmpressure(struct cgroup_subsys_state *css);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f2a81d72d3bf..4d520b570687 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6618,6 +6618,7 @@ static void mem_cgroup_css_free(struct cgroup *cont)
 	 */
 	mem_cgroup_reparent_charges(memcg);
 
+	vmpressure_cleanup(&memcg->vmpressure);
 	memcg_destroy_kmem(memcg);
 	memcg_free_shrinker_maps(memcg);
 	__mem_cgroup_free(memcg);
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 736a6011c2c8..7a3da89f790a 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -372,3 +372,19 @@ void vmpressure_init(struct vmpressure *vmpr)
 	INIT_LIST_HEAD(&vmpr->events);
 	INIT_WORK(&vmpr->work, vmpressure_work_fn);
 }
+
+/**
+ * vmpressure_cleanup() - shuts down vmpressure control structure
+ * @vmpr:	Structure to be cleaned up
+ *
+ * This function should be called before the structure in which it is
+ * embedded is cleaned up.
+ */
+void vmpressure_cleanup(struct vmpressure *vmpr)
+{
+	/*
+	 * Make sure there is no pending work before eventfd infrastructure
+	 * goes away.
+	 */
+	flush_work(&vmpr->work);
+}
-- 
2.15.1


