[Devel] [PATCH rh7] radix-tree: do not account radix_tree_nodes to memcg

Vladimir Davydov vdavydov at parallels.com
Mon Aug 3 01:47:59 PDT 2015


Accounting radix_tree_nodes to a memory cgroup causes two problems.

First, radix_tree_nodes allocated by tcache/tswap for storing their
internal data will be accounted to the container that issued a store,
which is wrong, because they can only get reclaimed on global pressure.
Using __GFP_NOACCOUNT in tcache/tswap wouldn't help because of the
per-cpu radix_tree_node preloads.
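
One plausible reading of the preload issue can be sketched as a userspace
toy model (not kernel code). All model_* names and MODEL_* flag values are
hypothetical stand-ins, and the pool lookup is simplified: a node sitting
in the per-cpu pool keeps whatever accounting it got when the pool was
filled, so the flags passed by the later consumer do not matter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for gfp flags; the values are illustrative only. */
#define MODEL_GFP_KERNEL	0x1u
#define MODEL_GFP_NOACCOUNT	0x2u
#define MODEL_PRELOAD_SIZE	4

struct model_node {
	bool charged;	/* was the node charged to a cgroup when allocated? */
	int owner;	/* id of the charged cgroup, or -1 if not charged */
};

/* One preload pool; the kernel keeps one such pool per CPU. */
struct model_preload {
	int nr;
	struct model_node nodes[MODEL_PRELOAD_SIZE];
};

/* Model of a slab allocation: charge current_cg unless NOACCOUNT is set. */
static struct model_node model_alloc(unsigned int gfp, int current_cg)
{
	struct model_node n;

	n.charged = !(gfp & MODEL_GFP_NOACCOUNT);
	n.owner = n.charged ? current_cg : -1;
	return n;
}

/* Fill the pool; accounting is decided here, by the preloader's flags. */
static void model_preload(struct model_preload *p, unsigned int gfp,
			  int current_cg)
{
	assert(p->nr >= 0 && p->nr <= MODEL_PRELOAD_SIZE);
	while (p->nr < MODEL_PRELOAD_SIZE)
		p->nodes[p->nr++] = model_alloc(gfp, current_cg);
}

/* Take a node from the pool if one is available, else allocate directly. */
static struct model_node model_node_alloc(struct model_preload *p,
					  unsigned int gfp, int current_cg)
{
	if (p->nr > 0)
		return p->nodes[--p->nr];	/* gfp is ignored on this path */
	return model_alloc(gfp, current_cg);
}
```

In this model, a pool filled with MODEL_GFP_KERNEL by cgroup 7 hands a
charged node to a later caller even if that caller passes
MODEL_GFP_NOACCOUNT. Filling the pool with MODEL_GFP_NOACCOUNT, which is
what the patch does for the real preload path, avoids that.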

Second, workingset detection logic (see mm/workingset.c) is still not
memory cgroup aware. In particular, this means that shadow
radix_tree_nodes can only be reclaimed on global memory pressure
although they are accounted to a memory cgroup. As a result, after
reading a huge file, all the container's memory can get filled with
shadow entries, which won't be reclaimed on local memory pressure,
making the container unusable.
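
The effect described above can be illustrated with a small userspace toy
model (not kernel code). The model_* names and the MODEL_PAGE_COST unit
are hypothetical, and the costs are arbitrary. The only assumption carried
over from the text is that local reclaim can evict page cache pages but
cannot free the radix_tree_node memory that stays behind as shadow
entries.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_COST	8	/* arbitrary units charged per cached page */

struct model_cg {
	long limit;		/* cgroup memory limit, in the same units */
	long page_charge;	/* page cache: reclaimable on local pressure */
	long node_charge;	/* node memory: not reclaimable locally */
};

static long model_usage(const struct model_cg *cg)
{
	return cg->page_charge + cg->node_charge;
}

/* Read one page into the cache; returns false if the charge cannot fit. */
static bool model_read_page(struct model_cg *cg, long node_cost)
{
	long need = MODEL_PAGE_COST + node_cost;

	/* Local reclaim: evict pages only; node charge stays as shadow. */
	while (model_usage(cg) + need > cg->limit && cg->page_charge > 0)
		cg->page_charge -= MODEL_PAGE_COST;

	if (model_usage(cg) + need > cg->limit)
		return false;

	cg->page_charge += MODEL_PAGE_COST;
	cg->node_charge += node_cost;
	return true;
}

/* Read up to npages pages; returns how many reads succeeded. */
static long model_read_file(struct model_cg *cg, long npages, long node_cost)
{
	long done = 0;

	while (done < npages && model_read_page(cg, node_cost))
		done++;
	return done;
}
```

With a non-zero node_cost, the model's node_charge grows with every page
read until not even a single page fits under the limit, although all
page cache has already been evicted. With node_cost set to 0, which
corresponds to not accounting the nodes, reads keep succeeding because
eviction always frees enough room.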

This is a quick fix that makes radix_tree_nodes unaccountable. This is
acceptable for now, because we never accounted radix_tree_nodes before
Vz7 anyway. The true fix would be (a) making radix_tree_node preloads
unaccountable (or per memory cgroup) and (b) making workingset
detection logic memory cgroup aware. This should and will be done
upstream first.

https://jira.sw.ru/browse/PSBM-35205

Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>
---
 lib/radix-tree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index d5c2fa1a4102..0c62aca85591 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -222,7 +222,8 @@ radix_tree_node_alloc(struct radix_tree_root *root)
 		}
 	}
 	if (ret == NULL)
-		ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		ret = kmem_cache_alloc(radix_tree_node_cachep,
+				       gfp_mask | __GFP_NOACCOUNT);
 
 	BUG_ON(radix_tree_is_indirect_ptr(ret));
 	return ret;
@@ -273,7 +274,8 @@ int radix_tree_preload(gfp_t gfp_mask)
 	rtp = &__get_cpu_var(radix_tree_preloads);
 	while (rtp->nr < ARRAY_SIZE(rtp->nodes)) {
 		preempt_enable();
-		node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
+		node = kmem_cache_alloc(radix_tree_node_cachep,
+					gfp_mask | __GFP_NOACCOUNT);
 		if (node == NULL)
 			goto out;
 		preempt_disable();
-- 
2.1.4