[Devel] [PATCH vz8] memcg: access beyond end of shrinker_map

Konstantin Khorenko khorenko at virtuozzo.com
Mon Jun 7 19:14:14 MSK 2021


Valera, Vasya, do we need this patch in vz8?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 06/04/2021 06:18 PM, Valeriy Vdovin wrote:
> From: Vasily Averin <vvs at virtuozzo.com>
>
> vz7 kernel can remove in-use memcg from parent-chield hierarchy tree,
> and skip resize of its shrinker_map during registration of new shrinkers.
> It allows memcg user to access memory beyond of end non-resized
> shrinker_map, and corrupt next slab element.
>
> CPU A                    CPU B                       CPU C
> got memcg            cgroup_rmdir()                 sget_userns()
> shrink_slab_memcg()  cgroup_destroy_locked()        register_shrinker()
>   .                  cgroup_css_killed()            register_memcg_shrinker()
>   .                   cgroup_offline_fn()           down_write(&shrinker_rwsem)
>   .                   list_del_rcu(&cgrp->sibling)  memcg_expand_shrinker_maps()
> wait for shrinker_rwsem                             re-allocs shrinker_maps of all memcg
>     .                                                but ignores memcg used on CPU A
>     .                                                because it was removed by CPU B
>     .                                               releases shrinker_rwsem
> accesses and modifies memory
>  beyond end of own shrinker_map
>   while using of new memcg_shrinkers
>   registered on CPU C
>
> v4:
> - corrected WARN_ON condition
>
> v3:
> - corrected map->nr_max value, "- 1" was removed
> - nr_max update inside shrink_slab_memcg()
>
> v2 changes:
> - map_size was in bytes, shrinker_nr_max was in bytes (thanks to ktkhai@)
> - map_size was replaced by nr_max -- maximal allowed bit number
> - added WARN_ONs to detect another possible corruption places
>
> https://jira.sw.ru/browse/PSBM-100509
> https://jira.sw.ru/browse/PSBM-100694
>
> Signed-off-by: Vasily Averin <vvs at virtuozzo.com>
> Acked-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> (cherry-picked from 2ad49c8b916379544fa9dd35f5059dd9ffb0ce1b)
> https://jira.sw.ru/browse/PSBM-127849
> Signed-off-by: Valeriy Vdovin <valeriy.vdovin at virtuozzo.com>
> ---
>  mm/vmscan.c | 8 ++++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 95e23c6b4590..23113db62eef 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -540,7 +540,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>  {
>  	struct memcg_shrinker_map *map;
>  	unsigned long freed = 0;
> -	int ret, i;
> +	int ret, i, nr_max;
>
>  	if (!memcg_kmem_enabled() || !mem_cgroup_online(memcg))
>  		return 0;
> @@ -553,7 +553,9 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>  	if (unlikely(!map))
>  		goto unlock;
>
> -	for_each_set_bit(i, map->map, shrinker_nr_max) {
> +	nr_max = min(shrinker_nr_max, map->nr_max);
> +
> +	for_each_set_bit(i, map->map, nr_max) {
>  		struct shrink_control sc = {
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
> @@ -594,6 +596,7 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>  				memcg_set_shrinker_bit(memcg, nid, i);
>  		}
>  		freed += ret;
> +		nr_max = min(shrinker_nr_max, map->nr_max);
>  	}
>  unlock:
>  	up_read(&shrinker_rwsem);
> @@ -663,6 +666,7 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  		if (ret == SHRINK_EMPTY)
>  			ret = 0;
>  		freed += ret;
>  		/*
>  		 * Bail out if someone want to register a new shrinker to
>  		 * prevent the regsitration from being stalled for long periods
>


More information about the Devel mailing list