[Devel] [PATCH vz8 19/19] mm: use special value SHRINKER_REGISTERING instead of list_empty() check

Kirill Tkhai ktkhai at virtuozzo.com
Mon Mar 30 10:15:13 MSK 2020


Also, this one is needed: 8df4a44cc46b "mm: check shrinker is memcg-aware in register_shrinker_prepared()"
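
For reference, a hedged sketch (from memory of that follow-up commit, so please
check the actual change) of what it does: it only replaces the IDR entry for
memcg-aware shrinkers, since other shrinkers never allocated an ID there.

    void register_shrinker_prepared(struct shrinker *shrinker)
    {
            down_write(&shrinker_rwsem);
            list_add_tail(&shrinker->list, &shrinker_list);
    #ifdef CONFIG_MEMCG_KMEM
            /* assumption: only memcg-aware shrinkers own a shrinker_idr slot */
            if (shrinker->flags & SHRINKER_MEMCG_AWARE)
                    idr_replace(&shrinker_idr, shrinker, shrinker->id);
    #endif
            up_write(&shrinker_rwsem);
    }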

On 27.03.2020 19:45, Andrey Ryabinin wrote:
> From: Kirill Tkhai <ktkhai at virtuozzo.com>
> 
> The patch introduces a special value SHRINKER_REGISTERING to use instead
> of list_empty() to distinguish a registering shrinker from an
> unregistered shrinker.  Why do we need that at all?
> 
> Shrinker registration is split in two parts.  The first one is
> prealloc_shrinker(), which allocates shrinker memory and reserves ID in
> shrinker_idr.  This function can fail.  The second is
> register_shrinker_prepared(), and it finalizes the registration.  This
> function actually makes the shrinker available to be used from
> shrink_slab(), and it can't fail.
> 
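For context (not part of this patch): in this kernel the one-step
register_shrinker() composes the two phases roughly like the sketch below
(error handling simplified).

    int register_shrinker(struct shrinker *shrinker)
    {
            /* phase 1: allocate nr_deferred and reserve an ID; this may fail */
            int err = prealloc_shrinker(shrinker);

            if (err)
                    return err;
            /* phase 2: publish the shrinker; this cannot fail */
            register_shrinker_prepared(shrinker);
            return 0;
    }
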
> One shrinker may be based on more than one LRU list.  So, we never
> clear the bit in the memcg shrinker maps when (one of) the corresponding
> LRU lists becomes empty, since the other LRU lists may not be empty.  See
> the superblock shrinker for example: it is based on two LRU lists,
> s_inode_lru and s_dentry_lru.  We do not want to clear the shrinker bit
> when there are no inodes in s_inode_lru, as s_dentry_lru may contain
> dentries.
> 
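As an illustration only (simplified from fs/super.c, not part of this diff):
the superblock shrinker's count callback sums both LRU lists, so emptiness of
one of them says nothing about the shrinker as a whole.

    static unsigned long super_cache_count(struct shrinker *shrink,
                                           struct shrink_control *sc)
    {
            struct super_block *sb = container_of(shrink, struct super_block,
                                                  s_shrink);
            long total_objects = 0;

            /* both LRU lists feed the same shrinker (and the same map bit) */
            total_objects += list_lru_shrink_count(&sb->s_dentry_lru, sc);
            total_objects += list_lru_shrink_count(&sb->s_inode_lru, sc);

            return vfs_pressure_ratio(total_objects);
    }
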
> Instead of that, we use a special algorithm to detect shrinkers that
> have no elements on any of their LRU lists, and this is done in
> shrink_slab_memcg().  See the comment in this function for the details.
> 
> Also, in shrink_slab_memcg() we clear the shrinker bit in the map when
> we meet an unregistered shrinker (the bit is set, while there is no
> shrinker in the IDR).  Otherwise, we would have had to do that at the
> moment of shrinker unregistration for all memcgs (and this looks worse,
> since iteration over all memcgs may take a long time).  Also this would
> have imposed restrictions on shrinker unregistration order for its
> users: they would have had to guarantee there are no new elements after
> unregister_shrinker() (otherwise, a newly added element would have set
> a bit).
> 
> So, if we meet a set bit in the map and no shrinker in the IDR when
> we're iterating over the map in shrink_slab_memcg(), this means the
> corresponding shrinker is unregistered, and we must clear the bit.
> 
> Another case is shrinker registration.  We want two things there:
> 
> 1) do_shrink_slab() can be called only for completely registered
>    shrinkers;
> 
> 2) shrinker internal lists may be populated in any order relative to
>    register_shrinker_prepared() (let's use the sb example).  Both
>    of:
> 
>   a)list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru); [cpu0]
>     memcg_set_shrinker_bit();                               [cpu0]
>     ...
>     register_shrinker_prepared();                           [cpu1]
> 
>   and
> 
>   b)register_shrinker_prepared();                           [cpu0]
>     ...
>     list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru); [cpu1]
>     memcg_set_shrinker_bit();                               [cpu1]
> 
>    are legitimate.  We don't want to impose a restriction here and force
>    people to use only the (b) variant.  We don't want to force people to
>    ensure there are no elements in LRU lists before the shrinker is
>    completely registered.  Internal users of LRU lists and the shrinker
>    code are two different subsystems, and they should be self-contained
>    with respect to each other.
> 
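To make ordering (a) concrete, here is a simplified sketch (locking and
per-node lookup omitted, not part of this patch) of how an LRU insertion ends
up setting the shrinker bit, possibly before the shrinker is fully registered:

    /* simplified from list_lru_add(), memcg-aware path */
    if (list_empty(item)) {
            l = list_lru_from_kmem(nlru, item, &memcg);
            list_add_tail(item, &l->list);
            /* first element on this list: mark the shrinker as having work */
            if (!l->nr_items++)
                    memcg_set_shrinker_bit(memcg, nid, lru_shrinker_id(lru));
    }
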
> In case (a) we have the bit set before the shrinker is completely
> registered.  We don't want do_shrink_slab() to be called at this moment,
> so we have to detect such registering shrinkers.
> 
> Before this patch a list_empty() check (shrinker is not linked to the
> list) was used for that.  So, in (a) there could be a bit set, but we
> don't call do_shrink_slab() unless the shrinker is linked to the list.
> It's just an indicator; I simply overloaded linking to the list.
> 
> This was not the best solution, since it's better not to touch the
> shrinker memory from shrink_slab_memcg() before it's completely
> registered (this will also be useful in the future to make shrink_slab()
> completely lockless).
> 
> So, this patch introduces a better way to detect a registering shrinker,
> which avoids dereferencing the shrinker memory at all.  It's just a ~0UL
> value, which we insert into the IDR during ID allocation.  After the
> shrinker is ready to be used, we insert the actual shrinker pointer into
> the IDR, and it becomes available to shrink_slab_memcg().
> 
> We can't use NULL instead of this new value for this purpose, as
> shrink_slab_memcg() already uses NULL to detect unregistered shrinkers,
> and we don't want the function to see NULL and clear the bit; otherwise
> (a) won't work.
> 
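So a shrinker_idr slot ends up in one of three states, and shrink_slab_memcg()
reacts to each as sketched below (this just restates the hunk further down,
expanded for readability):

    shrinker = idr_find(&shrinker_idr, i);
    if (!shrinker) {
            /* unregistered: the map bit is stale, clear it */
            clear_bit(i, map->map);
    } else if (shrinker == SHRINKER_REGISTERING) {
            /* between prealloc_shrinker() and register_shrinker_prepared():
             * keep the bit, do not touch the shrinker yet */
    } else {
            /* fully registered: safe to call do_shrink_slab() */
            ret = do_shrink_slab(&sc, shrinker, priority);
    }
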
> This is the only thing the patch does: introduce a better way to detect
> a registering shrinker.  Nothing else.
> 
> Also this gives better assembly (the NULL and SHRINKER_REGISTERING
> checks fold into a single unsigned range comparison, and the shrinker
> memory is no longer dereferenced for the list_empty() check), but that's
> a minor side of the patch:
> 
> Before:
>   callq  <idr_find>
>   mov    %rax,%r15
>   test   %rax,%rax
>   je     <shrink_slab_memcg+0x1d5>
>   mov    0x20(%rax),%rax
>   lea    0x20(%r15),%rdx
>   cmp    %rax,%rdx
>   je     <shrink_slab_memcg+0xbd>
>   mov    0x8(%rsp),%edx
>   mov    %r15,%rsi
>   lea    0x10(%rsp),%rdi
>   callq  <do_shrink_slab>
> 
> After:
>   callq  <idr_find>
>   mov    %rax,%r15
>   lea    -0x1(%rax),%rax
>   cmp    $0xfffffffffffffffd,%rax
>   ja     <shrink_slab_memcg+0x1cd>
>   mov    0x8(%rsp),%edx
>   mov    %r15,%rsi
>   lea    0x10(%rsp),%rdi
>   callq  ffffffff810cefd0 <do_shrink_slab>
> 
> [ktkhai at virtuozzo.com: add #ifdef CONFIG_MEMCG_KMEM around idr_replace()]
>   Link: http://lkml.kernel.org/r/758b8fec-7573-47eb-b26a-7b2847ae7b8c@virtuozzo.com
> Link: http://lkml.kernel.org/r/153355467546.11522.4518015068123480218.stgit@localhost.localdomain
> Signed-off-by: Kirill Tkhai <ktkhai at virtuozzo.com>
> Reviewed-by: Andrew Morton <akpm at linux-foundation.org>
> Cc: Vladimir Davydov <vdavydov.dev at gmail.com>
> Cc: Michal Hocko <mhocko at suse.com>
> Cc: Andrey Ryabinin <aryabinin at virtuozzo.com>
> Cc: "Huang, Ying" <ying.huang at intel.com>
> Cc: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
> Cc: Matthew Wilcox <willy at infradead.org>
> Cc: Shakeel Butt <shakeelb at google.com>
> Cc: Josef Bacik <jbacik at fb.com>
> Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
> (cherry picked from commit 7e010df53c80197b23119e7d7b95892aa13629df)
> Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
> ---
>  mm/vmscan.c | 43 +++++++++++++++++++++----------------------
>  1 file changed, 21 insertions(+), 22 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 18e4f74e1ab3..580c8c6843b9 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -183,6 +183,20 @@ static LIST_HEAD(shrinker_list);
>  static DECLARE_RWSEM(shrinker_rwsem);
>  
>  #ifdef CONFIG_MEMCG_KMEM
> +
> +/*
> + * We allow subsystems to populate their shrinker-related
> + * LRU lists before register_shrinker_prepared() is called
> + * for the shrinker, since we don't want to impose
> + * restrictions on their internal registration order.
> + * In this case shrink_slab_memcg() may find corresponding
> + * bit is set in the shrinkers map.
> + *
> + * This value is used by the function to detect registering
> + * shrinkers and to skip do_shrink_slab() calls for them.
> + */
> +#define SHRINKER_REGISTERING ((struct shrinker *)~0UL)
> +
>  static DEFINE_IDR(shrinker_idr);
>  static int shrinker_nr_max;
>  
> @@ -192,7 +206,7 @@ static int prealloc_memcg_shrinker(struct shrinker *shrinker)
>  
>  	down_write(&shrinker_rwsem);
>  	/* This may call shrinker, so it must use down_read_trylock() */
> -	id = idr_alloc(&shrinker_idr, shrinker, 0, 0, GFP_KERNEL);
> +	id = idr_alloc(&shrinker_idr, SHRINKER_REGISTERING, 0, 0, GFP_KERNEL);
>  	if (id < 0)
>  		goto unlock;
>  
> @@ -332,21 +346,6 @@ int prealloc_shrinker(struct shrinker *shrinker)
>  	if (!shrinker->nr_deferred)
>  		return -ENOMEM;
>  
> -	/*
> -	 * There is a window between prealloc_shrinker()
> -	 * and register_shrinker_prepared(). We don't want
> -	 * to clear bit of a shrinker in such the state
> -	 * in shrink_slab_memcg(), since this will impose
> -	 * restrictions on a code registering a shrinker
> -	 * (they would have to guarantee, their LRU lists
> -	 * are empty till shrinker is completely registered).
> -	 * So, we differ the situation, when 1)a shrinker
> -	 * is semi-registered (id is assigned, but it has
> -	 * not yet linked to shrinker_list) and 2)shrinker
> -	 * is not registered (id is not assigned).
> -	 */
> -	INIT_LIST_HEAD(&shrinker->list);
> -
>  	if (shrinker->flags & SHRINKER_MEMCG_AWARE) {
>  		if (prealloc_memcg_shrinker(shrinker))
>  			goto free_deferred;
> @@ -376,6 +375,9 @@ void register_shrinker_prepared(struct shrinker *shrinker)
>  {
>  	down_write(&shrinker_rwsem);
>  	list_add_tail(&shrinker->list, &shrinker_list);
> +#ifdef CONFIG_MEMCG_KMEM
> +	idr_replace(&shrinker_idr, shrinker, shrinker->id);
> +#endif
>  	up_write(&shrinker_rwsem);
>  }
>  
> @@ -557,15 +559,12 @@ static unsigned long shrink_slab_memcg(gfp_t gfp_mask, int nid,
>  		struct shrinker *shrinker;
>  
>  		shrinker = idr_find(&shrinker_idr, i);
> -		if (unlikely(!shrinker)) {
> -			clear_bit(i, map->map);
> +		if (unlikely(!shrinker || shrinker == SHRINKER_REGISTERING)) {
> +			if (!shrinker)
> +				clear_bit(i, map->map);
>  			continue;
>  		}
>  
> -		/* See comment in prealloc_shrinker() */
> -		if (unlikely(list_empty(&shrinker->list)))
> -			continue;
> -
>  		ret = do_shrink_slab(&sc, shrinker, priority);
>  		if (ret == SHRINK_EMPTY) {
>  			clear_bit(i, map->map);
> 


