[Devel] [PATCH rh7] tcache: fix use-after-free in tcache_invalidate_node_pages()

Konstantin Khlebnikov khlebnikov at openvz.org
Sun Jan 10 08:08:14 PST 2016


On Tue, Dec 8, 2015 at 7:39 PM, Vladimir Davydov <vdavydov at virtuozzo.com> wrote:
> On Tue, Dec 08, 2015 at 06:52:50PM +0300, Andrey Ryabinin wrote:
>> tcache_invalidate_node_pages() temporarly drops/takes back node->tree_lock.
>> Once lock was dropped, another thread might remove and free the next slot.
>> Don't drop the looks.
>>
>> https://jira.sw.ru/browse/PSBM-42104
>>
>> Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
>> ---
>>  mm/tcache.c | 13 +++----------
>>  1 file changed, 3 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/tcache.c b/mm/tcache.c
>> index b8757cf..9bb88b4 100644
>> --- a/mm/tcache.c
>> +++ b/mm/tcache.c
>> @@ -121,8 +121,9 @@ static struct tcache_lru *tcache_lru_node;
>>  /*
>>   * Locking rules:
>>   *
>> - * - tcache_node_tree->lock nests inside tcache_node->tree_lock
>> - * - tcache_lru->lock is independent
>> + *   tcache_node->tree_lock
>> + *        tcache_node_tree->lock
>> + *        tcache_lru->lock
>>   */
>>
>>  /* Enable/disable tcache backend (set at boot time) */
>> @@ -677,16 +678,8 @@ tcache_invalidate_node_pages(struct tcache_node *node)
>>       radix_tree_for_each_slot(slot, &node->page_tree, &iter, 0) {
>>               page = radix_tree_deref_slot_protected(slot, &node->tree_lock);
>>               BUG_ON(!__tcache_page_tree_delete(node, page->index, page));
>> -             spin_unlock(&node->tree_lock);
>> -
>>               tcache_lru_del(page);
>>               put_page(page);
>> -
>> -             local_irq_enable();
>> -             cond_resched();
>> -             local_irq_disable();
>> -
>> -             spin_lock(&node->tree_lock);
>
> This might be OK as a temporary solution, but I think we still should
> drop the lock periodically, because this iteration can take long (think
> of a 10G file cached in tcache) so performing it w/o rescheduling and,
> what is worse, with irqs disabled is a no-go IMO.
>
> I would propose to make use of the ability of radix_tree_for_each_slot
> to continue iteration from the specified index. Something like this
> would do the trick I suppose (NOT TESTED):
>
> diff --git a/mm/tcache.c b/mm/tcache.c
> index b8757cf354a9..6a4c45970293 100644
> --- a/mm/tcache.c
> +++ b/mm/tcache.c
> @@ -662,6 +662,10 @@ tcache_invalidate_node_pages(struct tcache_node *node)
>         struct radix_tree_iter iter;
>         struct page *page;
>         void **slot;
> +       pgoff_t index = 0;
> +       int batch;
> +
> +#define TCACHE_INVALIDATE_BATCH                128
>
>         spin_lock_irq(&node->tree_lock);
>
> @@ -674,19 +678,26 @@ tcache_invalidate_node_pages(struct tcache_node *node)
>          * Now truncate all pages. Be careful, because pages can still be
>          * deleted from this node by the shrinker or by concurrent lookups.
>          */
> -       radix_tree_for_each_slot(slot, &node->page_tree, &iter, 0) {
> +restart:
> +       batch = 0;
> +       radix_tree_for_each_slot(slot, &node->page_tree, &iter, index) {
>                 page = radix_tree_deref_slot_protected(slot, &node->tree_lock);
> -               BUG_ON(!__tcache_page_tree_delete(node, page->index, page));
> -               spin_unlock(&node->tree_lock);
> -
> +               index = page->index;
> +               BUG_ON(!__tcache_page_tree_delete(node, index, page));
>                 tcache_lru_del(page);
>                 put_page(page);
>
> -               local_irq_enable();
> -               cond_resched();
> -               local_irq_disable();
> -
> -               spin_lock(&node->tree_lock);
> +               if (++batch > TCACHE_INVALIDATE_BATCH) {
> +                       spin_unlock_irq(&node->tree_lock);
> +                       cond_resched();
> +                       spin_lock_irq(&node->tree_lock);
> +                       /*
> +                        * Restart iteration over the radix tree, because the
> +                        * current node could have been freed when we dropped
> +                        * the lock.
> +                        */
> +                       goto restart;
> +               }
>         }
>
>         BUG_ON(node->nr_pages != 0);


or

radix_tree_for_each_chunk() {
   radix_tree_for_each_chunk_slot() {
   }
   cond_resched_lock_irq
}


> _______________________________________________
> Devel mailing list
> Devel at openvz.org
> https://lists.openvz.org/mailman/listinfo/devel


More information about the Devel mailing list