[Devel] [PATCH rh7] mm/memcontrol: memcg_move_account() get rid of lock_page dependency.

Andrey Ryabinin aryabinin at virtuozzo.com
Wed Mar 4 15:26:52 MSK 2020



On 3/4/20 12:32 PM, Kirill Tkhai wrote:
> On 03.03.2020 18:48, Andrey Ryabinin wrote:
>> Instead of trylock'ing the page, use the lru_lock to prevent the
>> racing against mem_cgroup_migrate(). So the memcg_move_account() doesn't
>> depend on hung IO.
>>
>> The trylock seems to be needed to prevent mem_cgroup_migrate() to see
>> disbalanced pc->mem_cgroup && pc->flags state. We could achieve the same
>> by using lru_lock instead. memcg_move_account() only works with isolated
>> from LRU pages, so it can race only with mem_cgroup_migrate() called
>> with lrucare=true, otherwise the page is not on LRU so the memcg_move_account()
>> can't see it.
>>
>> https://jira.sw.ru/browse/PSBM-101639
>> https://jira.sw.ru/browse/PSBM-101757
>> https://jira.sw.ru/browse/PSBM-94117
>> Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
>> ---
>>  mm/memcontrol.c | 15 ++++++++-------
>>  1 file changed, 8 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index f4629123e7fb..bfb5bbdf9ecb 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -3939,7 +3939,7 @@ static int mem_cgroup_move_account(struct page *page,
>>  				   struct mem_cgroup *to)
>>  {
>>  	unsigned long flags;
>> -	int ret;
>> +	int ret, isolated;
>>  
>>  	VM_BUG_ON(from == to);
>>  	VM_BUG_ON_PAGE(PageLRU(page), page);
>> @@ -3958,8 +3958,7 @@ static int mem_cgroup_move_account(struct page *page,
>>  	 * of its source page while we change it: page migration takes
>>  	 * both pages off the LRU, but page cache replacement doesn't.
>>  	 */
>> -	if (!trylock_page(page))
>> -		goto out;
>> +	lock_page_lru(page, &isolated);
> 
> Does this race with try_get_mem_cgroup_from_page()?
>   

It might be possible when mem_cgroup_move_account() called for online cgroup.
This happens when task moved between cgroups and memory.move_charge_at_immigrate set at certain value.
We normally don't use it. I'm not sure if such race is an actual problem per se, so I'm going add something
like bellow to avoid the race, just in case.

For offlined cgroup the race agains try_get_mem_cgroup_from_page() shouldn't be possible as it will bail
out on failed css_tryget().

---
 mm/memcontrol.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bfb5bbdf9ecb..dfde78a90ebf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3939,7 +3939,7 @@ static int mem_cgroup_move_account(struct page *page,
 				   struct mem_cgroup *to)
 {
 	unsigned long flags;
-	int ret, isolated;
+	int ret;
 
 	VM_BUG_ON(from == to);
 	VM_BUG_ON_PAGE(PageLRU(page), page);
@@ -3953,12 +3953,18 @@ static int mem_cgroup_move_account(struct page *page,
 	if (nr_pages > 1 && !PageTransHuge(page))
 		goto out;
 
-	/*
-	 * Prevent mem_cgroup_migrate() from looking at pc->mem_cgroup
-	 * of its source page while we change it: page migration takes
-	 * both pages off the LRU, but page cache replacement doesn't.
-	 */
-	lock_page_lru(page, &isolated);
+	if (!from->is_offline) {
+		/* Protect us from racing against try_get_mem_cgroup_from_page() */
+		if (!trylock_page(page))
+			goto out;
+	} else {
+		/*
+		 * Prevent mem_cgroup_migrate() from looking at pc->mem_cgroup
+		 * of its source page while we change it: page migration takes
+		 * both pages off the LRU, but page cache replacement doesn't.
+		 */
+		spin_lock_irq(&page_zone(page)->lru_lock);
+	}
 
 	ret = -EINVAL;
 	if (!PageCgroupUsed(pc) || pc->mem_cgroup != from)
@@ -3991,7 +3997,10 @@ static int mem_cgroup_move_account(struct page *page,
 	memcg_check_events(from, page);
 	local_irq_restore(flags);
 out_unlock:
-	unlock_page_lru(page, isolated);
+	if (!from->is_offline)
+		unlock_page(page);
+	else
+		spin_unlock_irq(&page_zone(page)->lru_lock);
 out:
 	return ret;
 }
-- 
2.24.1



More information about the Devel mailing list