[Devel] [PATCH RHEL7 COMMIT] ms/Don't trigger congestion wait on dirty-but-not-writeout pages

Konstantin Khorenko khorenko at virtuozzo.com
Thu Jul 13 18:40:42 MSK 2017


The commit is pushed to "branch-rh7-3.10.0-514.26.1.vz7.33.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.26.1.vz7.33.6
------>
commit e897b7fe90ade95ec744d62721678ff3c3c6c921
Author: Linus Torvalds <torvalds at linux-foundation.org>
Date:   Thu Jul 13 19:40:42 2017 +0400

    ms/Don't trigger congestion wait on dirty-but-not-writeout pages
    
    commit b738d764652dc5aab1c8939f637112981fce9e0e upstream.
    
    shrink_inactive_list() used to wait 0.1s to avoid congestion when all
    the pages that were isolated from the inactive list were dirty but not
    under active writeback.  That makes no real sense, and apparently causes
    major interactivity issues under some loads since 3.11.
    
    The ostensible reason for it was to wait for kswapd to start writing
    pages, but that seems questionable as well, since the congestion wait
    code seems to trigger for kswapd itself as well.  Also, the logic behind
    delaying anything when we haven't actually started writeback is not
    clear - it only delays actually starting that writeback.
    
    We'll still trigger the congestion waiting if
    
     (a) the process is kswapd, and we hit pages flagged for immediate
         reclaim
    
     (b) the process is not kswapd, and the zone backing dev writeback is
         actually congested.
    
    This probably needs to be revisited, but as it is this fixes a reported
    regression.
    
    [mhocko at suse.cz: backport to 3.12 stable tree]
    Fixes: e2be15f6c3ee ('mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered')
    Reported-by: Felipe Contreras <felipe.contreras at gmail.com>
    Pinpointed-by: Hillf Danton <dhillf at gmail.com>
    Cc: Michal Hocko <mhocko at suse.cz>
    Cc: Andrew Morton <akpm at linux-foundation.org>
    Cc: Mel Gorman <mgorman at suse.de>
    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
    Signed-off-by: Michal Hocko <mhocko at suse.cz>
    Signed-off-by: Jiri Slaby <jslaby at suse.cz>
    
    Applied in the scope of
    https://jira.sw.ru/browse/PSBM-68029
    
    Signed-off-by: Andrey Ryabinin <aryabinin at virtuozzo.com>
---
 mm/vmscan.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6bf978f..1b4471e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1618,19 +1618,18 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		 * If dirty pages are scanned that are not queued for IO, it
 		 * implies that flushers are not keeping up. In this case, flag
 		 * the zone ZONE_TAIL_LRU_DIRTY and kswapd will start writing
-		 * pages from reclaim context. It will forcibly stall in the
-		 * next check.
+		 * pages from reclaim context.
 		 */
 		if (nr_unqueued_dirty == nr_taken)
 			zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY);
 
 		/*
-		 * In addition, if kswapd scans pages marked marked for
-		 * immediate reclaim and under writeback (nr_immediate), it
-		 * implies that pages are cycling through the LRU faster than
+		 * If kswapd scans pages marked marked for immediate
+		 * reclaim and under writeback (nr_immediate), it implies
+		 * that pages are cycling through the LRU faster than
 		 * they are written so also forcibly stall.
 		 */
-		if (nr_unqueued_dirty == nr_taken || nr_immediate)
+		if (nr_immediate)
 			congestion_wait(BLK_RW_ASYNC, HZ/10);
 	}
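
For readers skimming the hunk above, the behavioural change can be sketched
as a pair of predicates. This is a user-space illustration only, not kernel
code: struct scan_result and the two helper functions are hypothetical names
introduced here, and they model only the single conditional that guards
congestion_wait() in the diff, using the counter names from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the counters used in shrink_inactive_list(). */
struct scan_result {
	unsigned long nr_taken;          /* pages isolated from the inactive list */
	unsigned long nr_unqueued_dirty; /* dirty pages not yet queued for IO */
	unsigned long nr_immediate;      /* pages flagged for immediate reclaim */
};

/* Old condition: also stall when every isolated page is dirty but unqueued. */
static inline bool stalls_before_patch(const struct scan_result *r)
{
	return r->nr_unqueued_dirty == r->nr_taken || r->nr_immediate != 0;
}

/* New condition: stall only when immediate-reclaim pages were encountered. */
static inline bool stalls_after_patch(const struct scan_result *r)
{
	return r->nr_immediate != 0;
}
```

With nr_taken == nr_unqueued_dirty and nr_immediate == 0, the first predicate
is true and the second is false, which is the case the commit message
describes as the source of the 0.1s stall.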
 

