[Devel] [PATCH rh7] fs/mm: writeback: fix per bdi dirty background threshold calculation

Vladimir Davydov vdavydov at virtuozzo.com
Mon Apr 11 09:49:42 PDT 2016


After patch [1] introduced upper and lower boundaries for the per-bdi
dirty threshold (see bdi->min_dirty_pages and bdi->max_dirty_pages), it
is incorrect to use the bdi_dirty_limit() helper for calculating the
background threshold. E.g. on a 16 GB host, bdi_dirty_limit() would
return the following values for a FUSE device if the upper boundary
were unset:

  bdi_thresh = (16 GB * 20 / 100) * 20 / 100 = 655 MB
                ^^^^^   ^^^^^^^^    ^^^^^^^^
              RAM size     |     bdi->max_ratio
                           |
                    vm.dirty_ratio

  bdi_bg_thresh = (16 GB * 10 / 100) * 20 / 100 = 327 MB
                   ^^^^^   ^^^^^^^^    ^^^^^^^^
                 RAM size     |     bdi->max_ratio
                              |
                   vm.dirty_background_ratio

which looks fine.

However, with the default upper boundary of 256 MB for FUSE devices,
both values exceed the boundary and get clamped, so the dirty and
background thresholds are both equal to 256 MB. As a result, the
background flusher only wakes up once the writer is already throttled,
which severely degrades the write rate.
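
The collapse of both thresholds to 256 MB can be reproduced with a small
userspace sketch. It is not the kernel code: model_bdi_limit() and
clamp_u64() are hypothetical names, the values are in MB rather than
pages, and the boundary is assumed to be applied as a plain clamp.

```c
#include <assert.h>
#include <stdint.h>

/* Values are in MB for readability; the kernel works in pages. */

/* Restrict v to the range [lo, hi]. */
static uint64_t clamp_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
	assert(lo <= hi);
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Hypothetical stand-in for bdi_dirty_limit(): take max_ratio percent of
 * the given global threshold, then apply the per-bdi boundaries.
 */
static uint64_t model_bdi_limit(uint64_t global_mb, uint64_t max_ratio,
				uint64_t min_mb, uint64_t max_mb)
{
	return clamp_u64(global_mb * max_ratio / 100, min_mb, max_mb);
}
```

With the global thresholds from the example (3276 MB dirty, 1638 MB
background), max_ratio = 20 and max_mb = 256, both calls return 256, so
the background threshold is no longer below the throttle threshold.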

To fix this issue, let's use the bdi_dirty_limit() helper only for
calculating the throttle threshold, and derive the background threshold
from it so that the two keep the same proportion as the global ones:

  bdi_bg_thresh = bdi_thresh * global_background_thresh / global_thresh
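
The formula above can be checked in userspace with the sketch below.
scale_bg_thresh() is a hypothetical name and the values are in MB; the
64-bit intermediate product plays the role of the (u64) cast used with
div_u64() in the diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale the per-bdi throttle threshold by the global background/dirty
 * proportion. The product is computed in 64 bits so that it cannot
 * overflow for realistic threshold values.
 */
static uint64_t scale_bg_thresh(uint64_t bdi_thresh, uint64_t global_bg,
				uint64_t global_thresh)
{
	assert(global_thresh > 0);	/* avoid division by zero */
	return bdi_thresh * global_bg / global_thresh;
}
```

For the 16 GB example, scale_bg_thresh(256, 1638, 3276) gives 128, i.e.
the background threshold stays at half of the clamped throttle
threshold instead of coinciding with it.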

https://jira.sw.ru/browse/PSBM-45497

Fixes: 2f5b9552e256d ("fuse: improve bdi dirty memory limits for fuse") [1]
Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
---
 fs/fs-writeback.c   | 8 ++++++--
 mm/page-writeback.c | 7 +++++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index b6f2e3f6dd8f..55eca543f921 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -835,6 +835,7 @@ long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages,
 static bool over_bground_thresh(struct backing_dev_info *bdi)
 {
 	unsigned long background_thresh, dirty_thresh;
+	unsigned long bdi_thresh, bdi_bg_thresh;
 
 	global_dirty_limits(&background_thresh, &dirty_thresh);
 
@@ -842,8 +843,11 @@ static bool over_bground_thresh(struct backing_dev_info *bdi)
 	    global_page_state(NR_UNSTABLE_NFS) > background_thresh)
 		return true;
 
-	if (bdi_stat(bdi, BDI_RECLAIMABLE) >
-				bdi_dirty_limit(bdi, background_thresh))
+	bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
+	bdi_bg_thresh = div_u64((u64)bdi_thresh * background_thresh,
+				dirty_thresh);
+
+	if (bdi_stat(bdi, BDI_RECLAIMABLE) > bdi_bg_thresh)
 		return true;
 
 	return false;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 64a64f330a27..35e3ba8ac566 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1129,12 +1129,15 @@ static void bdi_update_dirty_ratelimit(struct backing_dev_info *bdi,
 	 * of backing device (see the implementation of bdi_dirty_limit()).
 	 */
 	if (unlikely(bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
+		unsigned long bdi_bg_thresh;
+
+		bdi_bg_thresh = div_u64((u64)bdi_thresh * bg_thresh, thresh);
+
 		dirty = bdi_dirty;
 		if (bdi_dirty < 8)
 			setpoint = bdi_dirty + 1;
 		else
-			setpoint = (bdi_thresh +
-				    bdi_dirty_limit(bdi, bg_thresh)) / 2;
+			setpoint = (bdi_thresh + bdi_bg_thresh) / 2;
 	}
 
 	if (dirty < setpoint) {
-- 
2.1.4


