[Devel] [PATCH rh7] fs/mm: writeback: fix per bdi dirty background threshold calculation
Maxim Patlasov
mpatlasov at virtuozzo.com
Mon Apr 11 16:52:23 PDT 2016
Acked-by: Maxim Patlasov <mpatlasov at virtuozzo.com>
On 04/11/2016 09:49 AM, Vladimir Davydov wrote:
> After patch [1] introduced upper and lower boundaries for the per-bdi dirty
> threshold (see bdi->min_dirty_pages and max_dirty_pages), it is
> incorrect to use the bdi_dirty_limit() helper for calculating the background
> threshold. E.g. on a 16 GB host, bdi_dirty_limit() would return the
> following values for a FUSE device if the upper boundary was unset:
>
>    bdi_thresh = (16 GB * 20 / 100) * 20 / 100 = 655 MB
>                  ^^^^^   ^^^^^^^^    ^^^^^^^^
>                RAM size     |     bdi->max_ratio
>                             |
>                       vm.dirty_ratio
>
>    bdi_bg_thresh = (16 GB * 10 / 100) * 20 / 100 = 327 MB
>                     ^^^^^   ^^^^^^^^    ^^^^^^^^
>                   RAM size     |     bdi->max_ratio
>                                |
>                    vm.dirty_background_ratio
>
> which looks fine.
>
> However, with the default upper boundary of 256 MB for FUSE devices,
> both the dirty and the background thresholds end up equal to 256 MB. As a
> result, the background flusher only wakes up once the writer is already
> throttled, which causes a severe write rate degradation.
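
To make the collision concrete, here is a rough userspace model of the numbers above. It is a sketch, not the kernel code: model_global_limit() and model_bdi_limit() are hypothetical stand-ins that only scale by a percentage and apply the upper boundary. They ignore the bdi's share of write bandwidth and the lower boundary, and they work in MB rather than pages.

```c
#include <stdint.h>

/* Hypothetical model of a global limit: a percentage of RAM, in MB. */
static uint64_t model_global_limit(uint64_t ram_mb, unsigned int ratio)
{
	return ram_mb * ratio / 100;
}

/*
 * Hypothetical model of the per-bdi limit (assumption: simplified from
 * the description in the commit message). Scale the global limit by
 * max_ratio, then cap it at max_dirty_mb; 0 means "no upper boundary".
 */
static uint64_t model_bdi_limit(uint64_t global_mb, unsigned int max_ratio,
				uint64_t max_dirty_mb)
{
	uint64_t limit = global_mb * max_ratio / 100;

	if (max_dirty_mb && limit > max_dirty_mb)
		limit = max_dirty_mb;
	return limit;
}
```

With 16384 MB of RAM, ratios of 20 and 10, and max_ratio 20, this model gives 655 and 327 without a cap. With a 256 MB cap both calls return 256, which is the case where the background threshold stops being lower than the throttle threshold.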
>
> To fix this issue, let's use the bdi_dirty_limit() helper only for
> calculating the throttle threshold, and compute the background threshold
> as follows:
>
> bdi_bg_thresh = bdi_thresh * global_background_thresh / global_thresh
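
As a quick check of the formula, the sketch below scales an already-limited bdi_thresh by the global background/dirty proportion. model_bdi_bg_thresh() is a hypothetical userspace helper, not a kernel function; the zero check on thresh is a defensive addition of this sketch and is not part of the patch. Values are in MB for readability.

```c
#include <stdint.h>

/*
 * Hypothetical model of the proposed calculation:
 *   bdi_bg_thresh = bdi_thresh * bg_thresh / thresh
 * The multiplication is done in 64 bits, mirroring the (u64) cast
 * used with div_u64() in the patch.
 */
static uint64_t model_bdi_bg_thresh(uint64_t bdi_thresh, uint64_t bg_thresh,
				    uint64_t thresh)
{
	if (thresh == 0)	/* guard added for the sketch only */
		return 0;
	return bdi_thresh * bg_thresh / thresh;
}
```

Using the 16 GB example (global thresholds of 3276 MB and 1638 MB), a bdi_thresh capped at 256 MB gives a background threshold of 128 MB, and the uncapped 655 MB gives 327 MB, so the uncapped case matches the earlier figure.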
>
> https://jira.sw.ru/browse/PSBM-45497
>
> Fixes: 2f5b9552e256d ("fuse: improve bdi dirty memory limits for fuse") [1]
> Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
> ---
> fs/fs-writeback.c | 8 ++++++--
> mm/page-writeback.c | 7 +++++--
> 2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index b6f2e3f6dd8f..55eca543f921 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -835,6 +835,7 @@ long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages,
>  static bool over_bground_thresh(struct backing_dev_info *bdi)
>  {
>  	unsigned long background_thresh, dirty_thresh;
> +	unsigned long bdi_thresh, bdi_bg_thresh;
>
>  	global_dirty_limits(&background_thresh, &dirty_thresh);
>
> @@ -842,8 +843,11 @@ static bool over_bground_thresh(struct backing_dev_info *bdi)
>  	    global_page_state(NR_UNSTABLE_NFS) > background_thresh)
>  		return true;
>
> -	if (bdi_stat(bdi, BDI_RECLAIMABLE) >
> -				bdi_dirty_limit(bdi, background_thresh))
> +	bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> +	bdi_bg_thresh = div_u64((u64)bdi_thresh * background_thresh,
> +				dirty_thresh);
> +
> +	if (bdi_stat(bdi, BDI_RECLAIMABLE) > bdi_bg_thresh)
>  		return true;
>
>  	return false;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 64a64f330a27..35e3ba8ac566 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1129,12 +1129,15 @@ static void bdi_update_dirty_ratelimit(struct backing_dev_info *bdi,
>  	 * of backing device (see the implementation of bdi_dirty_limit()).
>  	 */
>  	if (unlikely(bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
> +		unsigned long bdi_bg_thresh;
> +
> +		bdi_bg_thresh = div_u64((u64)bdi_thresh * bg_thresh, thresh);
> +
>  		dirty = bdi_dirty;
>  		if (bdi_dirty < 8)
>  			setpoint = bdi_dirty + 1;
>  		else
> -			setpoint = (bdi_thresh +
> -				    bdi_dirty_limit(bdi, bg_thresh)) / 2;
> +			setpoint = (bdi_thresh + bdi_bg_thresh) / 2;
>  	}
>
>  	if (dirty < setpoint) {
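
The effect on the strict-limit setpoint in the second hunk can be illustrated the same way. This is only a userspace sketch of the two branches visible in the diff; model_strictlimit_setpoint() is a hypothetical name, and the page counts assume 4 KB pages (256 MB = 65536 pages).

```c
#include <stdint.h>

/*
 * Hypothetical model of the setpoint selection shown in the hunk:
 * with very few dirty pages the setpoint sits just above them,
 * otherwise it is the midpoint of the two per-bdi thresholds.
 */
static uint64_t model_strictlimit_setpoint(uint64_t bdi_dirty,
					   uint64_t bdi_thresh,
					   uint64_t bdi_bg_thresh)
{
	if (bdi_dirty < 8)
		return bdi_dirty + 1;
	return (bdi_thresh + bdi_bg_thresh) / 2;
}
```

If both thresholds are 65536 pages, as happens when the background value is also capped at 256 MB, the midpoint is 65536, i.e. the setpoint coincides with the throttle threshold. With a background threshold of 32768 pages (128 MB) the setpoint drops to 49152 pages.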