[Devel] [PATCH rh7] fs/mm: writeback: fix per bdi dirty background threshold calculation

Maxim Patlasov mpatlasov at virtuozzo.com
Mon Apr 11 16:52:23 PDT 2016


Acked-by: Maxim Patlasov <mpatlasov at virtuozzo.com>

On 04/11/2016 09:49 AM, Vladimir Davydov wrote:
> After patch [1] introduced upper and lower boundaries for the per-bdi
> dirty threshold (see bdi->min_dirty_pages and max_dirty_pages), it is
> incorrect to use the bdi_dirty_limit() helper for calculating the
> background threshold. E.g. on a 16 GB host, bdi_dirty_limit() would
> return the following values for a FUSE device if the upper boundary
> were unset:
>
>    bdi_thresh = (16 GB * 20 / 100) * 20 / 100 = 655 MB
>                  ^^^^^   ^^^^^^^^    ^^^^^^^^
>                RAM size           bdi->max_ratio
>
>                      vm.dirty_ratio
>
>    bdi_bg_thresh = (16 GB * 10 / 100) * 20 / 100 = 327 MB
>                     ^^^^^   ^^^^^^^^    ^^^^^^^^
>                   RAM size           bdi->max_ratio
>
>                     vm.dirty_background_ratio
>
> which looks fine.
>
> However, with the default upper boundary of 256 MB for FUSE devices,
> both values exceed the boundary and are clamped to it, so the dirty and
> background thresholds both end up equal to 256 MB. As a result, the
> background flusher only wakes up once the writer is already throttled,
> which severely degrades the write rate.
>
> To fix this issue, use the bdi_dirty_limit() helper only for
> calculating the throttle threshold, and compute the background
> threshold by scaling it with the ratio of the global thresholds:
>
>    bdi_bg_thresh = bdi_thresh * global_background_thresh / global_thresh
>
> https://jira.sw.ru/browse/PSBM-45497
>
> Fixes: 2f5b9552e256d ("fuse: improve bdi dirty memory limits for fuse") [1]
> Signed-off-by: Vladimir Davydov <vdavydov at virtuozzo.com>
> ---
>   fs/fs-writeback.c   | 8 ++++++--
>   mm/page-writeback.c | 7 +++++--
>   2 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index b6f2e3f6dd8f..55eca543f921 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -835,6 +835,7 @@ long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages,
>   static bool over_bground_thresh(struct backing_dev_info *bdi)
>   {
>   	unsigned long background_thresh, dirty_thresh;
> +	unsigned long bdi_thresh, bdi_bg_thresh;
>   
>   	global_dirty_limits(&background_thresh, &dirty_thresh);
>   
> @@ -842,8 +843,11 @@ static bool over_bground_thresh(struct backing_dev_info *bdi)
>   	    global_page_state(NR_UNSTABLE_NFS) > background_thresh)
>   		return true;
>   
> -	if (bdi_stat(bdi, BDI_RECLAIMABLE) >
> -				bdi_dirty_limit(bdi, background_thresh))
> +	bdi_thresh = bdi_dirty_limit(bdi, dirty_thresh);
> +	bdi_bg_thresh = div_u64((u64)bdi_thresh * background_thresh,
> +				dirty_thresh);
> +
> +	if (bdi_stat(bdi, BDI_RECLAIMABLE) > bdi_bg_thresh)
>   		return true;
>   
>   	return false;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 64a64f330a27..35e3ba8ac566 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -1129,12 +1129,15 @@ static void bdi_update_dirty_ratelimit(struct backing_dev_info *bdi,
>   	 * of backing device (see the implementation of bdi_dirty_limit()).
>   	 */
>   	if (unlikely(bdi->capabilities & BDI_CAP_STRICTLIMIT)) {
> +		unsigned long bdi_bg_thresh;
> +
> +		bdi_bg_thresh = div_u64((u64)bdi_thresh * bg_thresh, thresh);
> +
>   		dirty = bdi_dirty;
>   		if (bdi_dirty < 8)
>   			setpoint = bdi_dirty + 1;
>   		else
> -			setpoint = (bdi_thresh +
> -				    bdi_dirty_limit(bdi, bg_thresh)) / 2;
> +			setpoint = (bdi_thresh + bdi_bg_thresh) / 2;
>   	}
>   
>   	if (dirty < setpoint) {
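
For reference, the arithmetic in the changelog can be reproduced with a
small userspace model. This is only a sketch under stated assumptions:
the model_* functions are hypothetical stand-ins, not kernel code, units
are MB rather than pages, and the real bdi_dirty_limit() also accounts
for the device's share of recent writeout, which is ignored here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for bdi_dirty_limit(): scale a global
 * threshold by max_ratio (percent), then clamp it to an upper boundary.
 * max_dirty == 0 means "upper boundary unset". Units are MB, not pages.
 */
static uint64_t model_bdi_dirty_limit(uint64_t thresh, unsigned int max_ratio,
				      uint64_t max_dirty)
{
	uint64_t limit = thresh * max_ratio / 100;

	if (max_dirty && limit > max_dirty)
		limit = max_dirty;
	return limit;
}

/*
 * Background threshold as computed by the patch: scale the per-bdi
 * throttle threshold by the global background/dirty proportion.
 * 64-bit math mirrors the (u64) cast used before div_u64() in the patch.
 */
static uint64_t model_bdi_bg_thresh(uint64_t bdi_thresh, uint64_t bg_thresh,
				    uint64_t thresh)
{
	assert(thresh != 0);	/* the divisor must be non-zero */
	return bdi_thresh * bg_thresh / thresh;
}

/* Setpoint used in the BDI_CAP_STRICTLIMIT branch after the patch. */
static uint64_t model_strictlimit_setpoint(uint64_t bdi_thresh,
					   uint64_t bdi_bg_thresh)
{
	return (bdi_thresh + bdi_bg_thresh) / 2;
}
```

With the changelog's numbers (16 GB of RAM, vm.dirty_ratio = 20,
vm.dirty_background_ratio = 10, max_ratio = 20, 256 MB upper boundary),
calling model_bdi_dirty_limit() on the background threshold reproduces
the old behaviour, where both per-bdi values collapse to 256 MB.
model_bdi_bg_thresh() instead yields 128 MB, so in this model the flusher
would start at half of the throttle limit.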


