[Devel] [PATCH rh7 2/2] mm/page-writeback: Introduce per-CT dirty memory limit.

Mon Jan 18 03:42:37 PST 2016

On Fri, Jan 15, 2016 at 06:17:25PM +0300, Andrey Ryabinin wrote:

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 91c1b07..836ce88 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -195,6 +195,28 @@ void bdi_start_background_writeback(struct backing_dev_info *bdi)
>  	bdi_wakeup_thread(bdi);
>  }
>  
> +/**
> + * bdi_start_background_writeback_ub - start background writeback for ub
> + * @bdi: the backing device to write from
> + * @ub: task's io beancounter
> + *
> + * Description:
> + *   This makes sure WB_SYNC_NONE background writeback happens. When
> + *   this function returns, it is only guaranteed that for given BDI
> + *   some IO is happening if we are over background dirty threshold.
> + *   Caller need not hold sb s_umount semaphore.
> + */
> +void bdi_start_background_writeback_ub(struct backing_dev_info *bdi,
> +				struct user_beancounter *ub)
> +{
> +	/*
> +	 * We just wake up the flusher thread. It will perform background
> +	 * writeback as soon as there is no other work to do.
> +	 */
> +	trace_writeback_wake_background(bdi);
> +	__bdi_start_writeback(bdi, LONG_MAX, true, WB_REASON_BACKGROUND, ub);
> +}
> +
>  /*
>   * Remove the inode from the writeback list it is on.
>   */
> @@ -708,6 +730,15 @@ static long writeback_sb_inodes(struct super_block *sb,
>  		 * kind writeout is handled by the freer.
>  		 */
>  		spin_lock(&inode->i_lock);
> +		/* Filter ub inodes if bdi dirty limit isn't exceeded */
> +		if (work->ub && !wb->bdi->dirty_exceeded &&
> +		    (inode->i_state & I_DIRTY) == I_DIRTY_PAGES &&
> +		    ub_should_skip_writeback(work->ub, inode)) {
> +			spin_unlock(&inode->i_lock);
> +			redirty_tail(inode, wb);
> +			continue;
> +		}
> +
>  		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
>  			spin_unlock(&inode->i_lock);
>  			redirty_tail(inode, wb);

I think the two hunks above should go to patch #1.

> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index b7668cf..ae0e828 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
...
> @@ -1654,6 +1894,12 @@ void balance_dirty_pages_ratelimited(struct address_space *mapping)
>  	struct backing_dev_info *bdi = mapping->backing_dev_info;
>  	int ratelimit;
>  	int *p;
> +	struct user_beancounter *ub = get_io_ub();
> +
> +	if (ub != get_ub0()) {
> +		balance_dirty_pages_ratelimited_nr(mapping, 1);
> +		return;
> +	}

I don't think it's a good idea to skip global background writeback
altogether for containers. I'd rather use per-ub writeback in
conjunction with the global one: if we exceed the per-ub dirty limit,
do per-ub writeback, then check the global dirty limit and perform
global writeback if needed. I guess the global writeback code will be
invoked more often, and it works better, so we'd better make use of it
whenever possible.
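
Roughly, the control flow suggested above could be modelled like this.
It is only a user-space sketch, not kernel code: struct ub_model,
struct global_model, the WB_* flags and balance_dirty_model() are all
made up for illustration and appear in neither the patch nor the
kernel. The point is just that the per-ub check does not return early,
so the global check still runs for container tasks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for per-container and global dirty state. */
struct ub_model {
	unsigned long nr_dirty;		/* pages dirtied by this container */
	unsigned long dirty_limit;	/* per-container dirty limit */
};

struct global_model {
	unsigned long nr_dirty;		/* pages dirty system-wide */
	unsigned long dirty_limit;	/* global dirty limit */
};

/* Which kinds of writeback the model decided to trigger. */
enum {
	WB_NONE   = 0,
	WB_PER_UB = 1 << 0,
	WB_GLOBAL = 1 << 1,
};

/*
 * Return a mask of the writeback kinds that would be started.
 * ub == NULL models a task running in the host beancounter (ub0).
 */
int balance_dirty_model(const struct ub_model *ub,
			const struct global_model *g)
{
	int mask = WB_NONE;

	assert(g != NULL);

	/* Step 1: per-container check, only for tasks inside a container. */
	if (ub && ub->nr_dirty > ub->dirty_limit)
		mask |= WB_PER_UB;

	/* Step 2: no early return above, so the global check always runs. */
	if (g->nr_dirty > g->dirty_limit)
		mask |= WB_GLOBAL;

	return mask;
}
```

With this shape a container that is over its own limit while the
system is also over the global one gets both kinds of writeback,
whereas the early return in the hunk above would give it only the
per-ub path.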

>  
>  	if (!bdi_cap_account_dirty(bdi))
>  		return;