[Devel] Re: [PATCH] io-controller: Add io group reference handling for request
Andrea Righi
righi.andrea at gmail.com
Sun May 17 03:26:06 PDT 2009
On Fri, May 15, 2009 at 10:06:43AM -0400, Vivek Goyal wrote:
> On Fri, May 15, 2009 at 09:48:40AM +0200, Andrea Righi wrote:
> > On Fri, May 15, 2009 at 01:15:24PM +0800, Gui Jianfeng wrote:
> > > Vivek Goyal wrote:
> > > ...
> > > > }
> > > > @@ -1462,20 +1462,27 @@ struct io_cgroup *get_iocg_from_bio(stru
> > > > /*
> > > > * Find the io group bio belongs to.
> > > > * If "create" is set, io group is created if it is not already present.
> > > > + * If "curr" is set, io group information is searched for the current
> > > > + * task and not with the help of the bio.
> > > > + *
> > > > + * FIXME: Can we assume that if bio is NULL then lookup group for current
> > > > + * task and not create extra function parameter ?
> > > > *
> > > > - * Note: There is a narrow window of race where a group is being freed
> > > > - * by cgroup deletion path and some rq has slipped through in this group.
> > > > - * Fix it.
> > > > */
> > > > -struct io_group *io_get_io_group_bio(struct request_queue *q, struct bio *bio,
> > > > - int create)
> > > > +struct io_group *io_get_io_group(struct request_queue *q, struct bio *bio,
> > > > + int create, int curr)
> > >
> > > Hi Vivek,
> > >
> > > IIUC we can get rid of curr, and just determine iog from bio. If bio is not NULL,
> > > get iog from bio, otherwise get it from current task.
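Something like the following is the shape that lookup would take (just a
rough sketch; task_to_iocg() and io_find_alloc_group() are made-up helper
names, and group reference handling is left out):

struct io_group *io_get_io_group(struct request_queue *q, struct bio *bio,
				 int create)
{
	struct io_cgroup *iocg;

	if (bio) {
		/* page-owner based lookup, via get_blkio_cgroup_id() */
		iocg = get_iocg_from_bio(bio);
	} else {
		/* no bio: charge the IO to the submitting task */
		rcu_read_lock();
		iocg = task_to_iocg(current);
		rcu_read_unlock();
	}

	return io_find_alloc_group(q, iocg, create);
}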
> >
> > Consider also that get_cgroup_from_bio() is much slower than
> > task_cgroup() and needs to lock/unlock_page_cgroup() in
> > get_blkio_cgroup_id(), while task_cgroup() is rcu protected.
> >
>
> True.
>
> > BTW, another optimization could be to use the blkio-cgroup functionality
> > only for dirty pages and cut out some blkio_set_owner() calls. For all the
> > other cases the IO always occurs in the context of the current task,
> > and you can use task_cgroup().
> >
>
> Yes, maybe in some cases we can avoid setting the page owner. I will get
> to it once I have the functionality going well. In the meantime, if
> you have a patch for it, that would be great.
>
> > However, this is true only for page cache pages; for IO generated by
> > anonymous pages (swap) you still need the page tracking functionality,
> > both for reads and writes.
> >
>
> Right now I am assuming that all the sync IO will belong to the task
> submitting the bio, hence I use task_cgroup() for that. Only for async
> IO am I trying to use the page tracking functionality to determine the owner.
> Look at elv_bio_sync(bio).
>
> You seem to be saying that there are cases where even for sync IO, we
> can't use the submitting task's context and need to rely on the page tracking
> functionality? In the case of getting a page (read) from swap, will it not
> happen in the context of the process that takes the page fault and initiates
> the swap read?
No, for example in read_swap_cache_async():
@@ -308,6 +309,7 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		 */
 		__set_page_locked(new_page);
 		SetPageSwapBacked(new_page);
+		blkio_cgroup_set_owner(new_page, current->mm);
 		err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
 		if (likely(!err)) {
 			/*
This is a read, but the current task is not always the owner of this
swap cache page, because it's a readahead operation.
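To see why, this is roughly what the readahead path looks like (simplified
from mm/swap_state.c, error handling omitted): besides the entry that
faulted, it also pulls in neighbouring swap slots, and those may belong to
completely different tasks.

struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
			      struct vm_area_struct *vma, unsigned long addr)
{
	unsigned long offset, end_offset;
	struct page *page;
	int nr_pages;

	/* pick a readaround window of swap slots near the faulting entry */
	nr_pages = valid_swaphandles(entry, &offset);
	end_offset = offset + nr_pages;

	for (; offset < end_offset; offset++) {
		/* these slots were not necessarily faulted in by "current" */
		page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
					     gfp_mask, vma, addr);
		if (!page)
			break;
		page_cache_release(page);
	}
	lru_add_drain();
	/* finally read the page that actually faulted */
	return read_swap_cache_async(entry, gfp_mask, vma, addr);
}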
Anyway, this is a minor corner case, I think, and it is probably safe to
treat it like any other read IO and get rid of the
blkio_cgroup_set_owner().
I wonder if it would be better to attach the blkio_cgroup to the
anonymous page only when swap-out occurs. I mean, just put the
blkio_cgroup_set_owner() hook in try_to_unmap() in order to keep track of
the IO generated by direct reclaim of anon memory. For all the other
cases we can simply use the submitting task's context.
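Roughly what I mean, in the anonymous-page branch of try_to_unmap_one()
in mm/rmap.c (only a sketch, the exact hook placement and locking would
need checking):

	if (PageAnon(page)) {
		swp_entry_t entry = { .val = page_private(page) };

		if (PageSwapCache(page)) {
			/*
			 * The page is headed for swap: record the mm it was
			 * mapped in as the IO owner, so that the writeback
			 * bio can later be charged to the right cgroup.
			 */
			blkio_cgroup_set_owner(page, mm);
			swap_duplicate(entry);
		}
		/* ... then replace the pte with the swap entry, as usual ... */
		set_pte_at(mm, address, pte, swp_entry_to_pte(entry));
	}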
BTW, O_DIRECT is another case that could be optimized, because all
the bios generated by direct IO are submitted in the context of the current
task.
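In that case the classification could stay entirely on the task side,
something like this (sketch only; task_to_iocg() is a made-up name for an
rcu/task_cgroup() based lookup, and it assumes O_DIRECT writes end up
flagged as sync):

static struct io_cgroup *iocg_for_bio(struct bio *bio)
{
	/*
	 * Sync bios (reads and O_DIRECT) are submitted by the task that
	 * owns the IO, so the cheap task lookup is enough; only buffered
	 * writeback really needs the per-page owner tracking.
	 */
	if (elv_bio_sync(bio))
		return task_to_iocg(current);

	return get_iocg_from_bio(bio);
}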
-Andrea
_______________________________________________
Containers mailing list
Containers at lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers