[Devel] [PATCH v7 09/11] sched: record per-cgroup number of context switches

Tejun Heo tj at kernel.org
Thu Jun 6 17:04:52 PDT 2013


Hello,

Maybe we should break the addition of switch stats off into a separate
patchset?  They are two separate things.

On Wed, May 29, 2013 at 03:03:20PM +0400, Glauber Costa wrote:
> @@ -3642,6 +3642,8 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
>  		prev->sched_class->put_prev_task(rq, prev);
>  
>  	do {
> +		if (likely(prev))
> +			cfs_rq->nr_switches++;
>  		se = pick_next_entity(cfs_rq);
>  		set_next_entity(cfs_rq, se);
>  		cfs_rq = group_cfs_rq(se);
> @@ -3651,6 +3653,22 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev)
>  	if (hrtick_enabled(rq))
>  		hrtick_start_fair(rq, p);
>  
> +	/*
> +	 * This condition is extremely unlikely, and most of the time will just
> +	 * consist of this unlikely branch, which is extremely cheap. But we
> +	 * still need to have it, because when we first loop through cfs_rq's,
> +	 * we can't possibly know which task we will pick. The call to
> +	 * set_next_entity above is not meant to mess up the tree in this case,
> +	 * so this should give us the same chain, in the same order.
> +	 */
> +	if (unlikely(p == prev)) {
> +		se = &p->se;
> +		for_each_sched_entity(se) {
> +			cfs_rq = cfs_rq_of(se);
> +			cfs_rq->nr_switches--;
> +		}
> +	}
> +

This concern may be fringe, but the above breaks the monotonically
increasing property of the stat.  Depending on the timing, a very
unlucky consumer of the stat may see the counter go backward, which
can lead to nasty things (for example, a wrapped or negative delta).
I'm not sure whether the fact that it'd be very difficult to trigger
is a pro or a con.
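
To make the failure mode concrete, here is a minimal userspace sketch of
what a consumer could compute if it samples on both sides of the
rollback.  It is not kernel code: struct fake_cfs_rq, switches_since()
and delta_across_rollback() are hypothetical names made up for
illustration, and the exact sampling points are an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cfs_rq counter; not the kernel struct. */
struct fake_cfs_rq {
	uint64_t nr_switches;
};

/* What a typical stat consumer does: subtract the previous sample. */
static uint64_t switches_since(uint64_t prev_sample, uint64_t cur_sample)
{
	return cur_sample - prev_sample;	/* wraps if the counter went backward */
}

/*
 * Replay the patch's increment/decrement sequence with a reader sampling
 * on both sides of the rollback, and return the delta that reader sees.
 */
static uint64_t delta_across_rollback(struct fake_cfs_rq *cfs_rq)
{
	uint64_t before, after;

	cfs_rq->nr_switches++;			/* speculative bump in the pick loop */
	before = cfs_rq->nr_switches;		/* reader samples here */
	cfs_rq->nr_switches--;			/* p == prev, so the bump is undone */
	after = cfs_rq->nr_switches;		/* reader samples again */

	assert(after < before);			/* the stat moved backward */
	return switches_since(before, after);
}
```

With unsigned arithmetic, a reader that expected a delta of zero or more
gets UINT64_MAX instead.  A signed delta would come out as -1, which is
just as surprising to code that assumes the stat only grows.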

Thanks.

-- 
tejun


