[Devel] [PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load > sds->avg_load case
Vladimir Davydov
vdavydov at parallels.com
Mon Sep 16 01:06:08 PDT 2013
On 09/16/2013 09:52 AM, Peter Zijlstra wrote:
> On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote:
>> In busiest->group_imb case we can come to calculate_imbalance() with
>> local->avg_load >= busiest->avg_load >= sds->avg_load. This can result
>> in imbalance overflow, because it is calculated as follows
>>
>>     env->imbalance = min(
>>             max_pull * busiest->group_power,
>>             (sds->avg_load - local->avg_load) * local->group_power
>>         ) / SCHED_POWER_SCALE;
>>
>> As a result we can end up constantly bouncing tasks from one cpu to
>> another if there are pinned tasks.
>>
>> Fix this by skipping the assignment and assuming imbalance=0 in case
>> local->avg_load > sds->avg_load.
>> --
>> The bug can be caught by running 2*N cpuhogs pinned to two logical cpus
>> belonging to different cores on an HT-enabled machine with N logical
>> cpus: just look at se.nr_migrations growth.
>>
>> Signed-off-by: Vladimir Davydov <vdavydov at parallels.com>
>> ---
>> kernel/sched/fair.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 9b3fe1c..507a8a9 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4896,7 +4896,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>>  	 * max load less than avg load(as we skip the groups at or below
>>  	 * its cpu_power, while calculating max_load..)
>>  	 */
>> -	if (busiest->avg_load < sds->avg_load) {
>> +	if (busiest->avg_load <= sds->avg_load ||
>> +	    local->avg_load >= sds->avg_load) {
>>  		env->imbalance = 0;
>>  		return fix_small_imbalance(env, sds);
>>  	}
> Why the = part? Surely 'busiest->avg_load < sds->avg_load ||
> local->avg_load > sds->avg_load' avoids both underflows?
Of course it does, but env->imbalance will be assigned 0 anyway in the =
case, so why not take the shortcut?
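
To illustrate both points, here is a minimal userspace sketch of the
arithmetic, not the kernel code. struct grp, min_ul() and the two
imbalance_*() helpers are hypothetical names, and max_pull is assumed to
be busiest->avg_load - sds->avg_load, which I believe is what it reduces
to in the group_imb case. With unsigned arithmetic, local->avg_load >
sds->avg_load makes the second min() argument wrap around to a huge
value, so min() returns the max_pull term and we get a bogus positive
imbalance. In either = case one of the two min() arguments is already 0.

```c
#define SCHED_POWER_SCALE 1024UL

/* Hypothetical stand-in for the per-group stats; not the kernel's struct. */
struct grp {
	unsigned long avg_load;
	unsigned long group_power;
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * The formula from the changelog with no guard in front of it.
 * ASSUMPTION: max_pull = busiest->avg_load - sds_avg (group_imb case).
 */
static unsigned long imbalance_unguarded(const struct grp *busiest,
					 const struct grp *local,
					 unsigned long sds_avg)
{
	unsigned long max_pull = busiest->avg_load - sds_avg;

	/* (sds_avg - local->avg_load) wraps around if local is above average */
	return min_ul(max_pull * busiest->group_power,
		      (sds_avg - local->avg_load) * local->group_power)
		/ SCHED_POWER_SCALE;
}

/* Same formula behind the condition proposed in the patch. */
static unsigned long imbalance_guarded(const struct grp *busiest,
				       const struct grp *local,
				       unsigned long sds_avg)
{
	if (busiest->avg_load <= sds_avg || local->avg_load >= sds_avg)
		return 0;

	return imbalance_unguarded(busiest, local, sds_avg);
}
```

For example, with sds_avg = 2048, busiest = {2560, 1024} and
local = {3072, 1024}, the unguarded version reports an imbalance of 512
although the local group is already above the domain average, while the
guarded version returns 0. With local->avg_load == sds_avg, or with
busiest->avg_load == sds_avg, both versions return 0.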