[Devel] [PATCH rh7 1/3] ve: Attach a ve cgroup from task_work

Wed Oct 28 02:53:44 PDT 2015

On 27.10.2015 19:13, Vladimir Davydov wrote:
> On Fri, Oct 16, 2015 at 07:24:49PM +0300, Kirill Tkhai wrote:
> ...
>> @@ -733,6 +742,13 @@ static int ve_can_attach(struct cgroup *cg, struct cgroup_taskset *tset)
>>  		}
>>  	}
>>  
>> +	cgroup_taskset_for_each(task, cg, tset) {
>> +		/* Attaching is in process */
>> +		if (task->task_ve_work.func != NULL)
>> +			return -EAGAIN;
> 
> Failing write to cgroup.procs with EAGAIN looks unexpected. For sure,
> nobody will retry.

This is a protection from circular attaches, and real vzctl never bumped in this
case. But kernel should be safe in any userspace behaviour.

EAGAIN is correct return value for writing to cgroup.procs from userspace. For example,
the latest mainstream pids_can_fork() uses it.

>> +	}
>> +	smp_mb(); /* Pairs with smp_mb() in ve_attach_work() */
>> +
>>  	return 0;
>>  }
>>  
>> @@ -753,7 +769,8 @@ static void ve_attach(struct cgroup *cg, struct cgroup_taskset *tset)
>>  		/* Leave parent exec domain */
>>  		task->parent_exec_id--;
>>  
>> -		task->task_ve = ve;
>> +		init_task_work(&task->task_ve_work, ve_attach_work);
> 
> Why do you need to keep the work on task_struct? It's already huge, so
> let's please try not to make it even bigger. I don't see any reason why
> kmalloc-ing a task_work wouldn't suffice in this case. Moreover, AFAICS
> it would liberate us from the necessity to use memory barriers and allow
> us to avoid EAGAIN failures.

Current task_struct size is 3232 bytes, and the patchset adds 16 byte more:

16/3232 = 0.00495(495).

It does not look the set dramatically increases task_struct size and affects
the performance in any way.

At the same time kmalloc() complicates the code, because it forces us
to introduce .cancel_attach method which now is not existed. It's need
to handle the ENOMEM situation (imagine a big multi-thread process).

> Hell, I placed ub's cgroup_attach_work in task_struct either, what a
> mess...

Probably, we may use the only attach work in task_struct for ve and ub
and overload it.

> Besides, I don't see you handle a race with fork (see ub_cgroup_fork).
> Why?

Because the race is a BUG existed long before my patches, and it's not connected
with them. Don't afraid, I hasn't forgotten it.

The series is about delayed attaching. Let's do not mix everything here.

>> +		task_work_add(task, &task->task_ve_work, true);
>>  	}
>>  }
>>  
>>