[CRIU] [PATCH 3/3] sysctl: move sysctl calls to usernsd
Pavel Emelyanov
xemul at parallels.com
Fri Aug 21 11:08:46 PDT 2015
On 08/18/2015 06:44 PM, Tycho Andersen wrote:
> On Tue, Aug 18, 2015 at 06:23:05PM +0300, Pavel Emelyanov wrote:
>>
>>> @@ -172,17 +218,44 @@ int sysctl_op(struct sysctl_req *req, size_t nr_req, int op)
>>>  {
>>>  	int ret = 0;
>>>  	int dir = -1;
>>> +	struct sysctl_userns_req *userns_req;
>>>
>>> -	dir = open("/proc/sys", O_RDONLY);
>>> -	if (dir < 0) {
>>> -		pr_perror("Can't open sysctl dir");
>>> -		return -1;
>>> -	}
>>> +	userns_req = alloca(MAX_MSG_SIZE);
>>
>> I've found no place where this gets free()-ed.
>
> alloca is just on the stack, so it doesn't need to be freed (although
> I could move it to xmalloc() if you like).
Ouch, indeed.
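
The lifetime difference discussed above can be sketched as follows. This is a minimal illustration, not CRIU code: scratch_len_stack() and scratch_len_heap() are hypothetical names, and plain malloc() stands in for the xmalloc() helper mentioned in the thread.

```c
#include <alloca.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Scratch copy on the stack: the space is reclaimed when the function returns. */
static size_t scratch_len_stack(const char *s)
{
	size_t n = strlen(s) + 1;
	char *tmp = alloca(n);	/* no free(); must not be used after this call returns */

	memcpy(tmp, s, n);
	return strlen(tmp);
}

/* Same work on the heap: the block lives until it is explicitly freed. */
static size_t scratch_len_heap(const char *s)
{
	size_t n = strlen(s) + 1;
	char *tmp = malloc(n);
	size_t len;

	assert(tmp != NULL);	/* sketch only: real code would report the failure */
	memcpy(tmp, s, n);
	len = strlen(tmp);
	free(tmp);		/* forgetting this is the leak the review was looking for */
	return len;
}
```

Both helpers return the same length for the same input; only the ownership rules differ. A stack buffer is convenient for a bounded scratch area like this one, while a heap buffer is the safer choice when the size is large or not bounded.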
>>> +	userns_req->name = (char *) (&userns_req[1]);
>>>
>>>  	while (nr_req--) {
>>> -		ret = __sysctl_op(dir, req, op);
>>> +		int arg_len = sysctl_userns_arg_size(req->type);
>>> +		int name_len = strlen(req->name) + 1;
>>> +		int total_len = sizeof(*userns_req) + arg_len + name_len;
>>> +
>>> +		if (total_len > MAX_MSG_SIZE) {
>>> +			pr_err("sysctl msg too big: %s\n", req->name);
>>> +			return -1;
>>> +		}
>>> +
>>> +		strcpy(userns_req->name, req->name);
>>> +
>>> +		userns_req->arg = userns_req->name + name_len + 1;
>>> +		if (op == CTL_WRITE)
>>> +			memcpy(userns_req->arg, req->arg, arg_len);
>>> +
>>> +		userns_req->type = req->type;
>>> +		userns_req->flags = req->flags;
>>> +		userns_req->op = op;
>>> +
>>> +		ret = userns_call(__sysctl_op, UNS_ASYNC, userns_req, total_len, 0);
>>
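
The layout used by the hunk above (a fixed header, then the NUL-terminated name, then the argument bytes, all in one size-capped buffer) can be sketched on its own like this. Everything here is an assumption made for illustration: struct msg_hdr and pack_msg() are hypothetical names, the real struct sysctl_userns_req is not shown in this mail, and the sketch stores offsets instead of pointers so the buffer stays valid after being copied to another process. Note that name_len already counts the trailing NUL, so the sketch places the argument at name + name_len with no further +1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header; offsets are relative to the start of the buffer. */
struct msg_hdr {
	int type;
	int op;
	size_t name_off;
	size_t arg_off;
};

/* Pack header + name + arg into buf. Returns the total length, or -1 if it does not fit. */
static long pack_msg(char *buf, size_t cap, int type, int op,
		     const char *name, const void *arg, size_t arg_len)
{
	size_t name_len;
	struct msg_hdr hdr;

	assert(buf != NULL && name != NULL);
	name_len = strlen(name) + 1;	/* includes the trailing NUL */

	/* Check each piece against the remaining space to avoid size_t overflow. */
	if (cap < sizeof(hdr) || name_len > cap - sizeof(hdr) ||
	    arg_len > cap - sizeof(hdr) - name_len)
		return -1;

	memset(&hdr, 0, sizeof(hdr));
	hdr.type = type;
	hdr.op = op;
	hdr.name_off = sizeof(hdr);
	hdr.arg_off = hdr.name_off + name_len;

	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + hdr.name_off, name, name_len);
	if (arg_len)
		memcpy(buf + hdr.arg_off, arg, arg_len);

	return (long)(hdr.arg_off + arg_len);
}
```

The returned length is exactly header + name + argument, so the last argument byte always falls inside the region that would be sent.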
>> The __sysctl_op will open("/proc/sys" ...) on every request. This is not quite fast;
>> can we do all the sysctl_op-s in one userns_call?
>
> I tried this earlier, and it is a little ugly in terms of sending all
> of the messages at once to usernsd.
>
> Can we instead have usernsd install a service FD on the first such
> request, and then close it when it shuts down?
Well, there are places where we do sysctl_op on a big array (e.g. the ipv4_conf_op
with 27 options), and sending each entry via the unix socket separately would result
in quite a big load on the socket :)
-- Pavel
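
The cost raised in this thread, reopening the base directory for every entry, can be avoided by opening it once per batch and resolving each entry relative to that descriptor with openat(). The sketch below illustrates only that idea under stated assumptions: read_many() and VAL_MAX are hypothetical, it is not the usernsd interface, and how a batch would travel over the socket is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define VAL_MAX 64	/* hypothetical per-value buffer size */

/*
 * Read n entries relative to base, opening base only once.
 * Unreadable entries get an empty string. Returns the number of
 * entries read successfully, or -1 if base cannot be opened.
 */
static int read_many(const char *base, const char *const names[], int n,
		     char out[][VAL_MAX])
{
	int dir, i, done = 0;

	assert(base != NULL && names != NULL && out != NULL);

	dir = open(base, O_RDONLY);	/* one open for the whole batch */
	if (dir < 0)
		return -1;

	for (i = 0; i < n; i++) {
		ssize_t r;
		int fd = openat(dir, names[i], O_RDONLY);

		out[i][0] = '\0';
		if (fd < 0)
			continue;

		r = read(fd, out[i], VAL_MAX - 1);
		close(fd);
		if (r > 0) {
			out[i][r] = '\0';
			done++;
		}
	}

	close(dir);
	return done;
}
```

Called with "/proc/sys" as base and an array of names such as "net/ipv4/ip_forward", this performs one directory open no matter how many entries the array holds.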