[CRIU] [PATCH 18/18] SCM: Dump and restore SCM_RIGHTs
Pavel Emelyanov
xemul at virtuozzo.com
Wed Jul 12 15:10:01 MSK 2017
>>>> +	for (i = 0; i < n_ids; i++) {
>>>> +		struct file_desc *tgt;
>>>> +		struct fdinfo_list_entry *fle;
>>>> +		struct scm_fle *sfle;
>>>> +		FdinfoEntry *e;
>>>> +		int fd;
>>>> +
>>>> +		tgt = find_file_desc_raw(FD_TYPES__UND, file_ids[i]);
>>>> +		if (!tgt) {
>>>> +			pr_err("Can't find fdesc to send\n");
>>>> +			return -1;
>>>> +		}
>>>> +
>>>> +		pr_info("`- Found %d file\n", file_ids[i]);
>>>> +		fd = find_unused_fd(owner, -1);
>>>
>>> If I haven't missed something, I don't see where you try
>>> to find the fle in the task's file descriptors to use it for
>>> queuing. It seems we add a fake fle for every packet in
>>> a queue. If so, that's not good for performance, and first
>>> we should check whether the task already owns the fle instead
>>> of adding a new one.
>>
>> Yes, I did this deliberately. Adding one more descriptor for a file
>> will cost us several more syscalls. Fortunately, queues with scms will
>> not be a frequent case.
>
> Your series restores queues with file descriptors in scm, and one of
> the queuers is used to do that. The task of this queuer obtains additional
> file descriptors. Right?
Right. The task owning the queuer's master fle does so.
> If so, why don't we try to analyze the task's
> descriptors to see whether it already has the one we add?
We can. In that case we'll need to add a flag on the fle saying that
this fle can be closed by the scm-sending code. I thought about it, but
wanted to make the first version simpler.
> I'm asking about that,
> and it seems not to require extra file descriptors or system calls
> (it's possible to just analyze the fd identifiers). I'm not sure we understood
> each other on this question %) Do you mean the same?
You're right. If a task that is to send a descriptor already owns the
relevant fle, we can just reuse that. I wanted to do it incrementally,
as this would require more changes and would complicate maintainers'
review :P
At the same time, I don't expect tons of scm-rights messages and think
that saving a couple of syscalls and one descriptor might not be worth the
effort. But ... anyway.
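To make the incremental step concrete, here's roughly what the reuse lookup
would be (just a sketch: find_used_fle() is a made-up name, and I'm quoting
the rsti()->fds / ps_list field names from memory):

/*
 * Sketch only: check whether the owner task already has an fle
 * referencing the target file, so we could reuse it instead of
 * duplicating a descriptor.  The real change would also need the
 * "may be closed by the scm code" flag discussed above.
 */
static struct fdinfo_list_entry *find_used_fle(struct pstree_item *owner,
					       struct file_desc *tgt)
{
	struct fdinfo_list_entry *fle;

	list_for_each_entry(fle, &rsti(owner)->fds, ps_list)
		if (fle->desc == tgt)
			return fle;

	return NULL;
}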
>>>> +
>>>> +		fle = try_file_master(tgt);
>>>> +		if (fle) {
>>>> +			e = dup_fdinfo(fle->fe, fd, 0);
>>>> +			if (!e) {
>>>> +				pr_err("Can't duplicate fdinfo for scm\n");
>>>> +				return -1;
>>>> +			}
>>>> +		} else {
>>>> +			/*
>>>> +			 * This can happen if the file in question is
>>>> +			 * sent over the socket and closed. In this case
>>>> +			 * we need to ... invent a new one!
>>>> +			 */
>>>> +
>>>> +			e = xmalloc(sizeof(*e));
>>>> +			if (!e)
>>>> +				return -1;
>>>> +
>>>> +			fdinfo_entry__init(e);
>>>> +			e->id = tgt->id;
>>>> +			e->type = tgt->ops->type;
>>>> +			e->fd = fd;
>>>> +			e->flags = 0;
>>>
>>> I'd move the code around fdinfo_entry__init() into a separate function,
>>> as we already have 3 places like it.
>>
>> OK, will do. What are the others?
>
> I mean the places where fdinfo_entry__init() is used:
> $ git grep fdinfo_entry__init
> criu/files.c: fdinfo_entry__init(e);
> criu/files.c: fdinfo_entry__init(e);
> criu/sk-unix.c: fdinfo_entry__init(e);
>
> Something like
>
> e = create_fdinfo_entry(tgt->id, tgt->ops->type, fd, 0, false);
> if (!e)
> return -1;
>
> "false" is to use shmem or not.
Ah, I see. So the options we have now are having N lines of assignments vs
having a routine with M arguments. I would choose the 2nd only if M < N,
but your proposal is 4 assignments vs 5 arguments.
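That said, if we do go with the helper, I'd imagine it roughly like this
(a sketch only; treating the "false" as the shmem choice is my guess, and
shmalloc() vs xmalloc() is how I'd express it):

/* Sketch of the proposed helper, not merged code. */
static FdinfoEntry *create_fdinfo_entry(u32 id, int type, int fd,
					int flags, bool shared)
{
	FdinfoEntry *e;

	/* shared entries go to shmem so that other tasks can see them */
	e = shared ? shmalloc(sizeof(*e)) : xmalloc(sizeof(*e));
	if (!e)
		return NULL;

	fdinfo_entry__init(e);
	e->id = id;
	e->type = type;
	e->fd = fd;
	e->flags = flags;

	return e;
}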
>>>> +		}
>>>> +
>>>> +		pr_info("scm: add %d -> %d.fd[%d]\n", tgt->id, vpid(owner), fd);
>>>> +		sfle = xmalloc(sizeof(*sfle));
>>>> +		if (!sfle)
>>>> +			return -1;
>>>> +
>>>> +		sfle->fle = collect_fd_to(vpid(owner), e, rsti(owner), tgt, false);
>>>> +		if (!sfle->fle) {
>>>> +			pr_err("Can't request new fle for scm\n");
>>>> +			return -1;
>>>> +		}
>>>> +
>>>> +		list_add_tail(&sfle->l, &ui->scm_fles);
>>>> +		fds[i] = fd;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int chk_restored_scms(struct unix_sk_info *ui)
>>>> +{
>>>> +	struct scm_fle *sf, *n;
>>>> +
>>>> +	list_for_each_entry_safe(sf, n, &ui->scm_fles, l) {
>>>> +		if (sf->fle->stage != FLE_RESTORED)
>>>> +			return 1;
>>>
>>> We need to wait for sf->fle->bound here instead.
>>
>> Why?
>
> Then, what are scms? If these are the files contained in the
> queued messages, we just need to wait till they are opened, don't we?
Here we wait for the file descriptor to appear in the task's fdtable, so that
we can reference it. Do you want to say that waiting for FLE_OPEN is
enough? I need to make sure that fle->fd does reference the correct file,
in other words, that the file has already been put into the proper descriptor.
Is that so? If yes, I will fix this place, but what's the difference between
FLE_OPEN and FLE_RESTORED then?
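In case FLE_OPEN turns out to be enough, the fix here is a one-liner,
something like the below (assuming the stages are ordered and FLE_OPEN
really means the fd already sits at its final number, which is exactly
the question above):

	list_for_each_entry_safe(sf, n, &ui->scm_fles, l) {
		/* wait only until the fd exists at the right number */
		if (sf->fle->stage < FLE_OPEN)
			return 1;

		/* Optimization for the next pass */
		list_del(&sf->l);
		xfree(sf);
	}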
>>> I suppose the above introduces some circular dependencies for cases
>>> with several sockets having an epoll which is listening to them.
>>
>> Nope :) On dump I explicitly ban unix sockets and epolls sent via unix
>> sockets. So we can only wait here for some other files, which cannot wait for us.
>>
>>>> +
>>>> +		/* Optimization for the next pass */
>>>> +		list_del(&sf->l);
>>>> +		xfree(sf);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  static int wake_connected_sockets(struct unix_sk_info *ui)
>>>>  {
>>>>  	struct fdinfo_list_entry *fle;
>>>> @@ -1322,12 +1438,18 @@ static int open_unix_sk(struct file_desc *d, int *new_fd)
>>>>  	struct unix_sk_info *ui;
>>>>  	int ret;
>>>>
>>>> +	ui = container_of(d, struct unix_sk_info, d);
>>>> +
>>>> +	/* FIXME -- only queue restore may be postponed */
>>>> +	if (chk_restored_scms(ui)) {
>>>> +		pr_info("scm: Wait for tgt to restore\n");
>>>> +		return 1;
>>>> +	}
>>>
>>> It's too strong a limitation; we delay other tasks here.
>>
>> Other tasks? No, we just ask the files restore loop to go and try to open
>> some other files.
>
> But other tasks may wait for the file descriptors you're serving out.
> These tasks will be delayed.
Mm... Yes, they will. Isn't that the cost of synchronization anyway? The FIXME
comment above is exactly about this -- complicating the code for the sake
of (hopefully) performance.
>>> We may open the socket and serve it out, and we do
>>> not need to wait for queuers for these actions.
>>
>> Not always. There are several cases when we do socketpair(), then restore
>> a queue, then close one of the ends immediately.
>
> Even if the second end is closed, other tasks may want this
> slave fle, and their eventpolls may depend on it.
> I think we should wait only in the moments when we must wait, and not
> do that in other cases.
This is valid, but "must" is to be replaced with "should" :)
Is anything broken with the current approach? Or is it just suboptimal?
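If we decide it is worth it, the FIXME would turn into something like the
sketch below: open and serve the socket out right away, and postpone only
the queue restore behind the scm wait (the ->post_open callback and
restore_unix_queue() are invented here just for illustration, that's not
what the series does):

/*
 * Illustration only: postpone just the queue restore, not the whole
 * open_unix_sk().  restore_unix_queue() stands for the existing
 * queue-restore step.
 */
static int post_open_unix_sk(struct file_desc *d)
{
	struct unix_sk_info *ui = container_of(d, struct unix_sk_info, d);

	if (chk_restored_scms(ui))
		return 1;	/* not ready yet, retry on the next pass */

	return restore_unix_queue(ui);
}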
-- Pavel