[CRIU] Does CRIU handle Futex system calls ?
Dorian Goepp
goepp at i3s.unice.fr
Fri Jul 8 15:12:57 MSK 2022
Hello Alex,
Thanks again for your time and response. It is now all clear to me.
I am amazed at the work put in to ensure CRIU does everything right.
Best regards,
Dorian Goepp
On 2022-07-07 16:54, Alexander Mikhalitsyn wrote:
> Hello, Dorian!
>
>> There is one thing I would like to be sure I have understood properly from your message. Do you mean that the process will redo the futex_wait system calls on restoration?
> yes.
>
> In more detail:
> when CRIU starts dumping the processes, it uses ptrace() to "seize" them.
> This procedure acts like a signal on the processes, so (almost) all
> syscalls that the processes being dumped were executing
> at the moment of the dump get interrupted.
> You can observe that behaviour if you write a "buggy" program that
> incorrectly uses the sleep() call and does not handle EINTR. In this
> case the program will sleep less than it has to.
> For the futex() syscall, however, the handling is different: CRIU will
> restart the syscall for you, because on the kernel side futex returns
> -ERESTARTSYS, which is handled by CRIU in compel_get_task_regs and
> leads to an "automatic" syscall restart.
> In the case of the nanosleep syscall the kernel returns
> -ERESTART_RESTARTBLOCK, which means that the kernel should not perform the
> autorestart for this syscall, so in this case CRIU will "fix up" the
> "ax" register to -EINTR, to be fully transparent and mimic the
> generic kernel behavior for this group of syscalls.
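
The two cases above can be sketched as a toy model. This is only an illustration written in Python for readability, not CRIU's code: the real logic is C code in compel. The `Regs` class, the function name `fixup_interrupted_syscall` and the sample addresses are made up here. The restart codes are the kernel-internal values from include/linux/errno.h, 202 and 35 are the x86-64 syscall numbers of futex and nanosleep, and the 2-byte rewind assumes the x86-64 `syscall` instruction.

```python
from dataclasses import dataclass

# Kernel-internal restart codes (include/linux/errno.h); userspace never sees them.
ERESTARTSYS = 512
ERESTARTNOINTR = 513
ERESTARTNOHAND = 514
ERESTART_RESTARTBLOCK = 516
EINTR = 4

SYSCALL_INSN_LEN = 2  # assumption: x86-64 `syscall` instruction (bytes 0f 05)
SYS_NANOSLEEP = 35    # x86-64 syscall numbers
SYS_FUTEX = 202


@dataclass
class Regs:
    """Hypothetical stand-in for the few registers that matter here."""
    ax: int       # syscall return value observed at dump time
    orig_ax: int  # syscall number, negative if the task was not in a syscall
    ip: int       # instruction pointer


def fixup_interrupted_syscall(regs: Regs) -> Regs:
    """Return the registers that would be saved for a seized task (toy model)."""
    if regs.orig_ax < 0:
        return regs  # not inside a syscall, nothing to fix
    if regs.ax in (-ERESTARTSYS, -ERESTARTNOINTR, -ERESTARTNOHAND):
        # Re-arm the syscall: put its number back in ax and rewind ip so that
        # the `syscall` instruction executes again after restore.
        return Regs(ax=regs.orig_ax, orig_ax=regs.orig_ax,
                    ip=regs.ip - SYSCALL_INSN_LEN)
    if regs.ax == -ERESTART_RESTARTBLOCK:
        # No restart: the task will see the syscall fail with EINTR.
        return Regs(ax=-EINTR, orig_ax=regs.orig_ax, ip=regs.ip)
    return regs


# A thread blocked in futex() and a thread blocked in nanosleep() at dump time.
futex_thread = fixup_interrupted_syscall(
    Regs(ax=-ERESTARTSYS, orig_ax=SYS_FUTEX, ip=0x401002))
sleep_thread = fixup_interrupted_syscall(
    Regs(ax=-ERESTART_RESTARTBLOCK, orig_ax=SYS_NANOSLEEP, ip=0x402002))
```

In this model the futex thread resumes on the `syscall` instruction itself and waits again, while the nanosleep thread resumes after the instruction with -EINTR in "ax".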
>
> The general idea here is to be fully invisible to userspace. If a
> syscall has to return EINTR on a signal, then CRIU will do the same; if a
> syscall has to be restarted after execution of a signal handler, then
> CRIU will restore the process with the syscall restarted.
>
> I've omitted details about the SA_RESTART flag; they are not so important
> for understanding the basic idea here :)
> You can refer to the handle_signal() kernel function to get a better
> understanding of the details.
>
> References:
> https://github.com/torvalds/linux/blob/8cb1ae19bfae92def42c985417cd6e894ddaa047/kernel/futex/waitwake.c#L670
> https://github.com/torvalds/linux/blob/8cb1ae19bfae92def42c985417cd6e894ddaa047/kernel/futex/waitwake.c#L552
> https://github.com/torvalds/linux/blob/d6ecaa0024485effd065124fe774de2e22095f2d/arch/x86/kernel/signal.c#L796
> signal(7) man page
>
> Best regards,
> Alex
>
> On Thu, Jul 7, 2022 at 5:11 PM Dorian Goepp <goepp at i3s.unice.fr> wrote:
>
>> Hello,
>>
>> Thanks a lot Alex for your response.
>>
>> There is one thing I would like to be sure I have understood properly from your message. Do you mean that the process will redo the futex_wait system calls on restoration?
>>
>> Best regards,
>>
>> Dorian Goepp
>>
>> On 2022-07-07 14:34, Alexander Mikhalitsyn wrote:
>>
>> Hello, dear friends,
>>
>> Yep, CRIU definitely handles futexes carefully. It's one of the most
>> important things.
>> We support both robust and regular futexes.
>>
>> As you know, futexes work purely on the basis of shared memory, so for
>> non-robust futexes
>> there is no need for any special handling. We just dump the
>> whole contents of the process memory.
>> So after the restore we just need to ensure that the threads' instruction
>> pointers (IP) are properly set
>> (for instance, we need to perform a manual "syscall restart" for futexes,
>> see compel_get_task_regs()).
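
A small model may help show why restoring memory plus restarting the syscall is enough for a plain futex wait. It relies on the documented FUTEX_WAIT behaviour from futex(2): the call sleeps only if the futex word still holds the expected value, and fails with EAGAIN otherwise. Everything else here (the dict used as "memory", the function names, the address 0x1000) is hypothetical and only for illustration.

```python
import errno


def futex_wait(memory: dict, uaddr: int, expected: int) -> int:
    """Toy model of the check FUTEX_WAIT performs on entry.

    Returns 0 if the caller would go to sleep, -EAGAIN if the futex word
    no longer holds the expected value.
    """
    if memory[uaddr] != expected:
        return -errno.EAGAIN
    return 0  # the real syscall would block here until a FUTEX_WAKE


def dump(memory: dict) -> dict:
    """Stand-in for dumping the whole process memory."""
    return dict(memory)


def restore(image: dict) -> dict:
    """Stand-in for restoring memory from the image."""
    return dict(image)


FUTEX_ADDR = 0x1000  # hypothetical address of the futex word

# Case 1: the word is unchanged at dump time, so the restarted wait sleeps again.
restored = restore(dump({FUTEX_ADDR: 0}))
wait_again = futex_wait(restored, FUTEX_ADDR, expected=0)

# Case 2: the word was changed just before the dump, so the restarted wait
# returns immediately and the thread re-checks its condition in userspace.
restored_changed = restore(dump({FUTEX_ADDR: 1}))
wake_up = futex_wait(restored_changed, FUTEX_ADDR, expected=0)
```

Either way the restarted call behaves as the original one would have, since the futex word is part of the dumped memory.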
>>
>> For robust futexes we have special handling here:
>> https://github.com/checkpoint-restore/criu/blob/c8f9880adab038481f7806173b698fc6e17ba76a/criu/cr-dump.c#L565
>> https://github.com/checkpoint-restore/criu/blob/c8f9880adab038481f7806173b698fc6e17ba76a/criu/pie/restorer.c#L532
>>
>> Regards,
>> Alex
>>
>> On Thu, Jul 7, 2022 at 3:13 PM Adrian Reber <adrian at lisas.de> wrote:
>>
>> Please try to submit your question as a github issue. Much higher
>> chances of getting an answer there.
>>
>> Adrian
>>
>> On Thu, Jul 07, 2022 at 01:39:17PM +0200, Dorian Goepp wrote:
>>
>> Hi,
>>
>> Is this the right place to ask questions about CRIU's features and
>> internals?
>>
>> If so, I have been considering CRIU for dumping and restoring a process
>> with a set of threads synchronised through futexes [1]. It seems to
>> work, but I cannot tell from the documentation (wiki) whether it is
>> officially supported, or just accidentally works for the cases I tested
>> it with.
>>
>> I could not find which parts would take care of the state of the
>> futex_wait system call in CRIU's source code, except maybe for
>> `get_task_regs()` in criu/compel/arch/x86/src/lib/infect.c (I run it on
>> an amd64 processor). This function issues the warning "Will restore %d
>> with interrupted system call" when I dump the futex-heavy process. Is it
>> enough to save the process's register state to resume the futex system
>> call correctly?
>>
>> Best regards,
>>
>> Dorian Goepp
>>
>> Links:
>> ------
>> [1] https://man7.org/linux/man-pages/man2/futex.2.html
>>
>> _______________________________________________
>> CRIU mailing list
>> CRIU at openvz.org
>> https://lists.openvz.org/mailman/listinfo/criu