[CRIU] Does CRIU handle Futex system calls ?

Alexander Mikhalitsyn alexander at mihalicyn.com
Thu Jul 7 17:54:05 MSK 2022


Hello, Dorian!

>There is one thing I would just like to be sure to have understood properly from your message. Do you mean that the process will redo the futex_wait system calls on restoration ?
Yes.

In more detail:
When CRIU dumps processes, it uses ptrace() to "seize" them.
Seizing affects the processes much like a signal does, so almost all
syscalls that the processes being dumped were executing at that
moment get interrupted.
You can observe this behaviour by writing a "buggy" program that
uses the sleep() call incorrectly and does not handle EINTR. In that
case the program will sleep for less time than it should.
For the futex() syscall the handling is different: CRIU will restart
the syscall for you, because on the kernel side futex returns
-ERESTARTSYS, which CRIU handles in compel_get_task_regs and which
leads to an "automatic" syscall restart.
In the case of the nanosleep syscall the kernel returns
-ERESTART_RESTARTBLOCK, which means that the syscall should not be
restarted automatically, so in this case CRIU "fixes up" the
"ax" register to -EINTR, to stay fully transparent and mimic the
generic kernel behaviour for this group of syscalls.
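
To illustrate the nanosleep case from the program's side, here is a
minimal sketch of a sleep helper that tolerates EINTR (whether it comes
from an ordinary signal or from a dump/restore in the middle of the
sleep). The name sleep_fully() is made up for this example and is not
part of CRIU or libc:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the whole requested interval.
 * When nanosleep() fails with EINTR it stores the time still left
 * in `rem`, so we simply go back to sleep for that remainder.
 * Returns 0 on success, -1 on any error other than EINTR.
 */
int sleep_fully(time_t sec, long nsec)
{
    struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
    struct timespec rem;

    assert(nsec >= 0 && nsec < 1000000000L);

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;  /* a real error, give up */
        req = rem;      /* interrupted: continue with what is left */
    }
    return 0;
}
```

A "buggy" program in the sense above would call nanosleep() (or sleep())
once and ignore the early return; with a loop like this one the
interruption goes unnoticed.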

The general idea here is to be fully invisible to userspace. If a
syscall has to return EINTR on a signal, then CRIU will do the same; if
a syscall has to be restarted after a signal handler has run, then
CRIU will restore the process with the syscall restarted.
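
The two cases described above can be sketched roughly as follows. This
is a simplified illustration, not the actual CRIU code: the struct and
function names are invented, only the two return codes mentioned in this
mail are handled, and I assume x86-64, where the `syscall` instruction
is 2 bytes long. The numeric values of the restart codes are the
kernel-internal ones; userspace never sees them as errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Kernel-internal restart codes (not exported to userspace headers). */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif
#ifndef ERESTART_RESTARTBLOCK
#define ERESTART_RESTARTBLOCK 516
#endif

#define SYSCALL_INSN_LEN 2 /* assumption: x86-64 `syscall` opcode size */

/* Hypothetical, stripped-down register set using x86-64 names. */
struct sketch_regs {
    long ax;          /* syscall return value */
    long orig_ax;     /* syscall number, negative if not in a syscall */
    unsigned long ip; /* instruction pointer */
};

/* Adjust the saved registers of a task that was seized inside a syscall. */
void sketch_fixup_syscall(struct sketch_regs *r)
{
    assert(r != NULL);

    if (r->orig_ax < 0)
        return; /* task was not inside a syscall */

    switch (r->ax) {
    case -ERESTARTSYS:
        /* Restart: reload the syscall number and step back onto
         * the syscall instruction so that it is executed again. */
        r->ax = r->orig_ax;
        r->ip -= SYSCALL_INSN_LEN;
        break;
    case -ERESTART_RESTARTBLOCK:
        /* No automatic restart: userspace sees a plain EINTR. */
        r->ax = -EINTR;
        break;
    }
}
```

With this, a thread caught in futex() has its ip moved back by two
bytes and re-enters the syscall after restore, while a thread caught in
nanosleep() simply resumes after the syscall with -EINTR in ax.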

I've omitted the details about the SA_RESTART flag; they are not
essential for understanding the basic idea here :)
You can refer to the handle_signal() kernel function to get a better
understanding of the details.

References:
https://github.com/torvalds/linux/blob/8cb1ae19bfae92def42c985417cd6e894ddaa047/kernel/futex/waitwake.c#L670
https://github.com/torvalds/linux/blob/8cb1ae19bfae92def42c985417cd6e894ddaa047/kernel/futex/waitwake.c#L552
https://github.com/torvalds/linux/blob/d6ecaa0024485effd065124fe774de2e22095f2d/arch/x86/kernel/signal.c#L796
signal(7) man page

Best regards,
Alex

On Thu, Jul 7, 2022 at 5:11 PM Dorian Goepp <goepp at i3s.unice.fr> wrote:
>
> Hello,
>
>
> Thanks a lot Alex for your response.
>
> There is one thing I would just like to be sure to have understood properly from your message. Do you mean that the process will redo the futex_wait system calls on restoration ?
>
> Best regards,
>
> Dorian Goepp
>
> Le 2022-07-07 14:34, Alexander Mikhalitsyn a écrit :
>
> Hello, dear friends,
>
> Yep, CRIU definitely handles futexes carefully. It's one of the most
> important things.
> We support both robust and regular futexes.
>
> As you know, futexes work only on a shared-memory basis, so for a
> non-robust futex
> there is no need for any special handling. We just dump the
> whole process memory contents.
> So after the restore we just need to ensure that the threads' instruction
> pointers (IP) are properly set
> (for instance, we need to perform a manual "syscall restart" for futexes;
> see compel_get_task_regs()).
>
> For robust futexes we have special handling here:
> https://github.com/checkpoint-restore/criu/blob/c8f9880adab038481f7806173b698fc6e17ba76a/criu/cr-dump.c#L565
> https://github.com/checkpoint-restore/criu/blob/c8f9880adab038481f7806173b698fc6e17ba76a/criu/pie/restorer.c#L532
>
> Regards,
> Alex
>
> On Thu, Jul 7, 2022 at 3:13 PM Adrian Reber <adrian at lisas.de> wrote:
>
>
> Please try to submit your question as a github issue. Much higher
> chances of getting an answer there.
>
>                 Adrian
>
> On Thu, Jul 07, 2022 at 01:39:17PM +0200, Dorian Goepp wrote:
>
> Hi,
>
> Is this the right place to ask questions about CRIU's features and
> internals?
>
> If so, I have been considering CRIU for dumping and restoring a process
> with a set of threads synchronised through futexes [1]. It seems to
> work, but, I cannot tell from the documentation (wiki) whether it is
> officially supported, or just accidentally works for the cases I tested
> it with.
>
> I could not find which parts would take care of the state of the
> futex_wait system call in CRIU's source code, except maybe for
> `get_task_regs()` in criu/compel/arch/x86/src/lib/infect.c (I run it on
> an amd64 processor). This function issues the warning "Will restore %d
> with interrupted system call" when I dump the futex-heavy process. Is it
> enough to save the process's register state to resume the futex system
> call correctly ?
>
> Best regards,
>
> Dorian Goepp
>
> Links:
> ------
> [1] https://man7.org/linux/man-pages/man2/futex.2.html
>
>
> _______________________________________________
> CRIU mailing list
> CRIU at openvz.org
> https://lists.openvz.org/mailman/listinfo/criu
