[CRIU] is CR_STATE_RESTORE_CREDS necessary?

Tycho Andersen tycho.andersen at canonical.com
Fri Dec 11 07:09:57 PST 2015


On Fri, Dec 11, 2015 at 05:48:42PM +0300, Andrew Vagin wrote:
> On Fri, Dec 11, 2015 at 12:11:10AM +0300, Pavel Emelyanov wrote:
> > On 12/10/2015 11:58 PM, Andrey Wagin wrote:
> > > 2015-12-10 23:33 GMT+03:00 Pavel Emelyanov <xemul at parallels.com>:
> > >> On 12/10/2015 11:13 PM, Andrey Wagin wrote:
> > >>> 2015-12-09 20:05 GMT+03:00 Tycho Andersen <tycho.andersen at canonical.com>:
> > >>>> Hi all,
> > >>>>
> > >>>> I'm trying to figure out a way to get rid of this FIXME:
> > >>>> https://github.com/xemul/criu/blob/master/test/zdtm/live/static/seccomp_filter.c#L103
> > >>>>
> > >>>> to do this, we need to call restore_seccomp() before restore_creds(),
> > >>>> but if we're doing that, we need to suspend seccomp so that it doesn't
> > >>>> kill the task if the policy has blocked some syscall that
> > >>>> restore_creds does.
> > >>>>
> > >>>> Right now, seccomp is suspended in attach_to_tasks(), because that's
> > >>>> the last ptrace attach that criu does to the tasks. However, that
> > >>>> happens after CR_STATE_RESTORE_CREDS, i.e. after the creds are
> > >>>> restored.
> > >>>
> > >>> CR_STATE_RESTORE_CREDS is the last synchronization point. We stop on
> > >>> __NR_sigreturn with help of ptrace, but it doesn't work if we trace
> > >>> "criu restore". I use strace to investigate bugs very often.
> > >>>
> > >>> strace -fo strace.log -s 256 ./criu restore ...
> > >>>
> > >>> I would like to save this ability if it's possible.
> > >>
> > >> How do stages help you with strace? I thought that the only reason
> > >> it works was -- if criu's restore strace fails it just exits and lets
> > >> restorer blobs self-unload (not completely, but still) from task.
> > > 
> > > We can't use ptrace to synchronize tasks.
> > 
> > Yes, that's clear -- we need global final sync point before total
> > unfreeze and resume.

Why doesn't CR_STATE_COMPLETE work for this? Doesn't the restorer wait
once it finishes CR_STATE_RESTORE_CREDS until it gets the signal for
CR_STATE_COMPLETE?

> > >>
> > >>>>
> > >>>> I could just move the attach_to_tasks call above
> > >>>> CR_STATE_RESTORE_CREDS, but I think that would deadlock since the main criu
> > >>>> task waits for the tasks to be complete, but they're all in the
> > >>>> stopped state (because we'd have ptraced them), and the loop that lets
> > >>>> them run is in parasite_stop_on_syscall much further below.
> > >>>
> > >>> We use breakpoints for x86_64. Maybe we can use them for other arch-s.
> > >>> Or we can attach tasks, suspend seccomp for them and resume them back
> > >>> and then interrupt them again after CR_STATE_RESTORE_CREDS.
> > >>>
> > >>> And there is another reason to have CR_STATE_RESTORE_CREDS. It allows
> > >>> us to reduce the number of syscalls which we need to trace with
> > >>> PTRACE_SYSCALL, because it is slow.
> > >>
> > >> OK, but why do we need the separate CR_STATE_RESTORE_SIGCHILD?
> > > 
> > > We can't fail on CR_STATE_RESTORE_CREDS, because the network is already
> > > unlocked, so we should do as little as possible at this stage. That is
> > > the reason why we don't want to merge CR_STATE_RESTORE_SIGCHILD and
> > > CR_STATE_RESTORE_CREDS.
> > 
> > So the RESTORE_SIGCHILD is the last sync point till which something can
> > fail. Is that all?
> 
> CRIU sets its own sigchld handler to detect if a child process exits
> unexpectedly, and we can't do this after restoring the original sigchld
> handler.
> 
> ---------------------------------
> It's possible to return an error and
> criu will detect if someone segfaulted
> 
> --- CR_STATE_RESTORE_SIGCHILD ---
> 
> It's possible to return an error
> 
> --- CR_STATE_RESTORE_CREDS ------
> 
> NO WAY BACK

But this is really because of the network, not because of the creds.
What if we s/CR_STATE_RESTORE_CREDS/CR_STATE_NETWORK_UNLOCK/, and in
the restorer only do rst_tcp_socks_all() here, and move all the creds
restore after CR_STATE_NETWORK_UNLOCK, when the main task sends
CR_STATE_COMPLETE?
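
As a rough illustration of that reordering, here is a toy model in C. It
is only a sketch of my reading of the proposal, not CRIU code: every
sketch_* / SKETCH_* name is made up for this example, and the mapping of
work to stages is an assumption drawn from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stage list for illustration; this is not CRIU's real enum. */
enum sketch_stage {
    SKETCH_RESTORE_SIGCHLD, /* errors can still be returned at this stage */
    SKETCH_NETWORK_UNLOCK,  /* proposed rename of CR_STATE_RESTORE_CREDS */
    SKETCH_COMPLETE,        /* final stage signalled by the main task */
    SKETCH_STAGE_COUNT
};

/* Hypothetical labels for the work the restorer does in each stage. */
enum sketch_action {
    SKETCH_ACT_RESTORE_SIGCHLD, /* put the original sigchld handler back */
    SKETCH_ACT_RST_TCP_SOCKS,   /* stands in for rst_tcp_socks_all() */
    SKETCH_ACT_RESTORE_CREDS    /* stands in for the creds restore */
};

/* Assumed mapping: creds restore moves out of the unlock stage and into
 * the stage that the main task starts with CR_STATE_COMPLETE. */
enum sketch_action sketch_restorer_action(enum sketch_stage stage)
{
    assert(stage < SKETCH_STAGE_COUNT);
    switch (stage) {
    case SKETCH_RESTORE_SIGCHLD:
        return SKETCH_ACT_RESTORE_SIGCHLD;
    case SKETCH_NETWORK_UNLOCK:
        return SKETCH_ACT_RST_TCP_SOCKS;
    default:
        return SKETCH_ACT_RESTORE_CREDS;
    }
}

/* Per the thread, there is no way back once the network is unlocked,
 * so only stages before the unlock may still report an error. */
bool sketch_can_still_fail(enum sketch_stage stage)
{
    return stage < SKETCH_NETWORK_UNLOCK;
}
```

The point of the model is that the "no way back" boundary follows the
network unlock, while the creds restore ends up in the last stage.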

The one problem this doesn't solve: someone mentioned in the thread
that there was an issue with resuming some tasks before other tasks
had their creds restored. I didn't understand this point. Isn't this
prevented by the wait on __NR_rt_sigreturn (except in the case where
criu is itself ptraced, but I'm ignoring that one because I don't
think it's a use case other than for debugging)?

Tycho

