[CRIU] is CR_STATE_RESTORE_CREDS necessary?

Andrew Vagin avagin at virtuozzo.com
Sun Dec 13 23:14:49 PST 2015


On Fri, Dec 11, 2015 at 08:09:57AM -0700, Tycho Andersen wrote:
> On Fri, Dec 11, 2015 at 05:48:42PM +0300, Andrew Vagin wrote:
> > On Fri, Dec 11, 2015 at 12:11:10AM +0300, Pavel Emelyanov wrote:
> > > On 12/10/2015 11:58 PM, Andrey Wagin wrote:
> > > > 2015-12-10 23:33 GMT+03:00 Pavel Emelyanov <xemul at parallels.com>:
> > > >> On 12/10/2015 11:13 PM, Andrey Wagin wrote:
> > > >>> 2015-12-09 20:05 GMT+03:00 Tycho Andersen <tycho.andersen at canonical.com>:
> > > >>>> Hi all,
> > > >>>>
> > > >>>> I'm trying to figure out a way to get rid of this FIXME:
> > > >>>> https://github.com/xemul/criu/blob/master/test/zdtm/live/static/seccomp_filter.c#L103
> > > >>>>
> > > >>>> to do this, we need to call restore_seccomp() before restore_creds(),
> > > >>>> but if we're doing that, we need to suspend seccomp so that it doesn't
> > > >>>> kill the task if the policy has blocked some syscall that
> > > >>>> restore_creds does.
> > > >>>>
> > > >>>> Right now, seccomp is suspended in attach_to_tasks(), because that's
> > > >>>> the last ptrace attach that criu does to the tasks. However, that
> > > >>>> happens after CR_STATE_RESTORE_CREDS, i.e. after the creds are
> > > >>>> restored.
> > > >>>
> > > >>> CR_STATE_RESTORE_CREDS is the last synchronization point. We stop on
> > > >>> __NR_sigreturn with help of ptrace, but it doesn't work if we trace
> > > >>> "criu restore". I use strace to investigate bugs very often.
> > > >>>
> > > >>> strace -fo strace.log -s 256 ./criu restore ...
> > > >>>
> > > >>> I would like to save this ability if it's possible.
> > > >>
> > > >> How do stages help you with strace? I thought that the only reason
> > > >> it works was -- if criu's restore strace fails it just exits and lets
> > > >> restorer blobs self-unload (not completely, but still) from task.
> > > > 
> > > > We can't use ptrace to synchronize tasks.
> > > 
> > > Yes, that's clear -- we need global final sync point before total
> > > unfreeze and resume.
> 
> Why doesn't CR_STATE_COMPLETE work for this? Doesn't the restorer wait
> once it finishes CR_STATE_RESTORE_CREDS until it gets the signal for
> CR_STATE_COMPLETE?

Because CRIU doesn't wait for the CR_STATE_COMPLETE stage to be
completed.
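
A minimal single-threaded model of that difference, as a sketch only.
The names below (sync_point, start_stage, task_finish_stage,
coordinator_waits, all_tasks_synchronized and the SKETCH_* stages) are
hypothetical and are not CRIU's actual code. The assumption is that
every stage has a counter of tasks still working on it, and that the
coordinator looks at that counter for every stage except the last one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stage names, loosely following the ones in this thread. */
enum sketch_stage {
	SKETCH_RESTORE_SIGCHILD,
	SKETCH_RESTORE_CREDS,
	SKETCH_COMPLETE,
};

/* Hypothetical shared state between the coordinator and the tasks. */
struct sync_point {
	enum sketch_stage stage;
	int nr_in_progress;	/* tasks that have not finished the stage yet */
};

/* Coordinator: announce a new stage that nr_tasks tasks have to pass. */
static void start_stage(struct sync_point *sp, enum sketch_stage stage,
			int nr_tasks)
{
	sp->stage = stage;
	sp->nr_in_progress = nr_tasks;
}

/* Task: report that its work for the current stage is done. */
static void task_finish_stage(struct sync_point *sp)
{
	assert(sp->nr_in_progress > 0);
	sp->nr_in_progress--;
}

/* Assumption of this sketch: nobody waits for the last stage. */
static bool coordinator_waits(enum sketch_stage stage)
{
	return stage != SKETCH_COMPLETE;
}

/*
 * A stage is a usable sync point only if the coordinator waits for it
 * and all tasks have reported in. For the last stage the counter still
 * drops to zero, but nobody observes it.
 */
static bool all_tasks_synchronized(const struct sync_point *sp)
{
	return coordinator_waits(sp->stage) && sp->nr_in_progress == 0;
}
```

In this model, a stage that the coordinator waits for reports
"synchronized" once every task has called task_finish_stage(), while the
last stage never does, so it can't serve as the final sync point.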

> 
> > > >>
> > > >>>>
> > > >>>> I could just move the attach_to_tasks call above
> > > >>>> CR_STATE_RESTORE_CREDS, but I think that would deadlock since the main criu
> > > > >>>> task waits for the tasks to be complete, but they're all in the
> > > >>>> stopped state (because we'd have ptraced them), and the loop that lets
> > > >>>> them run is in parasite_stop_on_syscall much further below.
> > > >>>
> > > >>> We use breakpoints for x86_64. Maybe we can use them for other arch-s.
> > > >>> Or we can attach tasks, suspend seccomp for them and resume them back
> > > >>> and then interrupt them again after CR_STATE_RESTORE_CREDS.
> > > >>>
> > > >>> And there is another reason to have CR_STATE_RESTORE_CREDS. It allows
> > > >>> us to reduce the number of syscalls which we need to trace with
> > > >>> PTRACE_SYSCALL, because it works slow.
> > > >>
> > > >> OK, but why do we need the separate CR_STATE_RESTORE_SIGCHILD?
> > > > 
> > > > We can't fail on CR_STATE_RESTORE_CREDS, because network is already
> > > > > unlocked, so we should do minimum actions on this stage. It's a
> > > > reason why we don't want to merge CR_STATE_RESTORE_SIGCHILD and
> > > > CR_STATE_RESTORE_CREDS.
> > > 
> > > So the RESTORE_SIGCHILD is the last sync point till which something can
> > > fail. Is that all?
> > 
> > CRIU sets its own sigchld handler to detect if a child process exits
> > unexpectedly and we can't do this after restoring the original sigchld
> > handler.
> > 
> > ---------------------------------
> > It's possible to return an error and
> > criu will detect if someone segfaulted
> > 
> > --- CR_STATE_RESTORE_SIGCHILD ---
> > 
> > It's possible to return an error
> > 
> > --- CR_STATE_RESTORE_CREDS ------
> > 
> > NO WAY BACK
> 
> But this is really because of the network, not because of the creds.
> What if we s/CR_STATE_RESTORE_CREDS/CR_STATE_NETWORK_UNLOCK, and in
> the restorer only do rst_tcp_socks_all() here, and move all the creds
> restore after CR_STATE_NETWORK_UNLOCK, when the main task sends
> CR_STATE_COMPLETE.

I don't understand this. Do you want to restore creds after
CR_STATE_COMPLETE?

> 
> The one problem this doesn't solve is someone mentioned in the thread
> that there was an issue about resuming some tasks before other tasks
> had their creds restored. I didn't understand this point, isn't this
> prevented by the wait on __NR_rt_sigreturn (except in the case where
> criu is itself ptraced, but I'm ignoring that one because I don't
> think it's a use case other than for debugging).

Please don't ignore this use case; it is critical enough to take
into account.

I don't understand your problem. Could you send the draft seccomp code
for which you need these changes?

> 
> Tycho

