[CRIU] is CR_STATE_RESTORE_CREDS necessary?

Andrew Vagin avagin at virtuozzo.com
Fri Dec 11 06:48:42 PST 2015


On Fri, Dec 11, 2015 at 12:11:10AM +0300, Pavel Emelyanov wrote:
> On 12/10/2015 11:58 PM, Andrey Wagin wrote:
> > 2015-12-10 23:33 GMT+03:00 Pavel Emelyanov <xemul at parallels.com>:
> >> On 12/10/2015 11:13 PM, Andrey Wagin wrote:
> >>> 2015-12-09 20:05 GMT+03:00 Tycho Andersen <tycho.andersen at canonical.com>:
> >>>> Hi all,
> >>>>
> >>>> I'm trying to figure out a way to get rid of this FIXME:
> >>>> https://github.com/xemul/criu/blob/master/test/zdtm/live/static/seccomp_filter.c#L103
> >>>>
> >>>> to do this, we need to call restore_seccomp() before restore_creds(),
> >>>> but if we're doing that, we need to suspend seccomp so that it doesn't
> >>>> kill the task if the policy has blocked some syscall that
> >>>> restore_creds does.
> >>>>
> >>>> Right now, seccomp is suspended in attach_to_tasks(), because that's
> >>>> the last ptrace attach that criu does to the tasks. However, that
> >>>> happens after CR_STATE_RESTORE_CREDS, i.e. after the creds are
> >>>> restored.
> >>>
> >>> CR_STATE_RESTORE_CREDS is the last synchronization point. We stop on
> >>> __NR_sigreturn with the help of ptrace, but that doesn't work if
> >>> "criu restore" is itself being traced. I use strace to investigate
> >>> bugs very often.
> >>>
> >>> strace -fo strace.log -s 256 ./criu restore ...
> >>>
> >>> I would like to keep this ability if possible.
> >>
> >> How do stages help you with strace? I thought that the only reason
> >> it works was -- if criu's restore strace fails it just exits and lets
> >> restorer blobs self-unload (not completely, but still) from task.
> > 
> > We can't use ptrace to synchronize tasks.
> 
> Yes, that's clear -- we need global final sync point before total
> unfreeze and resume.
> 
> >>
> >>>>
> >>>> I could just move the attach_to_tasks call above
> >>>> CR_STATE_RESTORE_CREDS, but I think that would deadlock since the main criu
> >>>> task waits for the tasks to complete, but they're all in the
> >>>> stopped state (because we'd have ptraced them), and the loop that lets
> >>>> them run is in parasite_stop_on_syscall much further below.
> >>>
> >>> We use breakpoints for x86_64. Maybe we can use them for other architectures.
> >>> Or we can attach tasks, suspend seccomp for them and resume them back
> >>> and then interrupt them again after CR_STATE_RESTORE_CREDS.
> >>>
> >>> And there is another reason to have CR_STATE_RESTORE_CREDS. It allows
> >>> us to reduce the number of syscalls which we need to trace with
> >>> PTRACE_SYSCALL, because it is slow.
> >>
> >> OK, but why do we need the separate CR_STATE_RESTORE_SIGCHILD?
> > 
> > > We can't fail on CR_STATE_RESTORE_CREDS, because the network is already
> > > unlocked, so we should do the minimum of actions at this stage. That is
> > > the reason why we don't want to merge CR_STATE_RESTORE_SIGCHILD and
> > > CR_STATE_RESTORE_CREDS.
> 
> So the RESTORE_SIGCHILD is the last sync point till which something can
> fail. Is that all?

CRIU sets its own SIGCHLD handler to detect when a child process exits
unexpectedly, and we can't do this after restoring the original SIGCHLD
handler.
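
To illustrate the mechanism, here is a minimal sketch, not CRIU code. It
assumes a plain POSIX environment; the helper watch_child_exit() and the
flag child_exited are hypothetical names made up for this example. It
installs its own SIGCHLD handler, lets a child exit, and then puts the
original handler back, after which the exit would no longer be noticed
this way.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t child_exited; /* set only by the handler */

static void on_sigchld(int sig)
{
	(void)sig;
	child_exited = 1;
}

/*
 * Hypothetical helper (not a CRIU function): fork a child that exits with
 * exit_code while our own SIGCHLD handler is installed, then put the
 * original handler back.
 * Returns 1 if our handler saw the exit, -1 on error.
 */
static int watch_child_exit(int exit_code)
{
	struct sigaction act, orig;
	sigset_t block, prev, wait_mask;
	pid_t pid;
	int ret = -1;

	child_exited = 0;
	memset(&act, 0, sizeof(act));
	act.sa_handler = on_sigchld;
	sigemptyset(&act.sa_mask);
	if (sigaction(SIGCHLD, &act, &orig) < 0)
		return -1;

	/* Keep SIGCHLD blocked until sigsuspend() so it cannot be missed. */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &block, &prev) < 0)
		goto out_handler;
	wait_mask = prev;
	sigdelset(&wait_mask, SIGCHLD);

	pid = fork();
	if (pid == 0)
		_exit(exit_code); /* the "unexpected" exit */
	if (pid > 0) {
		while (!child_exited)
			sigsuspend(&wait_mask); /* atomically unblock and wait */
		waitpid(pid, NULL, 0); /* reap the child */
		ret = child_exited;
	}

	sigprocmask(SIG_SETMASK, &prev, NULL);
out_handler:
	/* From here on, child exits go to the original handler only. */
	sigaction(SIGCHLD, &orig, NULL);
	return ret;
}
```

Calling watch_child_exit(1) returns 1 when the handler caught the exit.
Once the original handler is back in place, the same exit would be
delivered to the application's handler instead.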

---------------------------------
It's possible to return an error, and
criu will detect if someone segfaulted

--- CR_STATE_RESTORE_SIGCHILD ---

It's possible to return an error

--- CR_STATE_RESTORE_CREDS ------

NO WAY BACK
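
The same ordering, written as a small C model. This is only a sketch of
the diagram above, not CRIU's actual stage enum. The names restore_phase,
PHASE_*, may_return_error() and crash_is_detected() are hypothetical and
exist only for this example.

```c
#include <stdbool.h>

/* Hypothetical phases delimited by the two sync points in the diagram. */
enum restore_phase {
	PHASE_BEFORE_SIGCHILD, /* up to CR_STATE_RESTORE_SIGCHILD */
	PHASE_BEFORE_CREDS,    /* between RESTORE_SIGCHILD and RESTORE_CREDS */
	PHASE_AFTER_CREDS,     /* past CR_STATE_RESTORE_CREDS: no way back */
};

/* A task may still report failure until CR_STATE_RESTORE_CREDS. */
static bool may_return_error(enum restore_phase p)
{
	return p < PHASE_AFTER_CREDS;
}

/* criu's own SIGCHLD handler is in place only in the first phase. */
static bool crash_is_detected(enum restore_phase p)
{
	return p == PHASE_BEFORE_SIGCHILD;
}
```

For example, may_return_error(PHASE_BEFORE_CREDS) is true while
crash_is_detected(PHASE_BEFORE_CREDS) is false, which matches the middle
band of the diagram.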

> 
> -- Pavel

