[CRIU] [PATCH 2/8] x86/kerndat: Add a check for ptrace() bug on Skylake
Andrei Vagin
avagin at virtuozzo.com
Tue Feb 6 21:45:17 MSK 2018
On Tue, Feb 06, 2018 at 02:01:41AM +0000, Dmitry Safonov wrote:
> 2018-02-06 1:57 GMT+00:00 Andrei Vagin <avagin at virtuozzo.com>:
> > On Tue, Feb 06, 2018 at 01:54:21AM +0000, Dmitry Safonov wrote:
> >> 2018-02-06 1:40 GMT+00:00 Andrei Vagin <avagin at virtuozzo.com>:
> >> > On Wed, Jan 31, 2018 at 04:58:09PM +0000, Dmitry Safonov wrote:
> >> >> We need to know if ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, iov)
> >> >> returns xsave without FP state part.
> >> >>
> >> >> Signed-off-by: Dmitry Safonov <dima at arista.com>
> >> >> ---
> >> >> criu/arch/x86/crtools.c | 71 ++++++++++++++++++++++++++++++++++++
> >> >> criu/arch/x86/include/asm/restorer.h | 2 +
> >> >> criu/include/kerndat.h | 1 +
> >> >> criu/kerndat.c | 18 +++++++++
> >> >> 4 files changed, 92 insertions(+)
> >> >>
> >> >> diff --git a/criu/arch/x86/crtools.c b/criu/arch/x86/crtools.c
> >> >> index d7b21e7c9bae..a389e563a886 100644
> >> >> --- a/criu/arch/x86/crtools.c
> >> >> +++ b/criu/arch/x86/crtools.c
> >> >> @@ -7,6 +7,7 @@
> >> >> #include <sys/syscall.h>
> >> >> #include <sys/auxv.h>
> >> >> #include <sys/wait.h>
> >> >> +#include <sys/ptrace.h>
> >> >>
> >> >> #include "types.h"
> >> >> #include "log.h"
> >> >> @@ -30,6 +31,7 @@
> >> >> #include "images/core.pb-c.h"
> >> >> #include "images/creds.pb-c.h"
> >> >>
> >> >> +/* XXX: Move all kerndat features to per-arch kerndat .c */
> >> >> int kdat_can_map_vdso(void)
> >> >> {
> >> >> pid_t child;
> >> >> @@ -188,6 +190,75 @@ int kdat_compatible_cr(void)
> >> >> }
> >> >> #endif
> >> >>
> >> >> +/*
> >> >> + * Pre v4.14 kernels have a bug on Skylake CPUs:
> >> >> + * copyout_from_xsaves() creates fpu state for
> >> >> + * ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov)
> >> >> + * without MXCSR and MXCSR_FLAGS if there is SSE/YMM state, but no FP state.
> >> >> + * That is xfeatures had either/both XFEATURE_MASK_{SSE,YMM} set, but not
> >> >> + * XFEATURE_MASK_FP.
> >> >> + * But we *really* need to C/R MXCSR & MXCSR_FLAGS if SSE/YMM active,
> >> >> + * as mxcsr store part of the state.
> >> >> + */
> >> >> +int kdat_x86_has_ptrace_fpu_xsave_bug(void)
> >> >> +{
> >> >> + user_fpregs_struct_t xsave = { };
> >> >> + struct iovec iov;
> >> >> + int ret = -1;
> >> >> + pid_t child;
> >> >> + int stat;
> >> >> +
> >> >> + /* OSXSAVE can't be changed during boot. */
> >> >> + if (!compel_cpu_has_feature(X86_FEATURE_OSXSAVE))
> >> >> + return 0;
> >> >> +
> >> >> + child = fork();
> >> >> + if (child < 0) {
> >> >> + pr_perror("%s(): failed to fork()", __func__);
> >> >> + return -1;
> >> >> + }
> >> >> +
> >> >> + if (child == 0) {
> >> >> + ptrace(PTRACE_TRACEME, 0, 0, 0);
> >> >
> >> > It can fail in a case when criu is running under strace. In this case I
> >> > would like to get a warning and conside that a kernel doesn't have this
> >> > bug.
> >>
> >> Makes sense, will do.
> >>
> >> >
> >> >> + kill(getpid(), SIGSTOP);
> >> >> + pr_err("Continue after SIGSTOP.. Urr what?\n");
> >> >> + exit(1);
> >> >> + }
> >> >> +
> >> >> + if (waitpid(child, &stat, WUNTRACED) != child) {
> >> >> + /*
> >> >> + * waitpid() may end with ECHILD if SIGCHLD == SIG_IGN,
> >> >> + * and the child has stopped already.
> >> >> + */
> >> >> + if (!(errno == ECHILD && WIFSTOPPED(stat))) {
> >> >
> >> > I don't understand this expression ^^^^
> >>
> >> If I'm not mistaken, if criu process has SIG_IGN set for SIGCHLD
> >> waitpid will return ECHILD, but stat will be still filled.
> >
> > No, it is impossible. waitpid will return ECHILD only if someone else
> > waited it.
>
> I'm maybe misreading this, eh?
>
> :ERRORS
> : ECHILD (for wait()) The calling process does not have any unwaited-for
> : children.
> :
> : ECHILD (for waitpid() or waitid()) The process specified by pid (wait‐
> : pid()) or idtype and id (waitid()) does not exist or is not a
> : child of the calling process. (This can happen for one's own
> : child if the action for SIGCHLD is set to SIG_IGN. See also the
> : Linux Notes section about threads.)
>
Just for the history. Dima was partly right. If we set SIG_IGN for
SIGCHLD, the kernel is auto-reaping zombies and waitpid() returns
ECHILD, but status isn't initialized in this case.
> >
> >> (criu doesn't set SIG_IGN at the moment, but I didn't want to rely
> >> on that. Having in mind how many modes criu now has..
> >> and for each we need kerndat to be initialized)
> >> Anyway, I can delete this sub-check if you think it'll be better.
> >>
>
>
>
> --
> Dmitry
More information about the CRIU
mailing list