[CRIU] BUG: CRIU corrupt floating point state after checkpoint

Diyu Zhou zhoudiyupku at gmail.com
Wed Sep 25 04:41:51 MSK 2019


Sure. See:
https://github.com/vmexit/fpu-debug

I'm very new to github, so let me know if there is any issue
accessing the link.

On Tue, Sep 24, 2019 at 6:36 PM Dmitry Safonov <0x7f454c46 at gmail.com> wrote:
>
> On 9/25/19 2:28 AM, Diyu Zhou wrote:
> > Hi Cyrill and Andrei,
> >
> > Thank you for your help.
> >
> > I have tried running it on a machine with the xsave instruction and the
> > problem is still there.  The dump log and cpuinfo are attached.
> >
> >> Another question is -- the problem appears after checkpoint only, you
> >> didn't do the restore procedure?
> >
> > Correct. I will explain more in detail below.
> >
> > It seems to me the problem is that the checkpoint process corrupts the
> > floating point register values after it has obtained them.  If I let the
> > floating point process continue to run after checkpointing, it yields an
> > error.
> >
> > However, the floating point register values CRIU obtains are correct. I have
> > verified this with a script that repeatedly dumps the floating point program
> > (killing it) and restores it at 30ms intervals. The floating point program
> > runs to the end correctly.
> >
> > So I conclude that the corruption occurs after the FPU register values are
> > obtained. My guess is that some part of the parasite code executes floating
> > point instructions, or invokes functions like memset or memcpy that
> > potentially use SSE.
>
> Oh, that's a good guess.
> I had a TODO item to add a warning or break the build when the parasite
> blob is found to be using the FPU. But I thought it doesn't happen, so it
> is still on the list somewhere.
>
> Could you upload the output of `objdump -dS criu/pie/parasite.built-in.o`
> somewhere? (e.g. as a gist on GitHub)
>
> Thanks,
>           Dmitry
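
For reference, the kind of floating point self-check discussed in the quoted
text (a program that reports an error if its floating point state is disturbed
while it runs) could look roughly like the sketch below. This is not the
program from the thread; the names accumulate() and fpu_selfcheck() and the
iteration counts are hypothetical, chosen only for illustration. All
intermediate values are exact multiples of 0.5, so two identical runs must
compare equal unless register state was altered in between.

```c
#include <stdio.h>

/*
 * Hypothetical helper: keep a running sum live in FPU/SSE registers.
 * Every term is a multiple of 0.5, so the sum is exact in double
 * precision for the small n used here.
 */
double accumulate(long n)
{
	volatile double step = 0.5;	/* volatile: prevents constant folding */
	double sum = 0.0;
	long i;

	for (i = 0; i < n; i++)
		sum += step * (double)(i % 7);
	return sum;
}

/*
 * Hypothetical check: run the same computation twice and compare.
 * Returns 0 if the results agree, -1 if they differ (which would
 * suggest the floating point state was changed behind our back).
 */
int fpu_selfcheck(long n)
{
	volatile double a = accumulate(n);
	volatile double b = accumulate(n);

	if (a != b) {
		fprintf(stderr, "FPU state mismatch: %.17g != %.17g\n", a, b);
		return -1;
	}
	return 0;
}
```

Calling fpu_selfcheck() in a long loop while the process is checkpointed and
left running is one way such corruption might show up; whether it triggers
depends on timing, so a clean run does not prove the absence of the bug.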

