[CRIU] BUG: CRIU corrupt floating point state after checkpoint
Dmitry Safonov
0x7f454c46 at gmail.com
Wed Sep 25 04:36:08 MSK 2019
On 9/25/19 2:28 AM, Diyu Zhou wrote:
> Hi Cyrill and Andrei,
>
> Thank you for your help.
>
> I have tried to run it on a machine with the xsave instruction and the problem
> is still there. The dump log and cpuinfo are attached.
>
>> Another question is -- the problem appears after checkpoint only, you didn't
>> do the restore procedure?
>
> Correct. I will explain in more detail below.
>
> It seems to me that the checkpoint process corrupts the floating-point
> register values after it has obtained them. If I let the floating-point
> process continue to run after checkpointing, it yields an error.
>
> However, the floating-point register values CRIU obtains are correct. I have
> verified this with a script that keeps dumping (and killing) and restoring the
> floating-point program at a 30ms interval. The floating-point program runs to
> the end correctly.
>
> So I conclude that the corruption occurs after the FPU register values are
> obtained. My guess is that some part of the parasite code executes
> floating-point instructions or invokes functions like memset or memcpy that
> potentially use SSE.
Oh, that's a good guess.
I had a TODO item to add a warning, or break the build, when the parasite
blob is found to be using the FPU. But I thought it doesn't happen, so it's
still on the list somewhere.
Could you upload the output of `objdump -dS criu/pie/parasite.built-in.o`
somewhere? (e.g. as a gist on GitHub)
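
For reference, here is a rough sketch of how such a disassembly could be
scanned for FPU/SSE use. It is only a heuristic and not an existing CRIU
tool: the function name `find_fpu_insns` is made up for illustration, it
assumes the usual `objdump -d` line layout (address, tab, bytes, tab,
instruction), and it treats any mnemonic starting with "f" as x87 and any
mm/xmm/ymm/zmm operand as vector-register use.

```python
import re

# Assumed objdump -d layout: "  addr:\t<hex bytes>\t<mnemonic operands>"
INSN = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t(?P<insn>.+)$")
# Function labels look like: "0000000000000000 <name>:"
LABEL = re.compile(r"^[0-9a-f]+ <(?P<name>[^>]+)>:$")
# MMX/SSE/AVX register operands (mm0, xmm0, ymm0, zmm0, ...)
VEC_REG = re.compile(r"\b[xyz]?mm\d+\b")


def find_fpu_insns(disasm):
    """Return (function, instruction) pairs that look like FPU/SSE use.

    Heuristic only: x87 is guessed from an 'f' mnemonic prefix, vector
    use from a register operand. Interleaved source lines (-S) are
    skipped because they do not match the instruction layout.
    """
    hits = []
    func = None
    for line in disasm.splitlines():
        label = LABEL.match(line)
        if label:
            func = label.group("name")
            continue
        m = INSN.match(line)
        if not m:
            continue
        insn = m.group("insn").strip()
        parts = insn.split()
        if not parts:
            continue
        if VEC_REG.search(insn) or parts[0].startswith("f"):
            hits.append((func, insn))
    return hits
```

Feeding it the text produced by the objdump command above would list the
functions in the parasite blob that appear to touch FPU or SSE state, which
is roughly what a build-time warning would need to check.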
Thanks,
Dmitry