[CRIU] Bug report: a process restored with criu crashes on SIGFPE

Andrei Vagin avagin at virtuozzo.com
Thu Jan 25 03:25:01 MSK 2018


On Wed, Jan 24, 2018 at 11:48:14PM +0200, Shlomi Matichin wrote:
> Hello,
> 
> First, thank you guys for all your awesome work with criu. I have a bug
> report I would like to ask your help with, but please know that criu to me
> is magic, and it's amazing how well it works.
> 
> REPRODUCING CODE ATTACHED:
> attached are two programs written in python3. the server side is a simple
> tcp socket accept connection, compute, return answer and close connection
> loop (implemented with two files, main.py, generated.py). the client side
> just connects and prints whatever comes on the tcp connection.
> 
> STEPS TO REPRODUCE:
> on terminal 1: pypy main.py
> on terminal 2: python3 client.py
> on terminal 2: cd <dump directory>
> on terminal 2: sudo criu dump -t `pidof pypy` --shell-job
> on terminal 1: <server dies>
> on terminal 2: sudo criu restore --shell-job
> on terminal 3: sudo strace -fF -p `pidof pypy`
> on terminal 1: python3 client.py
> on terminal 2: <pypy crashes, parent process exists>
> on terminal 3: <output follows:>
> strace: Process 326 attached
> accept(3, {sa_family=AF_INET, sin_port=htons(56262),
> sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
> --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTRES, si_addr=0x7f6b19ce76d1} ---
> +++ killed by SIGFPE (core dumped) +++

I can't reproduce this issue in my local environment:

[root at fc24 xxx]# sudo strace -fF -p `pidof pypy`
strace: Process 606 attached
accept(3, {sa_family=AF_INET, sin_port=htons(56456), sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
brk(NULL)                               = 0x1ab1000
brk(0x1ad2000)                          = 0x1ad2000
brk(NULL)                               = 0x1ad2000
brk(0x1af3000)                          = 0x1af3000
sendto(4, "2000000000", 10, 0, NULL, 0) = 10
close(4)                                = 0
accept(3, 

Could you send me a core file? If you don't know where it is, you can
change /proc/sys/kernel/core_pattern and reproduce the problem again.

For example, if you run the following command as root:
# echo /tmp/core > /proc/sys/kernel/core_pattern

core files will be saved as /tmp/core.{pid}
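
If it is unclear where the kernel currently puts core files, a small script can report it. This is only a sketch: the function names are made up for illustration, and it assumes the usual Linux behaviour that a pattern starting with "|" is piped to a helper program (on Ubuntu this is often apport) and that ".PID" is appended when core_uses_pid is nonzero and the pattern has no %p.

```python
from pathlib import Path


def describe_core_pattern(pattern, uses_pid):
    """Describe where a core dump would go for the given kernel settings."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # The kernel pipes the core to a user-space helper instead of a file.
        return "piped to helper program: " + pattern[1:].strip()
    if "%p" not in pattern and uses_pid:
        # Backward-compatible behaviour: ".PID" is appended to the name.
        return "written to file: " + pattern + ".<pid>"
    return "written to file: " + pattern


def read_kernel_setting(name):
    """Read one value from /proc/sys/kernel (hypothetical helper)."""
    return Path("/proc/sys/kernel", name).read_text().strip()


if __name__ == "__main__":
    try:
        pattern = read_kernel_setting("core_pattern")
        uses_pid = read_kernel_setting("core_uses_pid") != "0"
        print(describe_core_pattern(pattern, uses_pid))
    except OSError as exc:
        print("cannot read core settings:", exc)
```

If the output says the core is piped to a helper, changing core_pattern as shown above makes the kernel write a plain file instead.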


What processor do you use?
$ cat /proc/cpuinfo | grep model
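
The output of that command repeats the same lines once per core. A rough Python equivalent that prints each distinct "model" line only once could look like this (a sketch; model_lines is a name I made up):

```python
def model_lines(cpuinfo_text):
    """Return distinct (key, value) pairs for lines containing 'model'."""
    seen = []
    for line in cpuinfo_text.splitlines():
        if "model" in line:
            key, _, value = line.partition(":")
            entry = (key.strip(), value.strip())
            if entry not in seen:
                seen.append(entry)
    return seen


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            for key, value in model_lines(f.read()):
                print("%s: %s" % (key, value))
    except OSError as exc:
        print("cannot read /proc/cpuinfo:", exc)
```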

Thanks,
Andrei

> 
> same scenario exactly, but instead of running "pypy main.py" on the first
> line, running "python3 main.py" works perfectly. it only happens when
> running with pypy.
> 
> REPRODUCTION ENVIRONMENT:
> 1. tested with personal ubuntu 17.10 laptop, and aws ubuntu 17.10 ec2
> server.
> 2. pypy installed with "sudo apt-get install pypy"
> 3. two versions of criu on both machines reproduce the bug: 3.7 stable
> built from source (downloaded from criu.org), and 3.4 installed with "sudo
> apt-get install criu"
> 
> motivation behind project:
> pypy is a python jit, which accelerates python computations significantly.
> the use case in generated.py takes ~2minutes to run using python3, but 4.1s
> using pypy! however, the pypy jit needs to "warm up": the same computation
> takes 3.6s running for the second time inside the same process. of course
> this is just a "sample"; in the real application the improvement between
> warm and cold jit is around 2X. the sample application attached is meant to
> reduce the reproduction to a trivial application (a single tcp socket in
> "accepting" state).
> the pypy team declares that the jit cannot be snapshotted (
> http://doc.pypy.org/en/latest/faq.html#couldn-t-the-jit-dump-and-reload-already-compiled-machine-code
> ), so we thought we could emulate the effect with criu.
> 
> please help me!
> thanks in advance,
> Shlomi

> import generated
> import time
> import socket
> 
> s = socket.socket()
> s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
> s.bind(("", 9000))
> s.listen(10)
> while True:
>     c = s.accept()[0]
>     try:
>         generated.main()
>         c.send(("%s" % generated.counter).encode())
>     finally:
>         c.close()

> import socket
> 
> s = socket.socket()
> s.connect(("127.0.0.1", 9000))
> data = s.recv(4096)
> print(data.decode())

> counter = 0
> def a_0():
>  global counter
>  counter += 1
> 
> def b_0():
>  global counter
>  counter += 1
> 
> def c_0():
>  global counter
>  counter += 1
> 
> def d_0():
>  global counter
>  counter += 1
> 
> def e_0():
>  global counter
>  counter += 1
> 
> def f_0():
>  global counter
>  counter += 1
> 
> def g_0():
>  global counter
>  counter += 1
> 
> def h_0():
>  global counter
>  counter += 1
> 
> def i_0():
>  global counter
>  counter += 1
> 
> def j_0():
>  global counter
>  counter += 1
> 
> def a_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def b_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def c_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def d_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def e_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def f_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def g_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def h_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def i_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def j_1():
>  a_0()
>  b_0()
>  c_0()
>  d_0()
>  e_0()
>  f_0()
>  g_0()
>  h_0()
>  i_0()
>  j_0()
> 
> def a_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def b_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def c_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def d_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def e_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def f_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def g_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def h_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def i_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def j_2():
>  a_1()
>  b_1()
>  c_1()
>  d_1()
>  e_1()
>  f_1()
>  g_1()
>  h_1()
>  i_1()
>  j_1()
> 
> def a_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def b_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def c_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def d_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def e_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def f_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def g_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def h_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def i_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def j_3():
>  a_2()
>  b_2()
>  c_2()
>  d_2()
>  e_2()
>  f_2()
>  g_2()
>  h_2()
>  i_2()
>  j_2()
> 
> def a_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def b_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def c_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def d_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def e_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def f_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def g_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def h_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def i_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def j_4():
>  a_3()
>  b_3()
>  c_3()
>  d_3()
>  e_3()
>  f_3()
>  g_3()
>  h_3()
>  i_3()
>  j_3()
> 
> def a_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def b_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def c_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def d_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def e_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def f_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def g_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def h_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def i_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def j_5():
>  a_4()
>  b_4()
>  c_4()
>  d_4()
>  e_4()
>  f_4()
>  g_4()
>  h_4()
>  i_4()
>  j_4()
> 
> def a_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def b_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def c_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def d_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def e_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def f_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def g_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def h_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def i_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def j_6():
>  a_5()
>  b_5()
>  c_5()
>  d_5()
>  e_5()
>  f_5()
>  g_5()
>  h_5()
>  i_5()
>  j_5()
> 
> def a_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def b_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def c_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def d_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def e_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def f_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def g_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def h_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def i_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def j_7():
>  a_6()
>  b_6()
>  c_6()
>  d_6()
>  e_6()
>  f_6()
>  g_6()
>  h_6()
>  i_6()
>  j_6()
> 
> def a_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def b_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def c_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def d_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def e_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def f_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def g_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def h_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def i_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def j_8():
>  a_7()
>  b_7()
>  c_7()
>  d_7()
>  e_7()
>  f_7()
>  g_7()
>  h_7()
>  i_7()
>  j_7()
> 
> def main():
>  a_8()
>  b_8()
>  c_8()
>  d_8()
>  e_8()
>  f_8()
>  g_8()
>  h_8()
>  i_8()
>  j_8()
> 
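
The attached generated.py is very regular: ten level-0 functions each increment the counter, every function at level N calls all ten functions of level N-1, and main() calls all ten level-8 functions, so one main() call adds 10^9 to the counter. A file with the same shape can probably be regenerated with a short script like the one below. This is a sketch, not the script the reporter used; generate_source and its parameters are hypothetical names.

```python
def generate_source(depth=8, letters="abcdefghij"):
    """Build source text shaped like the attached generated.py.

    Level-0 functions increment a global counter; each function at
    level N calls every function of level N-1; main() calls every
    function at the top level.
    """
    lines = ["counter = 0"]
    for name in letters:
        lines += ["def %s_0():" % name, " global counter", " counter += 1", ""]
    for level in range(1, depth + 1):
        for name in letters:
            lines.append("def %s_%d():" % (name, level))
            lines += [" %s_%d()" % (callee, level - 1) for callee in letters]
            lines.append("")
    lines.append("def main():")
    lines += [" %s_%d()" % (callee, depth) for callee in letters]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the full-size file; smaller depth/letters give a faster test case.
    with open("generated.py", "w") as f:
        f.write(generate_source())
```

A smaller variant, for example generate_source(depth=2, letters="ab"), keeps the same call structure but finishes almost instantly, which may help when checking whether the crash depends on the size of the jitted code.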

