[CRIU] Bug report: a process restored with criu crashes on SIGFPE

Andrei Vagin avagin at virtuozzo.com
Thu Jan 25 04:04:56 MSK 2018


On Wed, Jan 24, 2018 at 04:25:01PM -0800, Andrei Vagin wrote:
> On Wed, Jan 24, 2018 at 11:48:14PM +0200, Shlomi Matichin wrote:
> > Hello,
> > 
> > first, thank you guys for all your awesome work with criu. i have a bug
> > report i would like to ask your help with, but please know that criu to me
> > is magic, and it's amazing how well it works.
> > 
> > REPRODUCING CODE ATTACHED:
> > attached are two programs written in python3. the server side is a simple
> > tcp socket accept connection, compute, return answer and close connection
> > loop (implemented with two files, main.py, generated.py). the client side
> > just connects and prints whatever comes on the tcp connection.
> > 
> > STEPS TO REPRODUCE:
> > on terminal 1: pypy main.py
> > on terminal 2: python3 client.py
> > on terminal 2: cd <dump directory>
> > on terminal 2: sudo criu dump -t `pidof pypy` --shell-job
> > on terminal 1: <server dies>
> > on terminal 2: sudo criu restore --shell-job
> > on terminal 3: sudo strace -fF -p `pidof pypy`
> > on terminal 1: python3 client.py
> > on terminal 2: <pypy crashes, parent process exists>
> > on terminal 3: <output follows:>
> > strace: Process 326 attached
> > accept(3, {sa_family=AF_INET, sin_port=htons(56262),
> > sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
> > --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTRES, si_addr=0x7f6b19ce76d1} ---
> > +++ killed by SIGFPE (core dumped) +++
> 
> I can't reproduce this issue in my local environment:
> 
> [root@fc24 xxx]# sudo strace -fF -p `pidof pypy`
> strace: Process 606 attached
> accept(3, {sa_family=AF_INET, sin_port=htons(56456), sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
> brk(NULL)                               = 0x1ab1000
> brk(0x1ad2000)                          = 0x1ad2000
> brk(NULL)                               = 0x1ad2000
> brk(0x1af3000)                          = 0x1af3000
> sendto(4, "2000000000", 10, 0, NULL, 0) = 10
> close(4)                                = 0
> accept(3, 
> 
> Could you send me a core file? If you don't know where it is, you can
> change /proc/sys/kernel/core_pattern and reproduce the problem again.
> 
> For example, if you run the following command (as root):
> $ echo /tmp/core > /proc/sys/kernel/core_pattern
> 
> core files will be saved as /tmp/core.{pid} (or as plain /tmp/core if
> /proc/sys/kernel/core_uses_pid is 0)
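
To make the naming rule above concrete, here is a minimal Python sketch of how a core_pattern value maps to a file name. It is a simplification of the rules in the core(5) man page, not the kernel's code: only %p and %% are expanded, every other specifier is left as-is, and the function name core_path is made up for this example. A pattern that starts with "|" means the dump is piped to a helper program (Ubuntu's apport is a common case), so there is no file path to predict.

```python
def core_path(pattern, pid, core_uses_pid):
    """Predict the core file name for a core_pattern value (simplified).

    Returns None when the pattern pipes the dump to a program.
    Only %p and %% are expanded; other specifiers stay untouched.
    """
    if pattern.startswith("|"):
        return None  # dump goes to a helper program, not to a file

    out = []
    saw_pid = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
        elif i + 1 == len(pattern):
            i += 1  # a single trailing % is dropped
        else:
            nxt = pattern[i + 1]
            if nxt == "p":
                out.append(str(pid))
                saw_pid = True
            elif nxt == "%":
                out.append("%")
            else:
                out.append(ch + nxt)  # not handled in this sketch
            i += 2

    path = "".join(out)
    # ".PID" is appended only if the pattern has no %p and core_uses_pid is set
    if core_uses_pid and not saw_pid:
        path += f".{pid}"
    return path


if __name__ == "__main__":
    print(core_path("/tmp/core", 326, core_uses_pid=True))     # /tmp/core.326
    print(core_path("/tmp/core", 326, core_uses_pid=False))    # /tmp/core
    print(core_path("/tmp/core.%p", 326, core_uses_pid=True))  # /tmp/core.326
```

Writing the pattern as /tmp/core.%p avoids depending on the core_uses_pid setting.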
> 
> 
> What processor do you use?
> $ cat /proc/cpuinfo | grep model

And could you show the output of this command:
$ dmesg | grep x86/fpu
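
If it is easier to send everything in one go, the two commands above can be combined in a small script. This is only a sketch: the helper names (grep, unique, collect) are made up for this example, and reading the kernel log with dmesg may require root on some systems, in which case that section is reported as unavailable.

```python
import subprocess


def grep(text, needle):
    """Return the lines of text that contain needle, like a plain `grep needle`."""
    return [line for line in text.splitlines() if needle in line]


def unique(lines):
    """Drop repeated lines (one per CPU core) while keeping first-seen order."""
    return list(dict.fromkeys(lines))


def collect():
    """Gather the CPU model lines and the x86/fpu lines from the kernel log."""
    report = {}
    try:
        with open("/proc/cpuinfo") as f:
            report["cpu"] = unique(grep(f.read(), "model"))
    except OSError as exc:
        report["cpu"] = [f"unavailable: {exc}"]
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, check=True,
            encoding="utf-8", errors="replace",
        )
        report["fpu"] = grep(result.stdout, "x86/fpu")
    except (OSError, subprocess.SubprocessError) as exc:
        report["fpu"] = [f"unavailable: {exc}"]
    return report


if __name__ == "__main__":
    for section, lines in collect().items():
        print(f"[{section}]")
        for line in lines:
            print(line)
```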

> 
> Thanks,
> Andrei
> 
> > 
> > same scenario exactly, but instead of running "pypy main.py" on the first
> > line, running "python3 main.py" works perfectly. it only happens when
> > running with pypy.
> > 
> > REPRODUCTION ENVIRONMENT:
> > 1. tested with personal ubuntu 17.10 laptop, and aws ubuntu 17.10 ec2
> > server.
> > 2. pypy installed with "sudo apt-get install pypy"
> > 3. two versions of criu on both machines reproduce the bug: 3.7 stable
> > built from source (downloaded from criu.org), and 3.4 installed with "sudo
> > apt-get install criu"
> > 
> > motivation behind project:
> > pypy is a python jit, which accelerates python computations significantly.
> > the use case in generated.py takes ~2 minutes to run using python3, but 4.1s
> > using pypy! however, the pypy jit needs to "warm up": the same computation
> > takes 3.6s when run for the second time inside the same process. of course
> > this is just a "sample"; in the real application the improvement between warm
> > and cold jit is around 2X. the sample application attached is meant to simplify
> > reproduction to a trivial application (a single tcp socket in "accepting"
> > state).
> > the pypy team declares that the jit cannot be snapshotted (
> > http://doc.pypy.org/en/latest/faq.html#couldn-t-the-jit-dump-and-reload-already-compiled-machine-code
> > ), so we thought we could emulate the effect with criu.
> > 
> > please help me!
> > thanks in advance,
> > Shlomi
> 
> > import generated
> > import time
> > import socket
> > 
> > s = socket.socket()
> > s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
> > s.bind(("", 9000))
> > s.listen(10)
> > while True:
> >     c = s.accept()[0]
> >     try:
> >         generated.main()
> >         c.send(("%s" % generated.counter).encode())
> >     finally:
> >         c.close()
> 
> > import socket
> > 
> > s = socket.socket()
> > s.connect(("127.0.0.1", 9000))
> > data = s.recv(4096)
> > print(data.decode())
> 
> > counter = 0
> > def a_0():
> >  global counter
> >  counter += 1
> > 
> > def b_0():
> >  global counter
> >  counter += 1
> > 
> > def c_0():
> >  global counter
> >  counter += 1
> > 
> > def d_0():
> >  global counter
> >  counter += 1
> > 
> > def e_0():
> >  global counter
> >  counter += 1
> > 
> > def f_0():
> >  global counter
> >  counter += 1
> > 
> > def g_0():
> >  global counter
> >  counter += 1
> > 
> > def h_0():
> >  global counter
> >  counter += 1
> > 
> > def i_0():
> >  global counter
> >  counter += 1
> > 
> > def j_0():
> >  global counter
> >  counter += 1
> > 
> > def a_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def b_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def c_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def d_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def e_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def f_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def g_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def h_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def i_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def j_1():
> >  a_0()
> >  b_0()
> >  c_0()
> >  d_0()
> >  e_0()
> >  f_0()
> >  g_0()
> >  h_0()
> >  i_0()
> >  j_0()
> > 
> > def a_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def b_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def c_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def d_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def e_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def f_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def g_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def h_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def i_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def j_2():
> >  a_1()
> >  b_1()
> >  c_1()
> >  d_1()
> >  e_1()
> >  f_1()
> >  g_1()
> >  h_1()
> >  i_1()
> >  j_1()
> > 
> > def a_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def b_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def c_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def d_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def e_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def f_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def g_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def h_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def i_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def j_3():
> >  a_2()
> >  b_2()
> >  c_2()
> >  d_2()
> >  e_2()
> >  f_2()
> >  g_2()
> >  h_2()
> >  i_2()
> >  j_2()
> > 
> > def a_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def b_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def c_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def d_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def e_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def f_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def g_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def h_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def i_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def j_4():
> >  a_3()
> >  b_3()
> >  c_3()
> >  d_3()
> >  e_3()
> >  f_3()
> >  g_3()
> >  h_3()
> >  i_3()
> >  j_3()
> > 
> > def a_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def b_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def c_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def d_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def e_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def f_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def g_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def h_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def i_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def j_5():
> >  a_4()
> >  b_4()
> >  c_4()
> >  d_4()
> >  e_4()
> >  f_4()
> >  g_4()
> >  h_4()
> >  i_4()
> >  j_4()
> > 
> > def a_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def b_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def c_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def d_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def e_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def f_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def g_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def h_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def i_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def j_6():
> >  a_5()
> >  b_5()
> >  c_5()
> >  d_5()
> >  e_5()
> >  f_5()
> >  g_5()
> >  h_5()
> >  i_5()
> >  j_5()
> > 
> > def a_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def b_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def c_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def d_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def e_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def f_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def g_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def h_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def i_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def j_7():
> >  a_6()
> >  b_6()
> >  c_6()
> >  d_6()
> >  e_6()
> >  f_6()
> >  g_6()
> >  h_6()
> >  i_6()
> >  j_6()
> > 
> > def a_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def b_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def c_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def d_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def e_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def f_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def g_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def h_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def i_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def j_8():
> >  a_7()
> >  b_7()
> >  c_7()
> >  d_7()
> >  e_7()
> >  f_7()
> >  g_7()
> >  h_7()
> >  i_7()
> >  j_7()
> > 
> > def main():
> >  a_8()
> >  b_8()
> >  c_8()
> >  d_8()
> >  e_8()
> >  f_8()
> >  g_8()
> >  h_8()
> >  i_8()
> >  j_8()
> > 
> 
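
The attached generated.py follows a regular pattern: nine levels (0 to 8) of ten functions each, where every level-0 function increments counter once and every higher-level function calls all ten functions of the level below. One call to main() therefore adds 10^9 to counter, and the "2000000000" in the strace output above is what the server sends on its second request. The sketch below rebuilds a file of the same shape and checks that arithmetic on a small instance. It is a hypothetical reconstruction, not the script actually used to produce the attachment; generate and expected_counter are names invented here, and it uses 4-space indentation where the attachment uses one space.

```python
import string


def generate(levels=9, fanout=10):
    """Return source text shaped like the attached generated.py.

    Level 0 functions bump the global counter; each function on a higher
    level calls every function of the level below; main() calls the top level.
    fanout must be at most 26 because functions are named a, b, c, ...
    """
    names = string.ascii_lowercase[:fanout]
    out = ["counter = 0"]
    for level in range(levels):
        for name in names:
            out.append(f"def {name}_{level}():")
            if level == 0:
                out.append("    global counter")
                out.append("    counter += 1")
            else:
                out.extend(f"    {callee}_{level - 1}()" for callee in names)
            out.append("")
    out.append("def main():")
    out.extend(f"    {callee}_{levels - 1}()" for callee in names)
    return "\n".join(out) + "\n"


def expected_counter(runs, levels=9, fanout=10):
    """Value of counter after `runs` calls to main(): runs * fanout**levels."""
    return runs * fanout ** levels


if __name__ == "__main__":
    # Check the formula on a small instance (3 levels, 3 functions per level).
    namespace = {}
    exec(generate(levels=3, fanout=3), namespace)
    namespace["main"]()
    print(namespace["counter"], expected_counter(1, levels=3, fanout=3))  # 27 27

    # Full-size file, two requests served by the same process.
    print(expected_counter(2))  # 2000000000
```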
> > _______________________________________________
> > CRIU mailing list
> > CRIU at openvz.org
> > https://lists.openvz.org/mailman/listinfo/criu
> 

