[CRIU] TCP_REPAIR and which kernel patches

Adrian Reber adrian at lisas.de
Wed Dec 21 10:54:53 PST 2016


I am trying to get criu and TCP_REPAIR working on the RHEL kernel and it
seems to work, at least partly:

# criu/criu check --feature tcp_half_closed
Warn  (criu/autofs.c:79): Failed to find pipe_ino option (old kernel?)
tcp_half_closed is supported

That looks good. Some of the tests in zdtm work correctly, but one
test seems to time out:

# ./zdtm.py run -f h -t zdtm/static/socket-tcp-close-wait
Checking feature tcp_half_closed
=== Run 1/1 ================ zdtm/static/socket-tcp-close-wait

================== Run zdtm/static/socket-tcp-close-wait in h ==================
Start test
Running zdtm/static/socket-tcp-close-wait.hook(--post-start)
./socket-tcp-close-wait --pidfile=socket-tcp-close-wait.pid --outfile=socket-tcp-close-wait.out
Running zdtm/static/socket-tcp-close-wait.hook(--pre-dump)
State      Recv-Q Send-Q                                                         Local Address:Port                                                                        Peer Address:Port              
LISTEN     0      1                                                                          *:8880                                                                                   *:*                  
ESTAB      0      0                                                                  127.0.0.1:8880                                                                           127.0.0.1:59768              
ESTAB      0      0                                                                  127.0.0.1:59768                                                                          127.0.0.1:8880               
FIN-WAIT-2 0      0                                                                  127.0.0.1:59766                                                                          127.0.0.1:8880               
CLOSE-WAIT 1      0                                                                  127.0.0.1:8880                                                                           127.0.0.1:59766              
Run criu dump
Running zdtm/static/socket-tcp-close-wait.hook(--pre-restore)
Run criu restore
Running zdtm/static/socket-tcp-close-wait.hook(--post-restore)
State      Recv-Q Send-Q                                                         Local Address:Port                                                                        Peer Address:Port              
LISTEN     0      1                                                                          *:8880                                                                                   *:*                  
ESTAB      0      0                                                                  127.0.0.1:8880                                                                           127.0.0.1:59768              
ESTAB      0      0                                                                  127.0.0.1:59768                                                                          127.0.0.1:8880               
FIN-WAIT-2 0      0                                                                  127.0.0.1:59766                                                                          127.0.0.1:8880               
CLOSE-WAIT 1      0                                                                  127.0.0.1:8880                                                                           127.0.0.1:59766              
Check TCP images
dump/zdtm/static/socket-tcp-close-wait/32/1/tcp-stream-11f12.img
dump/zdtm/static/socket-tcp-close-wait/32/1/tcp-stream-11f13.img
dump/zdtm/static/socket-tcp-close-wait/32/1/tcp-stream-11f11.img
Send the 15 signal to  32
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 0.100000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 0.200000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 0.400000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 0.800000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 1.600000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 3.200000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 6.400000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 12.800000
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 25.600000
  PID TTY          TIME CMD
   32 ?        00:00:00 socket-tcp-clos
  PID TTY      STAT   TIME COMMAND
    1 ?        S+     0:00 python2 zdtm.py
    3 ?        R+     0:00 python2 zdtm.py
   40 ?        R+     0:00  \_ ps axf 32
   31 ?        S+     0:00 ./socket-tcp-close-wait --pidfile=socket-tcp-close-wait.pid --outfile=socket-tcp-close-wait.out
   32 ?        Ss     0:00 ./socket-tcp-close-wait --pidfile=socket-tcp-close-wait.pid --outfile=socket-tcp-close-wait.out
 Test zdtm/static/socket-tcp-close-wait FAIL at zdtm/static/socket-tcp-close-wait die 
Send the 9 signal to  32
Wait for zdtm/static/socket-tcp-close-wait(32) to die for 0.100000
Running zdtm/static/socket-tcp-close-wait.hook(--clean)
##################################### FAIL #####################################

I have the following patch applied:

 tcp: allow to enable the repair mode for non-listening sockets

Is that enough? It works on a Fedora system with a 4.10rc kernel.
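
One way to check whether the patch is effective on a given kernel is to
try TCP_REPAIR directly on a socket in CLOSE-WAIT. The following is only a
minimal sketch, not part of CRIU or zdtm: the function names
probe_tcp_repair and try_repair are made up for illustration, and the
numeric fallback 19 for TCP_REPAIR is taken from linux/tcp.h. As far as I
understand the kernel, both a missing CAP_NET_ADMIN and a disallowed socket
state are reported as EPERM, so the sketch probes an ESTABLISHED socket
first to help tell the two apart.

```python
import errno
import socket

# TCP_REPAIR is 19 in linux/tcp.h; Python's socket module may not export it,
# so fall back to the numeric value (assumption: Linux only).
TCP_REPAIR = getattr(socket, "TCP_REPAIR", 19)


def try_repair(sock):
    """Toggle repair mode on and off; return "ok" or the errno name."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 1)
        sock.setsockopt(socket.IPPROTO_TCP, TCP_REPAIR, 0)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    return "ok"


def probe_tcp_repair():
    """Probe TCP_REPAIR on a loopback socket in ESTABLISHED, then CLOSE-WAIT."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server = None
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        client.connect(listener.getsockname())
        server, _ = listener.accept()
        server.settimeout(5)

        result = {"established": try_repair(server)}

        # The client sends a FIN; once recv() returns b"" the server side
        # has seen it and is in CLOSE-WAIT.
        client.shutdown(socket.SHUT_WR)
        server.recv(1)
        result["close_wait"] = try_repair(server)
        return result
    finally:
        for s in (server, client, listener):
            if s is not None:
                s.close()


if __name__ == "__main__":
    print(probe_tcp_repair())
```

Run as root, "ok" for established together with "EPERM" for close_wait
would suggest that the kernel still restricts repair mode to closed and
established sockets. "ok" for both would suggest the patch is active and
the hang has a different cause.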

# cat zdtm/static/socket-tcp-close-wait.out.inprogress
18:51:53.084:    32: rcv_size = 0

According to gdb, one process is hanging at line 161 and the other
process at line 254.

		Adrian

