[CRIU] [RFC] run each test case also in --check-only mode

Adrian Reber adrian at lisas.de
Tue Mar 21 10:47:26 PDT 2017


On Mon, Mar 20, 2017 at 04:21:57PM +0300, Pavel Emelyanov wrote:
> >>>>> ======================= Run zdtm/static/socket-tcp in h ========================
> >>>>> Start test
> >>>>> ./socket-tcp --pidfile=socket-tcp.pid --outfile=socket-tcp.out
> >>>>> Run criu dump in check-only mode
> >>>>> Only checking if requested operation will succeed
> >>>>> Run criu dump
> >>>>> Run criu restore in check-only mode
> >>>>> Only checking if requested operation will succeed
> >>>>> Checking mode enabled
> >>>>> Run criu restore
> >>>>> =[log]=> dump/zdtm/static/socket-tcp/31/1/restore.log
> >>>>> ------------------------ grep Error ------------------------
> >>>>> (00.008036) Error (criu/util.c:707): exited, status=1
> >>>>> (00.008050) Error (criu/netfilter.c:91): Iptables configuration failed
> >>>>> (00.009993) Error (criu/util.c:707): exited, status=1
> >>>>> (00.010007) Error (criu/netfilter.c:91): Iptables configuration failed
> >>>>> ------------------------ ERROR OVER ------------------------
> >>>>> Send the 15 signal to  31
> >>>>> Wait for zdtm/static/socket-tcp(31) to die for 0.100000
> >>>>> ############### Test zdtm/static/socket-tcp FAIL at result check ###############
> >>>>> Test output: ================================
> >>>>> 11:30:41.072:    31: ERR: socket-tcp.c:190: can't write (errno = 104 (Connection reset by peer))
> >>>>>
> >>>>>  <<< ================================
> >>>>> ##################################### FAIL #####################################
> >>>>>
> >>>>> The first problem is that the network unlocking fails for the real restore.
> >>>>> The '--check-only' restore already unlocked the network, which is wrong, but
> >>>>> I am not sure what the right solution is. Should I just skip network
> >>>>> unlocking in check-only mode?
> >>>>
> >>>> I would say yes. Since you haven't done a real dump, there's no reason
> >>>> you'd expect the network to be locked.
> >>>>
> >>>>> The second problem seems to be that when CRIU restores the process in
> >>>>> real restore mode the sockets cannot be restored again and I am not sure
> >>>>> why.
> >>>>
> >>>> Would you show the restore.log file for this case?
> >>>
> >>> https://lisas.de/~adrian/restore.log
> >>
> >> But that's the "first problem" :) Inability to turn off the netfilter rule
> >> used to lock the connection.
> > 
> > That is the log of the real restore which has both problems. The
> > unlocking does not work anymore as it already has been unlocked by the
> > check-only restore. The check-only restore works without any errors.
> > Only the real restore after the check-only fails.
> > 
> > I am guessing that the message from the test case:
> > 
> > 11:30:41.072:    31: ERR: socket-tcp.c:190: can't write (errno = 104 (Connection reset by peer))
> 
> Ah, I see. That's because in --check-only restore you've restored the
> socket, then unlocked the connection. Peer noticed this and shifted its
> sequences. Then you do the 2nd restore which cannot happen, because the
> peer's state has changed.
> 
> What you should do on --check-only restore is either ignore the restoration
> of TCP sockets (which is not nice) or restore the socket, but kill it
> right before unlocking the connection, so that the peer doesn't see a single
> packet coming from the restored socket.

Thanks Pavel. That was the information I was looking for. I like the simple
option 1 ;-) But with Andrei as criu-dev maintainer I guess that will
not be accepted ;-) So I went with something close to your second
proposal.
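
To make the failure and the fix concrete, here is a toy Python model of the
two restores. It is not CRIU code: Peer, Host, restore() and run() are
hypothetical names, and a single sequence number stands in for the peer's
TCP state. It assumes that a check-only restore which discards its socket
and leaves the network lock in place is invisible to the peer.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Toy remote endpoint: accepts only the sequence number it expects."""
    expected_seq: int

    def receive(self, seq: int) -> bool:
        if seq != self.expected_seq:
            return False          # a real peer would answer with a reset
        self.expected_seq += 1    # peer state moves on once traffic arrives
        return True


@dataclass
class Host:
    """Toy restore host: one network lock plus a log of errors."""
    locked: bool = True
    errors: list = field(default_factory=list)

    def unlock(self) -> None:
        if not self.locked:
            self.errors.append("unlock failed: rule already removed")
            return
        self.locked = False


def restore(image_seq: int, peer: Peer, host: Host,
            check_only: bool = False, discard_in_check: bool = True) -> bool:
    """Return True if the restored socket can talk to the peer afterwards."""
    sock_seq = image_seq          # socket rebuilt from the dump image
    if check_only and discard_in_check:
        # Drop the socket and keep the lock, so the peer never sees a
        # packet from the check-only run.
        return True
    host.unlock()
    return peer.receive(sock_seq)


def run(discard_in_check: bool):
    """Check-only restore followed by the real restore from the same image."""
    peer, host = Peer(expected_seq=1000), Host()
    check_ok = restore(1000, peer, host, check_only=True,
                       discard_in_check=discard_in_check)
    real_ok = restore(1000, peer, host)
    return check_ok, real_ok, host.errors
```

With discard_in_check=False the model shows both symptoms from the log: the
check-only run succeeds, then the real restore fails to unlock and the peer
rejects the socket. With discard_in_check=True both runs succeed and no
error is logged.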

I am now down to 8 tests failing in check-only mode, and those should be
fixed soon.

		Adrian

