[CRIU] Restoring the state of a crashed Java process

Pavel Emelyanov xemul at parallels.com
Mon Sep 15 03:19:51 PDT 2014


On 09/13/2014 04:33 AM, Balakrishnan Chandrasekaran wrote:
> Hi all,
> 
> I came across this project just recently, and am impressed with criu. I want to thank all the contributors for their work and for keeping it open.
> 
> I am trying to run through some examples and understand how to use criu. I ran a simple experiment with a Java 
> program that maintains a couple of TCP connections with another on the same machine. I took a checkpoint using
> criu while the connection was open; no data was being exchanged over the connection. If I let the process stop
> after the checkpointing, I can recover and continue to run the program. However, I let the process run, using 
> the '-R' option, and try to recover the process from the checkpoint after it crashes (by throwing a RuntimeException), 

You are trying to roll back one end of a TCP connection. This is not possible, because
the peer is not prepared for that kind of time travel :)

> it doesn't work. I encountered the following error --
> 
> --------
> ip6tables: Bad rule (does a matching rule exist in that chain?).
> Error (util.c:576): exited, status=1
> Error (netfilter.c:69): Iptables configuration failed: No such file or directory
> ip6tables: Bad rule (does a matching rule exist in that chain?).
> Error (util.c:576): exited, status=1
> Error (netfilter.c:69): Iptables configuration failed: No such file or directory

Is this log from a CRIU run? When CRIU dumps and restores a TCP connection, it uses
netfilter rules to lock the connection. If you dump with -R, these rules are not
preserved, while restore expects to find them and tries to remove them.

> 09-12-2014 05:25:33.239 ERROR [CPlaneHandler] Connection reset by peer
> ...
> 09-12-2014 05:25:33.244 INFO  [AppLoader] Control plane is shutdown
> 09-12-2014 05:25:33.246 INFO  [AppLoader] Data plane is shutdown
> bala at galvatron:~/legosdn$ Write failed: Broken pipe
> --------
> 
> Can someone help with resolving this issue? Or, at least, tell me that this is a bad use case for criu
> to start with... I do not know much about checkpoint/restore and am treating criu as a black box so far,
> so any pointers would be greatly appreciated.

Well, as I said above, dumping and restoring a TCP connection is only possible one way:
dump, kill, then restore. Any deviation breaks the connection for the peer.
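
For illustration, the dump, kill, restore sequence could look roughly like the
sketch below. It is written in Python only to build the command lines; it does
not run them. The PID 1234 and the images directory /tmp/ckpt are made-up
placeholders, and the helper names dump_cmd and restore_cmd are hypothetical.
The -t, -D, --tcp-established and -R options are criu's own; depending on how
the process was started, extra options such as --shell-job may be needed.

```python
# Sketch only: builds criu command lines for the "dump, kill, then restore"
# flow. Nothing is executed here. PID and directory are placeholder values.

def dump_cmd(pid, images_dir):
    """Command line for dumping a process tree with open TCP connections.

    -R / --leave-running is deliberately NOT passed, so criu kills the
    process after the dump and the TCP state in the images stays in sync
    with what the peer sees.
    """
    return ["criu", "dump", "-t", str(pid), "-D", images_dir,
            "--tcp-established"]


def restore_cmd(images_dir):
    """Command line for restoring from the images written by dump_cmd."""
    return ["criu", "restore", "-D", images_dir, "--tcp-established"]


if __name__ == "__main__":
    # Hypothetical PID and images directory, for illustration only.
    print(" ".join(dump_cmd(1234, "/tmp/ckpt")))
    print(" ".join(restore_cmd("/tmp/ckpt")))
```

The point of the sketch is what is missing: without -R the dumped process does
not keep talking to the peer after the checkpoint, so the restored socket state
still matches the other end.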

Thanks,
Pavel


