[CRIU] criu and runc

Adrian Reber adrian at lisas.de
Thu Dec 8 08:04:14 PST 2016


On Wed, Dec 07, 2016 at 09:40:10AM -0800, Andrei Vagin wrote:
> On Wed, Dec 07, 2016 at 10:19:21AM +0100, Adrian Reber wrote:
> > On Wed, Dec 07, 2016 at 12:29:43AM -0800, Andrei Vagin wrote:
> > > On Tue, Dec 06, 2016 at 04:55:12PM +0100, Adrian Reber wrote:
> > > > I tried to checkpoint and restore a runc container with today's git
> > > > checkout. It works, but tcp-established is not really working.
> > > > 
> > > > I have a container with an httpd running inside and I connect to it
> > > > using 'telnet rhel0x 80' to keep the connection established.
> > > > 
> > > > I then do 'runc checkpoint rhel7-httpd --tcp-established' and 'runc
> > > > restore -d rhel7-httpd --tcp-established'. Both commands are working.
> > > 
> > > Does the container have its own network namespace? What network
> > > configuration is used for this container?
> > 
> > Host network. I am using 'oci-runtime-tool generate --network host' to
> > generate the config and the namespace configuration looks like this:
> > 
> > "namespaces": [
> > 			{
> > 				"type": "pid"
> > 			},
> > 			{
> > 				"type": "ipc"
> > 			},
> > 			{
> > 				"type": "uts"
> > 			},
> > 			{
> > 				"type": "mount"
> > 			}
> > 		]
> > 
> > 
> > > > In my telnet session I now type 'GET /' but I get a TCP reset:
> > > > 
> > > > 15:35:07.622294 IP dcbz.58608 > rhel0x.http: Flags [S], seq 1885340748, win 29200, options [mss 1460,sackOK,TS val 1499839760 ecr 0,nop,wscale 7], length 0
> > > > 15:35:07.622342 IP rhel0x.http > dcbz.58608: Flags [S.], seq 1948584834, ack 1885340749, win 28960, options [mss 1460,sackOK,TS val 1521845 ecr 1499839760,nop,wscale 7], length 0
> > > > 15:35:07.622409 IP dcbz.58608 > rhel0x.http: Flags [.], ack 1, win 229, options [nop,nop,TS val 1499839760 ecr 1521845], length 0
> > > > 15:35:32.268394 IP dcbz.58608 > rhel0x.http: Flags [P.], seq 1:3, ack 1, win 229, options [nop,nop,TS val 1499864406 ecr 1521845], length 2
> > > > 15:35:32.268433 IP rhel0x.http > dcbz.58608: Flags [R], seq 1948584835, win 0, length 0
> > > > 
> > > > https://lisas.de/~adrian/dump.log
> > > 
> > > (00.008968) Dumping inet socket at 3
> > > (00.008972) 	Dumping: ino 0x   16f26 family    2 type    1 port        0 state  7 src_addr 0.0.0.0
> > > (00.008974) 	Dumped: family 2 type 1 proto 6 port 0 state 7 src_addr 0.0.0.0
> > > (00.008977) fdinfo: type: 0x 4 flags: 02000002/01 pos: 0x       0 fd: 3
> > > (00.008991) 10989 fdinfo 4: pos: 0x               0 flags:          2000002/0x1
> > > (00.008994) 	Searching for socket 16f27 (family 10.6)
> > > (00.009001) No filter for socket
> > > (00.009004) Dumping inet socket at 4
> > > (00.009005) 	Dumping: ino 0x   16f27 family   10 type    1 port       80 state 10 src_addr ::
> > > (00.009007) 	Dumped: family 10 type 1 proto 6 port 80 state 10 src_addr ::
> > > 
> > > I found only two tcp sockets: one has the TCP_LISTEN (10) state
> > > and the other has the TCP_CLOSE (7) state. I expect to find
> > > a socket with the TCP_ESTABLISHED state in the log.
> > 
> > Yes, it is interesting that it cannot be seen in the log file.
> > 
> > I had a closer look at netstat before and during the dump and I see that my
> > test method is flawed. Running 'telnet rhel0x 80' puts the TCP connection in
> > SYN_SENT, and it is only established after hitting enter for the first time.
> > 
> > Having an actual established connection gives me the following error during restore:
> > 
> > (00.135695)      7: Error (criu/sk-inet.c:638): Connected TCP socket in image
> 
>         if (tcp_connection(ie)) {
>                 if (!opts.tcp_established_ok) {
>                         pr_err("Connected TCP socket in image\n");
>                         goto err;
>                 }
> 
> --tcp-established was not set for "criu restore"
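
To check up front whether a dump really contains an established
connection, the "Dumped:" lines of the dump log can be scanned for TCP
sockets in state 1 (TCP_ESTABLISHED in the Linux kernel's numbering).
The following is only a minimal sketch: it assumes the log line format
quoted above, and the names tcp_sockets, has_established and SAMPLE are
made up for illustration and are not part of CRIU.

```python
import re

# Linux kernel TCP state numbers (include/net/tcp_states.h).
TCP_STATES = {
    1: "TCP_ESTABLISHED", 2: "TCP_SYN_SENT", 3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1", 5: "TCP_FIN_WAIT2", 6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE", 8: "TCP_CLOSE_WAIT", 9: "TCP_LAST_ACK",
    10: "TCP_LISTEN", 11: "TCP_CLOSING",
}

# Assumption: matches the "Dumped:" lines as they appear in the quoted log.
DUMPED_RE = re.compile(
    r"Dumped: family (\d+) type (\d+) proto (\d+) port (\d+) "
    r"state (\d+) src_addr (\S+)"
)

IPPROTO_TCP = 6


def tcp_sockets(log_text):
    """Return one dict per TCP socket found in 'Dumped:' log lines."""
    sockets = []
    for line in log_text.splitlines():
        match = DUMPED_RE.search(line)
        if match is None:
            continue
        family, _type, proto, port, state = (int(g) for g in match.groups()[:5])
        if proto != IPPROTO_TCP:
            continue
        sockets.append({
            "family": family,
            "port": port,
            "state": state,
            "state_name": TCP_STATES.get(state, "UNKNOWN"),
            "src_addr": match.group(6),
        })
    return sockets


def has_established(log_text):
    """True if any dumped TCP socket is in TCP_ESTABLISHED (1)."""
    return any(s["state"] == 1 for s in tcp_sockets(log_text))


# The two "Dumped:" lines from the log excerpt quoted in this thread.
SAMPLE = """\
(00.008974)  Dumped: family 2 type 1 proto 6 port 0 state 7 src_addr 0.0.0.0
(00.009007)  Dumped: family 10 type 1 proto 6 port 80 state 10 src_addr ::
"""
```

Calling has_established(SAMPLE) on the excerpt above returns False,
which matches the observation that only a listening and a closed socket
were dumped in the first attempt.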

Now the connection seems to be restored correctly. There seems to be a
difference in where parameters can be specified on the runc command
line.

I was using:

 * runc restore -d rhel7-httpd --tcp-established

but the container ID needs to be the last parameter. So this works:

 * runc restore --tcp-established -d rhel7-httpd

For 'runc checkpoint' I can specify '--tcp-established' before or after
the container ID, which is kind of strange. But now it works for me, and
that's good enough for now.
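
When scripting this, one way to avoid the ordering problem is to always
build the command line with the options first and the container ID
last. This is only a sketch under that assumption: runc_restore_argv is
a hypothetical helper, and it uses only the '-d' and '--tcp-established'
options that appear in this thread.

```python
def runc_restore_argv(container_id, tcp_established=True, detach=True):
    """Build a 'runc restore' argument list with the container ID last."""
    argv = ["runc", "restore"]
    if tcp_established:
        argv.append("--tcp-established")
    if detach:
        argv.append("-d")
    # The container ID goes last, matching the invocation that worked.
    argv.append(container_id)
    return argv
```

The resulting list can be passed to subprocess.run(argv, check=True).
Whether the ordering requirement is intended behaviour of runc is not
something this thread settles.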

Thanks for your help!

		Adrian

