[CRIU] crtools from git tree - Error (sk-inet.c:443): Can't bind inet socket: Address already in use

Dilip Daya dilip.daya at hp.com
Wed Aug 1 22:16:27 EDT 2012


On Tue, 2012-07-31 at 14:03 +0400, Pavel Emelyanov wrote:
> > => Restore is looking for "tcp-stream-10.img", but actual file name is
> >    "tcp-stream-94e4e.img".
> > 
> > I've tried multiple attempts but restore always looks for
> > "tcp-stream-10.img".
> > 
> > Would this be a bug?
> 
> Yup :( Was accidentally broken by 424a4adb6. This patch fixes it
> 
> http://git.criu.org/?p=crtools.git;a=commitdiff;h=4ee52f340392254de0e80d2b040927bd10fb18d4
> 
> But I must tell, that TCP migration is still not perfect :( I see random
> failures from time to time. Plus, the TCP test itself is not yet polished
> (you'll have to look into .log files after that to check that the connection
> itself was working OK while cpt/rst) I planned to concentrate on this a
> little bit later, and if you can help that would be greatly appreciated :)
> 
> Thanks for looking into it :)


I took a snapshot from the crtools git tree on July-31st:

=> crtools-HEAD-368d7ac/test/tcp

# ./run.sh 
+ PORT=12345
+ CLN_PIPE=./clnt_pipe
+ SRV_LOG=./srv.log
+ CLN_LOG=./cln.log
+ DDIR=dump
+ CRTOOLS=../../crtools
++ hexdump -C /dev/urandom
++ head -n 1
+ TEXT='00000000  ac e3 80 65 9f c6 76 2f  12 9f fb f3 8d a3 aa 9b
|...e..v/........|'
+ echo 'Building services'
Building services
+ make clean
rm -f cln srv
+ make
cc     cln.c   -o cln
cc     srv.c   -o srv
+ rm -rf dump ./srv.log ./cln.log ./clnt_pipe
+ mkdir dump
+ echo 'Starting server'
Starting server
+ SRV_PID=26825
+ echo 'Starting pipe'
Starting pipe
+ mkfifo ./clnt_pipe
+ setsid ./srv 12345
+ echo 'Starting client'
Starting client
+ CLN_PID=26827
+ exec
+ ./cln 127.0.0.1 12345
+ echo 'Make it run'
Make it run
+ echo '00000000  ac e3 80 65 9f c6 76 2f  12 9f fb f3 8d a3 aa 9b
|...e..v/........|'
+ echo 'Suspend server'
Suspend server
+ ../../crtools dump -D dump -o dump.log -t 26825 --tcp-established
-vvvv
./run.sh: line 45: 26825 Killed                  setsid ./srv ${PORT} >
${SRV_LOG} 2>&1
+ sleep 5
+ echo 'Resume server'
Resume server
+ ../../crtools restore -D dump -o restore.log -t 26825 -d
--tcp-established -vvvv --close 3
+ fail 'Fail to restore server'
+ echo 'In function: fail'
In function: fail
+ echo 'Fail to restore server'
Fail to restore server
+ kill -9 26827
+ kill -9 26825
./run.sh: line 40: kill: (26825) - No such process
+ exit 1


# ll
total 48
-rw-rw-r-- 1 root root 1718 Jul 31 05:48 srv.c
-rw-rw-r-- 1 root root   50 Jul 31 05:48 Makefile
-rw-r--r-- 1 root root 2013 Aug  1 19:19 cln.c
-rwxrwxr-x 1 root root 1104 Aug  1 19:36 run.sh*
-rw-r--r-- 1 root root    0 Aug  1 19:49 srv.log
-rwxr-xr-x 1 root root 9772 Aug  1 19:49 srv*
prw-r--r-- 1 root root    0 Aug  1 19:49 clnt_pipe|
-rw-r--r-- 1 root root    0 Aug  1 19:49 cln.log
-rwxr-xr-x 1 root root 9563 Aug  1 19:49 cln*
drwxr-xr-x 2 root root 4096 Aug  1 19:49 dump/


=> test/tcp/dump/restore.log
...
...
 26829: Restoring TCP connection
 26825:     Send fd 1 to /crtools-fd-26829-1
 26829: Restoring TCP connection id 10 ino 10bf50
 26825:     Wait fdinfo pid=26829 fd=2
 26825:     Send fd 1 to /crtools-fd-26829-2
 26829:   Setting 1 queue seq to 2999954635
 26825:   Restoring fd 2 (state -> create)
 26829:   Setting 2 queue seq to 2374915616
 26825:   Restoring fd 3 (state -> create)
 26825:   Restore: family 2 type 1 proto 6 port 12345 state 10 src_addr
0.0.0.0
 26829:   Restoring TCP options
 26829:     Will turn SAK on
 26825: bind on sockfd: 3
 26829:     Will set wscale to 5
 26829:     Will turn timestamps on
 26829: Will set mss clamp to 16396
 26825: Error (sk-inet.c:443): Can't bind inet socket: Address already
in use 
 26829:     Create fd for 4
 26829:       Going to dup 4 into 4
 26829:   Restoring fd 0 (state -> receive)
 26829:   Receive fd for 0
 26829:   Restoring fd 1 (state -> receive)
 26829:   Receive fd for 1
 26829:   Restoring fd 2 (state -> receive)
 26829:   Receive fd for 2
 26829:   Restoring fd 4 (state -> receive)
 26829: Running delayed unix connections
 26829: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma 
 26829: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma 
Error (cr-restore.c:568): 26825 exited, status=255
 26829: Opening 0x00007fc67b527000-0x00007fc67b6a1000 0x0000000000000000
vma 
Error (cr-restore.c:840): Someone can't be restored
 26829: Opening 0x00007fc67b6a1000-0x00007fc67b8a1000 0x000000000017a000
vma 


=> I applied the following changes to crtools-HEAD-368d7ac/sk-inet.c 
   and it seems to work.  Does this make sense to you??


----------------------------------------------------------------------
# diff -Naurp sk-inet.c_orig sk-inet.c
--- sk-inet.c_orig	2012-08-01 13:39:22.000000000 -0600
+++ sk-inet.c	2012-08-01 19:54:08.000000000 -0600
@@ -407,7 +407,8 @@ int inet_bind(int sk, struct inet_sk_inf
 		struct sockaddr_in6	v6;
 	} addr;
 	int addr_size = 0;
-
+	int result;
+	int optlen;
 
 	memzero(&addr, sizeof(addr));
 	if (ii->ie->family == AF_INET) {
@@ -427,7 +428,16 @@ int inet_bind(int sk, struct inet_sk_inf
 	} else
 		BUG_ON(1);
 
+	optlen = 1;
+  result = setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &optlen,
sizeof(optlen));
+  if (result < 0) {
+		perror("sk-inet");
+		return 0;
+	}
+	pr_info("SO_REUSEADDR issued on sockfd: %d\n", sk);
+
 	if (bind(sk, (struct sockaddr *)&addr, addr_size) == -1) {
+		pr_info("bind on sockfd: %d\n", sk);
 		pr_perror("Can't bind inet socket");
 		return -1;
 	}
----------------------------------------------------------------------



After applying the above patch...and waiting for the persist-timer on 
the previous socket (127.0.0.1:12345) to expire, I then re-ran the test
to get:


# ./run.sh 
+ PORT=12345
+ CLN_PIPE=./clnt_pipe
+ SRV_LOG=./srv.log
+ CLN_LOG=./cln.log
+ DDIR=dump
+ CRTOOLS=../../crtools
++ hexdump -C /dev/urandom
++ head -n 1
+ TEXT='00000000  b6 9d 66 6b 42 a5 b3 6b  b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Building services'
Building services
+ make clean
rm -f cln srv
+ make
cc     cln.c   -o cln
cc     srv.c   -o srv
+ rm -rf dump ./srv.log ./cln.log ./clnt_pipe
+ mkdir dump
+ echo 'Starting server'
Starting server
+ SRV_PID=28889
+ echo 'Starting pipe'
Starting pipe
+ mkfifo ./clnt_pipe
+ setsid ./srv 12345
+ echo 'Starting client'
Starting client
+ CLN_PID=28891
+ exec
+ ./cln 127.0.0.1 12345
+ echo 'Make it run'
Make it run
+ echo '00000000  b6 9d 66 6b 42 a5 b3 6b  b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Suspend server'
Suspend server
+ ../../crtools dump -D dump -o dump.log -t 28889 --tcp-established
-vvvv
+ sleep 5
./run.sh: line 46: 28889 Killed                  setsid ./srv ${PORT} >
${SRV_LOG} 2>&1
+ echo 'Resume server'
Resume server
+ ../../crtools restore -D dump -o restore.log -t 28889 -d
--tcp-established -vvvv --close 3
+ echo 'Make client run again'
Make client run again
+ echo '00000000  b6 9d 66 6b 42 a5 b3 6b  b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Collect results'
Collect results
+ exec
+ wait 28891
+ kill -9 28889



# ll
total 56
-rw-rw-r-- 1 root root 1718 Jul 31 05:48 srv.c
-rw-rw-r-- 1 root root   50 Jul 31 05:48 Makefile
-rw-r--r-- 1 root root 2013 Aug  1 19:19 cln.c
-rwxrwxr-x 1 root root 1104 Aug  1 19:36 run.sh*
-rwxr-xr-x 1 root root 9772 Aug  1 20:03 srv*
-rwxr-xr-x 1 root root 9563 Aug  1 20:03 cln*
-rw-r--r-- 1 root root   66 Aug  1 20:04 srv.log
drwxr-xr-x 2 root root 4096 Aug  1 20:04 dump/
prw-r--r-- 1 root root    0 Aug  1 20:04 clnt_pipe|
-rw-r--r-- 1 root root  156 Aug  1 20:04 cln.log

# cat srv.log 
Binding to port 12345
Waiting for connections
New connection
Done

# cat cln.log 
Connecting to 127.0.0.1:12345
New connection
Read 79 bytes, sending to sock
Checking for 79 bytes
Read 79 bytes, sending to sock
Checking for 79 bytes
Done


=> tcp/dump/restore.log
...
...
 28892:   Restore: family 2 type 1 proto 6 port 12345 state 1 src_addr
127.0.0.1
 28889:     Create fd for 1
 28892: Restoring TCP connection
 28889:       Going to dup 1 into 1
 28889:       Going to dup 1 into 2
 28889:     Wait fdinfo pid=28892 fd=1
 28889:     Send fd 1 to /crtools-fd-28892-1
 28892: Restoring TCP connection id 10 ino 1165ec
 28889:     Wait fdinfo pid=28892 fd=2
 28889:     Send fd 1 to /crtools-fd-28892-2
 28889:   Restoring fd 2 (state -> create)
 28889:   Restoring fd 3 (state -> create)
 28892:   Setting 1 queue seq to 2764575329
 28892:   Setting 2 queue seq to 2514028750
 28889:   Restore: family 2 type 1 proto 6 port 12345 state 10 src_addr
0.0.0.0
 28892: SO_REUSEADDR issued on sockfd: 3
 28889: SO_REUSEADDR issued on sockfd: 3
 28892:   Restoring TCP options
 28892:     Will turn SAK on
 28892:     Will set wscale to 5
 28892:     Will turn timestamps on
 28889:     Create fd for 3
 28892: Will set mss clamp to 16396
 28889:       Going to dup 3 into 3
 28889:   Restoring fd 0 (state -> receive)
 28889:   Restoring fd 1 (state -> receive)
 28889:   Restoring fd 2 (state -> receive)
 28889:   Restoring fd 3 (state -> receive)
 28889: Running delayed unix connections
 28892:     Create fd for 4
 28892:       Going to dup 4 into 4
 28892:   Restoring fd 0 (state -> receive)
 28892:   Receive fd for 0
 28892:   Restoring fd 1 (state -> receive)
 28892:   Receive fd for 1
 28892:   Restoring fd 2 (state -> receive)
 28892:   Receive fd for 2
 28892:   Restoring fd 4 (state -> receive)
 28892: Running delayed unix connections
 28892:     Create fd for 4
 28892:       Going to dup 4 into 4
 28892:   Restoring fd 0 (state -> receive)
 28892:   Receive fd for 0
 28892:   Restoring fd 1 (state -> receive)
 28892:   Receive fd for 1
 28892:   Restoring fd 2 (state -> receive)
 28892:   Receive fd for 2
 28892:   Restoring fd 4 (state -> receive)
 28892: Running delayed unix connections
 28889: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma
 28889: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma
 28889: Opening 0x00007fde07965000-0x00007fde07adf000 0x0000000000000000
vma
 28889: Opening 0x00007fde07adf000-0x00007fde07cdf000 0x000000000017a000
vma
 28889: Opening 0x00007fde07cdf000-0x00007fde07ce3000 0x000000000017a000
vma
 28889: Opening 0x00007fde07ce3000-0x00007fde07ce4000 0x000000000017e000
vma
 28889: Opening 0x00007fde07ce4000-0x00007fde07ce9000 0x0000000000000000
vma
 28889: Opening 0x00007fde07ce9000-0x00007fde07d08000 0x0000000000000000
vma
 28892: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma
 28889: Opening 0x00007fde07ef7000-0x00007fde07efa000 0x0000000000000000
vma
 28892: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma
 28889: Opening 0x00007fde07f05000-0x00007fde07f08000 0x0000000000000000
vma
 28889: Opening 0x00007fde07f08000-0x00007fde07f09000 0x000000000001f000
vma
 28892: Opening 0x00007fde07965000-0x00007fde07adf000 0x0000000000000000
vma
 28889: Opening 0x00007fde07f09000-0x00007fde07f0a000 0x0000000000020000
vma
 28892: Opening 0x00007fde07adf000-0x00007fde07cdf000 0x000000000017a000
vma
 28889: Opening 0x00007fde07f0a000-0x00007fde07f0b000 0x0000000000000000
vma
 28892: Opening 0x00007fde07cdf000-0x00007fde07ce3000 0x000000000017a000
vma
 28889: Opening 0x00007fff011e8000-0x00007fff01209000 0x0000000000000000
vma
 28892: Opening 0x00007fde07ce3000-0x00007fde07ce4000 0x000000000017e000
vma
 28889: Opening 0x00007fff0129c000-0x00007fff0129d000 0x0000000000000000
vma
 28892: Opening 0x00007fde07ce4000-0x00007fde07ce9000 0x0000000000000000
vma
 28889: Restore via sigreturn
 28892: Opening 0x00007fde07ce9000-0x00007fde07d08000 0x0000000000000000
vma
 28892: Opening 0x00007fde07ef7000-0x00007fde07efa000 0x0000000000000000
vma
 28892: Opening 0x00007fde07f05000-0x00007fde07f08000 0x0000000000000000
vma
 28892: Opening 0x00007fde07f08000-0x00007fde07f09000 0x000000000001f000
vma
 28892: Opening 0x00007fde07f09000-0x00007fde07f0a000 0x0000000000020000
vma
 28892: Opening 0x00007fde07f0a000-0x00007fde07f0b000 0x0000000000000000
vma
 28892: Opening 0x00007fff011e8000-0x00007fff01209000 0x0000000000000000
vma
 28892: Opening 0x00007fff0129c000-0x00007fff0129d000 0x0000000000000000
vma
 28892: Restore via sigreturn
 28889: 1 threads require 64K of memory
 28889: Found bootstrap VMA hint at: 0x1000 (needs ~144K)
 28892: 1 threads require 64K of memory
 28892: Found bootstrap VMA hint at: 0x1000 (needs ~144K)
 28889: task_args: 0x3360
task_args->pid: 28892
task_args->nr_threads: 1
task_args->clone_restore_fn: 0x15b3
task_args->thread_args: 0x13870
Restoring EXE
28889: Restored
Restoring EXE
28892: Restored
Unlinking ghosts
  Running iptables [iptables -t filter -D INPUT --protocol tcp --source
127.0.0.1 --sport 44711 --destination 127.0.0.1 --dport 12345 -j DROP]
Unlocked 127.0.0.1:12345 - 127.0.0.1:44711 connection
Go on!!!


=> Does this look as if its (test/tcp/run.sh) working?


Thanking you in advance.
- Dilip Daya.



More information about the CRIU mailing list