[CRIU] crtools from git tree - Error (sk-inet.c:443): Can't
bind inet socket: Address already in use
Dilip Daya
dilip.daya at hp.com
Wed Aug 1 22:16:27 EDT 2012
On Tue, 2012-07-31 at 14:03 +0400, Pavel Emelyanov wrote:
> > => Restore is looking for "tcp-stream-10.img", but actual file name is
> > "tcp-stream-94e4e.img".
> >
> > I've tried multiple attempts but restore always looks for
> > "tcp-stream-10.img".
> >
> > Would this be a bug?
>
> Yup :( Was accidentally broken by 424a4adb6. This patch fixes it
>
> http://git.criu.org/?p=crtools.git;a=commitdiff;h=4ee52f340392254de0e80d2b040927bd10fb18d4
>
> But I must tell, that TCP migration is still not perfect :( I see random
> failures from time to time. Plus, the TCP test itself is not yet polished
> (you'll have to look into .log files after that to check that the connection
> itself was working OK while cpt/rst) I planned to concentrate on this a
> little bit later, and if you can help that would be greatly appreciated :)
>
> Thanks for looking into it :)
I took a snapshot from the crtools git tree on July-31st:
=> crtools-HEAD-368d7ac/test/tcp
# ./run.sh
+ PORT=12345
+ CLN_PIPE=./clnt_pipe
+ SRV_LOG=./srv.log
+ CLN_LOG=./cln.log
+ DDIR=dump
+ CRTOOLS=../../crtools
++ hexdump -C /dev/urandom
++ head -n 1
+ TEXT='00000000 ac e3 80 65 9f c6 76 2f 12 9f fb f3 8d a3 aa 9b
|...e..v/........|'
+ echo 'Building services'
Building services
+ make clean
rm -f cln srv
+ make
cc cln.c -o cln
cc srv.c -o srv
+ rm -rf dump ./srv.log ./cln.log ./clnt_pipe
+ mkdir dump
+ echo 'Starting server'
Starting server
+ SRV_PID=26825
+ echo 'Starting pipe'
Starting pipe
+ mkfifo ./clnt_pipe
+ setsid ./srv 12345
+ echo 'Starting client'
Starting client
+ CLN_PID=26827
+ exec
+ ./cln 127.0.0.1 12345
+ echo 'Make it run'
Make it run
+ echo '00000000 ac e3 80 65 9f c6 76 2f 12 9f fb f3 8d a3 aa 9b
|...e..v/........|'
+ echo 'Suspend server'
Suspend server
+ ../../crtools dump -D dump -o dump.log -t 26825 --tcp-established
-vvvv
./run.sh: line 45: 26825 Killed setsid ./srv ${PORT} >
${SRV_LOG} 2>&1
+ sleep 5
+ echo 'Resume server'
Resume server
+ ../../crtools restore -D dump -o restore.log -t 26825 -d
--tcp-established -vvvv --close 3
+ fail 'Fail to restore server'
+ echo 'In function: fail'
In function: fail
+ echo 'Fail to restore server'
Fail to restore server
+ kill -9 26827
+ kill -9 26825
./run.sh: line 40: kill: (26825) - No such process
+ exit 1
# ll
total 48
-rw-rw-r-- 1 root root 1718 Jul 31 05:48 srv.c
-rw-rw-r-- 1 root root 50 Jul 31 05:48 Makefile
-rw-r--r-- 1 root root 2013 Aug 1 19:19 cln.c
-rwxrwxr-x 1 root root 1104 Aug 1 19:36 run.sh*
-rw-r--r-- 1 root root 0 Aug 1 19:49 srv.log
-rwxr-xr-x 1 root root 9772 Aug 1 19:49 srv*
prw-r--r-- 1 root root 0 Aug 1 19:49 clnt_pipe|
-rw-r--r-- 1 root root 0 Aug 1 19:49 cln.log
-rwxr-xr-x 1 root root 9563 Aug 1 19:49 cln*
drwxr-xr-x 2 root root 4096 Aug 1 19:49 dump/
=> test/tcp/dump/restore.log
...
...
26829: Restoring TCP connection
26825: Send fd 1 to /crtools-fd-26829-1
26829: Restoring TCP connection id 10 ino 10bf50
26825: Wait fdinfo pid=26829 fd=2
26825: Send fd 1 to /crtools-fd-26829-2
26829: Setting 1 queue seq to 2999954635
26825: Restoring fd 2 (state -> create)
26829: Setting 2 queue seq to 2374915616
26825: Restoring fd 3 (state -> create)
26825: Restore: family 2 type 1 proto 6 port 12345 state 10 src_addr
0.0.0.0
26829: Restoring TCP options
26829: Will turn SAK on
26825: bind on sockfd: 3
26829: Will set wscale to 5
26829: Will turn timestamps on
26829: Will set mss clamp to 16396
26825: Error (sk-inet.c:443): Can't bind inet socket: Address already
in use
26829: Create fd for 4
26829: Going to dup 4 into 4
26829: Restoring fd 0 (state -> receive)
26829: Receive fd for 0
26829: Restoring fd 1 (state -> receive)
26829: Receive fd for 1
26829: Restoring fd 2 (state -> receive)
26829: Receive fd for 2
26829: Restoring fd 4 (state -> receive)
26829: Running delayed unix connections
26829: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma
26829: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma
Error (cr-restore.c:568): 26825 exited, status=255
26829: Opening 0x00007fc67b527000-0x00007fc67b6a1000 0x0000000000000000
vma
Error (cr-restore.c:840): Someone can't be restored
26829: Opening 0x00007fc67b6a1000-0x00007fc67b8a1000 0x000000000017a000
vma
=> I applied the following changes to crtools-HEAD-368d7ac/sk-inet.c
and it seems to work. Does this make sense to you??
----------------------------------------------------------------------
# diff -Naurp sk-inet.c_orig sk-inet.c
--- sk-inet.c_orig 2012-08-01 13:39:22.000000000 -0600
+++ sk-inet.c 2012-08-01 19:54:08.000000000 -0600
@@ -407,7 +407,8 @@ int inet_bind(int sk, struct inet_sk_inf
struct sockaddr_in6 v6;
} addr;
int addr_size = 0;
-
+ int result;
+ int optlen;
memzero(&addr, sizeof(addr));
if (ii->ie->family == AF_INET) {
@@ -427,7 +428,16 @@ int inet_bind(int sk, struct inet_sk_inf
} else
BUG_ON(1);
+ optlen = 1;
+ result = setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &optlen,
sizeof(optlen));
+ if (result < 0) {
+ perror("sk-inet");
+ return 0;
+ }
+ pr_info("SO_REUSEADDR issued on sockfd: %d\n", sk);
+
if (bind(sk, (struct sockaddr *)&addr, addr_size) == -1) {
+ pr_info("bind on sockfd: %d\n", sk);
pr_perror("Can't bind inet socket");
return -1;
}
----------------------------------------------------------------------
After applying the above patch...and waiting for the persist-timer on
the previous socket (127.0.0.1:12345) to expire, I then re-ran the test
to get:
# ./run.sh
+ PORT=12345
+ CLN_PIPE=./clnt_pipe
+ SRV_LOG=./srv.log
+ CLN_LOG=./cln.log
+ DDIR=dump
+ CRTOOLS=../../crtools
++ hexdump -C /dev/urandom
++ head -n 1
+ TEXT='00000000 b6 9d 66 6b 42 a5 b3 6b b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Building services'
Building services
+ make clean
rm -f cln srv
+ make
cc cln.c -o cln
cc srv.c -o srv
+ rm -rf dump ./srv.log ./cln.log ./clnt_pipe
+ mkdir dump
+ echo 'Starting server'
Starting server
+ SRV_PID=28889
+ echo 'Starting pipe'
Starting pipe
+ mkfifo ./clnt_pipe
+ setsid ./srv 12345
+ echo 'Starting client'
Starting client
+ CLN_PID=28891
+ exec
+ ./cln 127.0.0.1 12345
+ echo 'Make it run'
Make it run
+ echo '00000000 b6 9d 66 6b 42 a5 b3 6b b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Suspend server'
Suspend server
+ ../../crtools dump -D dump -o dump.log -t 28889 --tcp-established
-vvvv
+ sleep 5
./run.sh: line 46: 28889 Killed setsid ./srv ${PORT} >
${SRV_LOG} 2>&1
+ echo 'Resume server'
Resume server
+ ../../crtools restore -D dump -o restore.log -t 28889 -d
--tcp-established -vvvv --close 3
+ echo 'Make client run again'
Make client run again
+ echo '00000000 b6 9d 66 6b 42 a5 b3 6b b3 93 c2 01 4e 2c c7 58
|..fkB..k....N,.X|'
+ echo 'Collect results'
Collect results
+ exec
+ wait 28891
+ kill -9 28889
# ll
total 56
-rw-rw-r-- 1 root root 1718 Jul 31 05:48 srv.c
-rw-rw-r-- 1 root root 50 Jul 31 05:48 Makefile
-rw-r--r-- 1 root root 2013 Aug 1 19:19 cln.c
-rwxrwxr-x 1 root root 1104 Aug 1 19:36 run.sh*
-rwxr-xr-x 1 root root 9772 Aug 1 20:03 srv*
-rwxr-xr-x 1 root root 9563 Aug 1 20:03 cln*
-rw-r--r-- 1 root root 66 Aug 1 20:04 srv.log
drwxr-xr-x 2 root root 4096 Aug 1 20:04 dump/
prw-r--r-- 1 root root 0 Aug 1 20:04 clnt_pipe|
-rw-r--r-- 1 root root 156 Aug 1 20:04 cln.log
# cat srv.log
Binding to port 12345
Waiting for connections
New connection
Done
# cat cln.log
Connecting to 127.0.0.1:12345
New connection
Read 79 bytes, sending to sock
Checking for 79 bytes
Read 79 bytes, sending to sock
Checking for 79 bytes
Done
=> tcp/dump/restore.log
...
...
28892: Restore: family 2 type 1 proto 6 port 12345 state 1 src_addr
127.0.0.1
28889: Create fd for 1
28892: Restoring TCP connection
28889: Going to dup 1 into 1
28889: Going to dup 1 into 2
28889: Wait fdinfo pid=28892 fd=1
28889: Send fd 1 to /crtools-fd-28892-1
28892: Restoring TCP connection id 10 ino 1165ec
28889: Wait fdinfo pid=28892 fd=2
28889: Send fd 1 to /crtools-fd-28892-2
28889: Restoring fd 2 (state -> create)
28889: Restoring fd 3 (state -> create)
28892: Setting 1 queue seq to 2764575329
28892: Setting 2 queue seq to 2514028750
28889: Restore: family 2 type 1 proto 6 port 12345 state 10 src_addr
0.0.0.0
28892: SO_REUSEADDR issued on sockfd: 3
28889: SO_REUSEADDR issued on sockfd: 3
28892: Restoring TCP options
28892: Will turn SAK on
28892: Will set wscale to 5
28892: Will turn timestamps on
28889: Create fd for 3
28892: Will set mss clamp to 16396
28889: Going to dup 3 into 3
28889: Restoring fd 0 (state -> receive)
28889: Restoring fd 1 (state -> receive)
28889: Restoring fd 2 (state -> receive)
28889: Restoring fd 3 (state -> receive)
28889: Running delayed unix connections
28892: Create fd for 4
28892: Going to dup 4 into 4
28892: Restoring fd 0 (state -> receive)
28892: Receive fd for 0
28892: Restoring fd 1 (state -> receive)
28892: Receive fd for 1
28892: Restoring fd 2 (state -> receive)
28892: Receive fd for 2
28892: Restoring fd 4 (state -> receive)
28892: Running delayed unix connections
28892: Create fd for 4
28892: Going to dup 4 into 4
28892: Restoring fd 0 (state -> receive)
28892: Receive fd for 0
28892: Restoring fd 1 (state -> receive)
28892: Receive fd for 1
28892: Restoring fd 2 (state -> receive)
28892: Receive fd for 2
28892: Restoring fd 4 (state -> receive)
28892: Running delayed unix connections
28889: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma
28889: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma
28889: Opening 0x00007fde07965000-0x00007fde07adf000 0x0000000000000000
vma
28889: Opening 0x00007fde07adf000-0x00007fde07cdf000 0x000000000017a000
vma
28889: Opening 0x00007fde07cdf000-0x00007fde07ce3000 0x000000000017a000
vma
28889: Opening 0x00007fde07ce3000-0x00007fde07ce4000 0x000000000017e000
vma
28889: Opening 0x00007fde07ce4000-0x00007fde07ce9000 0x0000000000000000
vma
28889: Opening 0x00007fde07ce9000-0x00007fde07d08000 0x0000000000000000
vma
28892: Opening 0x0000000000400000-0x0000000000401000 0x0000000000000000
vma
28889: Opening 0x00007fde07ef7000-0x00007fde07efa000 0x0000000000000000
vma
28892: Opening 0x0000000000601000-0x0000000000602000 0x0000000000001000
vma
28889: Opening 0x00007fde07f05000-0x00007fde07f08000 0x0000000000000000
vma
28889: Opening 0x00007fde07f08000-0x00007fde07f09000 0x000000000001f000
vma
28892: Opening 0x00007fde07965000-0x00007fde07adf000 0x0000000000000000
vma
28889: Opening 0x00007fde07f09000-0x00007fde07f0a000 0x0000000000020000
vma
28892: Opening 0x00007fde07adf000-0x00007fde07cdf000 0x000000000017a000
vma
28889: Opening 0x00007fde07f0a000-0x00007fde07f0b000 0x0000000000000000
vma
28892: Opening 0x00007fde07cdf000-0x00007fde07ce3000 0x000000000017a000
vma
28889: Opening 0x00007fff011e8000-0x00007fff01209000 0x0000000000000000
vma
28892: Opening 0x00007fde07ce3000-0x00007fde07ce4000 0x000000000017e000
vma
28889: Opening 0x00007fff0129c000-0x00007fff0129d000 0x0000000000000000
vma
28892: Opening 0x00007fde07ce4000-0x00007fde07ce9000 0x0000000000000000
vma
28889: Restore via sigreturn
28892: Opening 0x00007fde07ce9000-0x00007fde07d08000 0x0000000000000000
vma
28892: Opening 0x00007fde07ef7000-0x00007fde07efa000 0x0000000000000000
vma
28892: Opening 0x00007fde07f05000-0x00007fde07f08000 0x0000000000000000
vma
28892: Opening 0x00007fde07f08000-0x00007fde07f09000 0x000000000001f000
vma
28892: Opening 0x00007fde07f09000-0x00007fde07f0a000 0x0000000000020000
vma
28892: Opening 0x00007fde07f0a000-0x00007fde07f0b000 0x0000000000000000
vma
28892: Opening 0x00007fff011e8000-0x00007fff01209000 0x0000000000000000
vma
28892: Opening 0x00007fff0129c000-0x00007fff0129d000 0x0000000000000000
vma
28892: Restore via sigreturn
28889: 1 threads require 64K of memory
28889: Found bootstrap VMA hint at: 0x1000 (needs ~144K)
28892: 1 threads require 64K of memory
28892: Found bootstrap VMA hint at: 0x1000 (needs ~144K)
28889: task_args: 0x3360
task_args->pid: 28892
task_args->nr_threads: 1
task_args->clone_restore_fn: 0x15b3
task_args->thread_args: 0x13870
Restoring EXE
28889: Restored
Restoring EXE
28892: Restored
Unlinking ghosts
Running iptables [iptables -t filter -D INPUT --protocol tcp --source
127.0.0.1 --sport 44711 --destination 127.0.0.1 --dport 12345 -j DROP]
Unlocked 127.0.0.1:12345 - 127.0.0.1:44711 connection
Go on!!!
=> Does this look as if its (test/tcp/run.sh) working?
Thanking you in advance.
- Dilip Daya.
More information about the CRIU
mailing list