[CRIU] crtools from git tree - Error (sk-inet.c:443): Can't bind inet socket: Address already in use
Pavel Emelyanov
xemul at parallels.com
Wed Aug 1 22:39:57 EDT 2012
> => I applied the following changes to crtools-HEAD-368d7ac/sk-inet.c
> and it seems to work. Does this make sense to you??
>
>
> ----------------------------------------------------------------------
> # diff -Naurp sk-inet.c_orig sk-inet.c
> --- sk-inet.c_orig 2012-08-01 13:39:22.000000000 -0600
> +++ sk-inet.c 2012-08-01 19:54:08.000000000 -0600
> @@ -407,7 +407,8 @@ int inet_bind(int sk, struct inet_sk_inf
> struct sockaddr_in6 v6;
> } addr;
> int addr_size = 0;
> -
> + int result;
> + int optlen;
>
> memzero(&addr, sizeof(addr));
> if (ii->ie->family == AF_INET) {
> @@ -427,7 +428,16 @@ int inet_bind(int sk, struct inet_sk_inf
> } else
> BUG_ON(1);
>
> + optlen = 1;
> + result = setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &optlen, sizeof(optlen));
> + if (result < 0) {
> + perror("sk-inet");
> + return 0;
> + }
> + pr_info("SO_REUSEADDR issued on sockfd: %d\n", sk);
> +
> if (bind(sk, (struct sockaddr *)&addr, addr_size) == -1) {
> + pr_info("bind on sockfd: %d\n", sk);
> pr_perror("Can't bind inet socket");
> return -1;
> }
No, this is not correct. The original socket was created without this option,
so the restored one should be too.
I think we're facing a race here: there are two sockets on the same port, the
listener and the established connection. As seen from the logs, the connected
socket gets restored earlier than the listening one:
26829: Restoring TCP connection
26825: Restore: family 2 type 1 proto 6 port 12345 state 10 src_addr
26825: Error (sk-inet.c:443): Can't bind inet socket: Address already in use
Thus the listening one conflicts on bind(). I think the proper fix would be to
set SO_REUSEADDR before bind() and then drop it afterwards.
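A minimal sketch of that ordering, not the actual crtools code: the helper
names set_reuseaddr() and bind_with_temp_reuseaddr() are hypothetical, and
restoring the value read back with getsockopt() (rather than a value taken
from the image) is an assumption made for illustration. It shows only the
set/bind/drop sequence; whether bind() then succeeds also depends on the
options of the other socket holding the port.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: switch SO_REUSEADDR on (1) or off (0) for sk. */
static int set_reuseaddr(int sk, int on)
{
	return setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/*
 * Hypothetical helper: bind sk with SO_REUSEADDR enabled only for the
 * duration of the bind() call, then put the option back to the value
 * the socket had before. Returns 0 on success, -1 on error.
 */
int bind_with_temp_reuseaddr(int sk, const struct sockaddr *addr,
			     socklen_t addr_size)
{
	int saved = 0;
	socklen_t len = sizeof(saved);

	assert(addr != NULL);

	/* Remember the current setting so it can be restored later. */
	if (getsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &saved, &len) < 0)
		return -1;

	if (set_reuseaddr(sk, 1) < 0)
		return -1;

	if (bind(sk, addr, addr_size) < 0) {
		/* Best effort: do not leave the option set on failure. */
		set_reuseaddr(sk, saved);
		return -1;
	}

	/* Drop the option again so the socket matches the original. */
	return set_reuseaddr(sk, saved);
}
```

With something like this, inet_bind() would call the helper in place of the
plain bind(), and the restored socket ends up without SO_REUSEADDR, the same
as the original one.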
> After applying the above patch...and waiting for the persist-timer on
> the previous socket (127.0.0.1:12345) to expire, I then re-ran the test
> to get:
This is strange. Why do you have to wait for the timer to expire? A repaired
socket, on close, just kills itself without going through any post-connected
states.
> # cat srv.log
> Binding to port 12345
> Waiting for connections
> New connection
> Done
>
> # cat cln.log
> Connecting to 127.0.0.1:12345
> New connection
> Read 79 bytes, sending to sock
> Checking for 79 bytes
> Read 79 bytes, sending to sock
> Checking for 79 bytes
> Done
>
>
> => tcp/dump/restore.log
> ...
> ...
> Unlocked 127.0.0.1:12345 - 127.0.0.1:44711 connection
> Go on!!!
>
>
> => Does this look as if its (test/tcp/run.sh) working?
Yes, this means that the test passed OK.
>
> Thanking you in advance.
> - Dilip Daya.
>