[CRIU] [PATCH] net: Do not toggle TCP_REPAIR while restoring TCP send queues
Amey Deshpande
ameyd at google.com
Fri Feb 6 11:10:12 PST 2015
For an established TCP connection, the send queue is restored in two
steps: in step (1), we retransmit the data that was sent before but not
yet acknowledged, and in step (2), we transmit the data that was queued
but never sent out. Without this patch, the TCP_REPAIR option is
disabled before step (2) and re-enabled after it.
If the amount of data to be sent in step (2) is large, the TCP_REPAIR
flag on the socket can remain off for some time (on the order of
milliseconds). If listen() is called on another socket bound to the
same port during this time window, it fails, because turning TCP_REPAIR
off clears the SO_REUSEADDR flag on the socket.
There are several possible ways to prevent this problem:
- The simplest option is to *not* toggle the TCP_REPAIR option while
  restoring the TCP queues.
- Another way would be to explicitly enable SO_REUSEADDR on the
  socket after turning TCP_REPAIR off. This still leaves a small time
  window, so the race could still occur.
- A more involved solution would use a mutex per port number, so
  that a listen() on a port number does not happen while SO_REUSEADDR
  is off for another socket on the same port.
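
For illustration only, the second option could look roughly like the
sketch below. Both helper names are hypothetical and are not part of
CRIU or of this patch; the sketch only uses the standard getsockopt()
and setsockopt() calls. It does not close the race described above: a
listen() on another socket can still run between the moment TCP_REPAIR
is turned off and the moment SO_REUSEADDR is set again.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper (not a CRIU function): report whether
 * SO_REUSEADDR is currently set on the socket.
 * Returns 1 if set, 0 if clear, -1 on error.
 */
static int reuseaddr_is_set(int sk)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
		return -1;

	return val != 0;
}

/*
 * Hypothetical helper (not a CRIU function): set SO_REUSEADDR on
 * the socket. Under option two this would be called right after
 * TCP_REPAIR is turned off. Returns 0 on success, -1 on error.
 */
static int reuseaddr_set(int sk)
{
	int yes = 1;

	return setsockopt(sk, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
}
```

A caller would check reuseaddr_is_set(sk) after leaving repair mode
and call reuseaddr_set(sk) if the flag was cleared.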
This patch takes the first option: it no longer toggles the TCP_REPAIR
option while restoring the TCP send queues.
Signed-off-by: Amey Deshpande <ameyd at google.com>
---
sk-tcp.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/sk-tcp.c b/sk-tcp.c
index 3f1556d..02ed1f9 100644
--- a/sk-tcp.c
+++ b/sk-tcp.c
@@ -534,11 +534,8 @@ static int restore_tcp_queues(int sk, TcpStreamEntry *tse, struct cr_img *img)
* they can be restored without any tricks.
*/
len = tse->unsq_len;
- tcp_repair_off(sk);
if (len && __send_tcp_queue(sk, TCP_SEND_QUEUE, len, img))
return -1;
- if (tcp_repair_on(sk))
- return -1;
return 0;
}
--
2.2.0.rc0.207.ga3a616c