[CRIU] [PATCH] tcp: Try harder to restore recv queue

Pavel Emelyanov xemul at parallels.com
Thu Dec 24 11:50:46 PST 2015


On restore we try to put data back into the recv queue in quite
big chunks. However, in repair mode the kernel doesn't try hard
to split the data into packets for this queue and just allocates
a linear skb of the given size. If the size is moderately big,
the allocation is likely to fail, since slab doesn't reliably
allocate memory over 4k.

So, when sending a big chunk to the recv queue fails with
ENOMEM, halve the chunk and try again, as long as the chunk
is still above 1k.

Signed-off-by: Pavel Emelyanov <xemul at parallels.com>
---
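Not part of the patch: a minimal standalone sketch of the
shrink-and-retry loop, for anyone who wants to play with the
logic outside of CRIU. The names send_shrinking, send_fn,
fake_sink and fake_send are hypothetical and do not exist in
sk-tcp.c; fake_send only imitates a kernel that refuses a
linear allocation above some limit, it is not how the kernel
actually decides.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical sender: returns bytes consumed, or -1 with errno set. */
typedef ssize_t (*send_fn)(void *ctx, const char *buf, size_t len);

/*
 * Push len bytes through fn in chunks of at most max_chunk bytes.
 * On ENOMEM, halve max_chunk and retry while it is still above
 * min_chunk. Returns 0 on success, -1 on unrecoverable failure.
 */
static int send_shrinking(send_fn fn, void *ctx, const char *buf,
			  size_t len, size_t max_chunk, size_t min_chunk)
{
	size_t off = 0;

	assert(max_chunk > 0);

	while (len) {
		size_t chunk = len < max_chunk ? len : max_chunk;
		ssize_t ret;

		errno = 0;
		ret = fn(ctx, buf + off, chunk);
		if (ret <= 0) {
			if (errno == ENOMEM && max_chunk > min_chunk) {
				/* Same chunk again, but half the size */
				max_chunk >>= 1;
				continue;
			}
			return -1;
		}
		off += (size_t)ret;
		len -= (size_t)ret;
	}
	return 0;
}

/* Stand-in for the kernel: refuses any chunk above limit bytes. */
struct fake_sink {
	size_t limit;
	size_t received;
	unsigned int failures;
};

static ssize_t fake_send(void *ctx, const char *buf, size_t len)
{
	struct fake_sink *s = ctx;

	(void)buf;
	if (len > s->limit) {
		s->failures++;
		errno = ENOMEM;
		return -1;
	}
	s->received += len;
	return (ssize_t)len;
}
```

With a sink limit of 4096 and an initial max_chunk of 65536,
the loop fails a few times, settles on 4096-byte chunks and
then delivers everything. With a limit below min_chunk it
gives up, which matches the patch falling through to
pr_perror() once max_chunk is no longer above 1024.
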
 sk-tcp.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/sk-tcp.c b/sk-tcp.c
index 9f3bf0d..b5e66b0 100644
--- a/sk-tcp.c
+++ b/sk-tcp.c
@@ -461,7 +461,7 @@ static int restore_tcp_seqs(int sk, TcpStreamEntry *tse)
 
 static int __send_tcp_queue(int sk, int queue, u32 len, struct cr_img *img)
 {
-	int ret, err = -1;
+	int ret, err = -1, max_chunk;
 	int off;
 	char *buf;
 
@@ -472,17 +472,32 @@ static int __send_tcp_queue(int sk, int queue, u32 len, struct cr_img *img)
 	if (read_img_buf(img, buf, len) < 0)
 		goto err;
 
+	max_chunk = (queue == TCP_RECV_QUEUE ? kdat.tcp_max_rshare : len);
 	off = 0;
 	while (len) {
 		int chunk = len;
 
-		if (queue == TCP_RECV_QUEUE && len > kdat.tcp_max_rshare)
-			chunk = kdat.tcp_max_rshare;
+		if (chunk > max_chunk)
+			chunk = max_chunk;
 
 		ret = send(sk, buf + off, chunk, 0);
 		if (ret <= 0) {
-			pr_perror("Can't restore %d queue data (%d), want (%d:%d)",
-				  queue, ret, chunk, len);
+			if ((queue == TCP_RECV_QUEUE) && (max_chunk > 1024) && (errno == ENOMEM)) {
+				/*
+				 * When restoring recv queue in repair mode
+				 * kernel doesn't try hard and just allocates
+				 * a linear skb with the size we pass to the
+				 * system call. Thus, if the size is too big
+				 * for slab allocator, the send just fails
+				 * with ENOMEM. Try smaller chunk, hopefully
+				 * there's still enough memory in the system.
+				 */
+				max_chunk >>= 1;
+				continue;
+			}
+
+			pr_perror("Can't restore %d queue data (%d), want (%d:%d:%d)",
+				  queue, ret, chunk, len, max_chunk);
 			goto err;
 		}
 		off += ret;
-- 
1.9.3


