[Devel] net: [PATCH VZ9 1/2] zerocopy for unix socket, fixups

Alexey Kuznetsov kuznet at virtuozzo.com
Tue Jan 23 22:33:26 MSK 2024


We do not want to deal with SOCK_SEQPACKET sockets, as noticed by
Pavel Tikhomirov <ptikhomirov at virtuozzo.com>, so SO_ZEROCOPY is
now accepted only on SOCK_STREAM unix sockets.

The fallback for occasional splicing of zerocopied pages did not
work and returned EINVAL. This is not essential, as we do not use it,
but tests revealed the problem, so repair it.

vstorage-specific note: we will soon enable zerocopy on the server
side and will have to choose between zerocopy at the sender
and splice at the receiver. The kernel will not be affected.
Still, we should think about how to save CPU on both server
and client; this is not impossible, but not straightforward
at all.

https://pmc.acronis.work/browse/VSTOR-79527

Signed-off-by: Alexey Kuznetsov <kuznet at acronis.com>
---
 net/core/sock.c    |  2 +-
 net/unix/af_unix.c | 31 ++++++++++++++++++++++++-------
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index fe8e102..9887331 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1406,7 +1406,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
 			       sk->sk_protocol == IPPROTO_UDP)))
 				ret = -EOPNOTSUPP;
 		} else if (sk->sk_family == PF_UNIX) {
-			if (sk->sk_type == SOCK_DGRAM)
+			if (sk->sk_type != SOCK_STREAM)
 				ret = -EOPNOTSUPP;
 		} else if (sk->sk_family != PF_RDS) {
 			ret = -EOPNOTSUPP;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e32fe47..9313de3 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2849,16 +2849,33 @@ static int unix_stream_splice_actor(struct sk_buff *skb,
 {
 	/* Zerocopy pages cannot be spliced, alas. It looks like splice interface
 	 * gives no way to notify about actual page consumption. So, we have to copy.
-	 * This path is not going be legit, sender will be notified and will stop zerocopying.
+	 * This path is not going to be used: sender and receiver should agree on
+	 * the protocol a priori, or the sender will be notified with
+	 * SO_EE_CODE_ZEROCOPY_COPIED to stop zerocopying.
 	 */
-	int err = skb_orphan_frags_rx(skb, GFP_KERNEL);
+	int err = 0;
+	struct sk_buff *__skb = skb;
+
+	if (skb_zcopy(skb)) {
+		/* skb is always shared, unfortunately. */
+		if (skb_shared(skb)) {
+			__skb = skb_clone(skb, GFP_KERNEL);
+			if (!__skb)
+				return -ENOMEM;
+		}
+		err = skb_orphan_frags_rx(__skb, GFP_KERNEL);
+		if (err)
+			goto out;
+	}
 
-	if (err)
-		return err;
+	err = skb_splice_bits(__skb, state->socket->sk,
+			      UNIXCB(__skb).consumed + skip,
+			      state->pipe, chunk, state->splice_flags);
 
-	return skb_splice_bits(skb, state->socket->sk,
-			       UNIXCB(skb).consumed + skip,
-			       state->pipe, chunk, state->splice_flags);
+out:
+	if (skb != __skb)
+		kfree_skb(__skb);
+	return err;
 }
 
 static ssize_t unix_stream_splice_read(struct socket *sock,  loff_t *ppos,
-- 
1.8.3.1


