[Devel] [PATCH RHEL7 COMMIT] fs/fuse kio: adjust rdma connection parameters - increase retry counts

Konstantin Khorenko khorenko at virtuozzo.com
Fri Aug 18 17:37:23 MSK 2023


The commit is pushed to "branch-rh7-3.10.0-1160.95.1.vz7.210.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-1160.95.1.vz7.210.1
------>
commit 6ffc75884e0b37e0c9efbc454c4ee20500a19fa9
Author: Kui Liu <Kui.Liu at acronis.com>
Date:   Thu Aug 17 15:45:39 2023 +0000

    fs/fuse kio: adjust rdma connection parameters - increase retry counts
    
    In a RoCE network, packet loss and delay due to congestion can happen
    quite often, and we need to tolerate such events. Increase retry_count
    and rnr_retry_count to 7 so that the NIC retries an operation when an
    error happens, instead of returning the error immediately, which causes
    the connection to be aborted.
    
    Based on testing, I believe a value of 0 means no retry. With both
    values set to 0, we kept receiving IBV_WC_RETRY_EXC_ERR for RDMA READ
    operations, which aborts the connection. After changing both to 7
    (a value commonly used in several examples from the RDMA Core
    library), the error goes away and we at least have a stable
    connection in such a RoCE network.
    
    Signed-off-by: Liu Kui <Kui.Liu at acronis.com>
    Acked-by: Alexey Kuznetsov <kuznet at acronis.com>
    
    Feature: vStorage
---
 fs/fuse/kio/pcs/pcs_rdma_conn.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/kio/pcs/pcs_rdma_conn.c b/fs/fuse/kio/pcs/pcs_rdma_conn.c
index 4db903151de0..7339b1466d3a 100644
--- a/fs/fuse/kio/pcs/pcs_rdma_conn.c
+++ b/fs/fuse/kio/pcs/pcs_rdma_conn.c
@@ -44,8 +44,8 @@ conn_param_init(struct rdma_conn_param *cp, struct pcs_rdmaio_conn_req *cr,
 	cp->initiator_depth     = min_t(int, U8_MAX, cmid->device->attrs.max_qp_init_rd_atom);
 
 	cp->flow_control        = 1; /* does not matter */
-	cp->retry_count         = 0; /* # retransmissions when no ACK received */
-	cp->rnr_retry_count     = 0; /* # RNR retransmissions */
+	cp->retry_count         = 7; /* # retransmissions when no ACK received */
+	cp->rnr_retry_count     = 7; /* # RNR retransmissions */
 }
 
 static int pcs_rdma_cm_event_handler(struct rdma_cm_id *cmid,
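
For readers unfamiliar with the two counters, here is a minimal userspace sketch of the before/after settings. It is not part of the patch and not the driver code: `struct retry_params`, `RETRY_FIELD_MAX`, `clamp_retry()`, `retry_params_init()` and `fails_on_first_error()` are hypothetical names that only mirror the two fields touched in `conn_param_init()`. The upper bound of 7 assumes, as I understand the connection request format, that both counters are carried as 3-bit values; for `rnr_retry_count`, 7 is conventionally treated as "retry indefinitely".

```c
#include <stdint.h>

/* Hypothetical stand-in for the two retry fields of struct rdma_conn_param. */
struct retry_params {
	uint8_t retry_count;     /* # retransmissions when no ACK received */
	uint8_t rnr_retry_count; /* # RNR retransmissions */
};

/* Assumption: both counters travel as 3-bit values, so 7 is the maximum. */
#define RETRY_FIELD_MAX 7u

/* Clamp a requested retry count into the representable range 0..7. */
static uint8_t clamp_retry(unsigned int requested)
{
	return requested > RETRY_FIELD_MAX ? (uint8_t)RETRY_FIELD_MAX
					   : (uint8_t)requested;
}

/* Fill both counters, clamping each one independently. */
static void retry_params_init(struct retry_params *rp,
			      unsigned int retry, unsigned int rnr_retry)
{
	rp->retry_count = clamp_retry(retry);
	rp->rnr_retry_count = clamp_retry(rnr_retry);
}

/*
 * Return 1 when either counter is zero, i.e. the first missing ACK or
 * RNR NAK would be reported as an error instead of being retried.
 */
static int fails_on_first_error(const struct retry_params *rp)
{
	return rp->retry_count == 0 || rp->rnr_retry_count == 0;
}
```

With the old values, `retry_params_init(&rp, 0, 0)` leaves `fails_on_first_error(&rp)` true, which matches the IBV_WC_RETRY_EXC_ERR behaviour described in the commit message; with `retry_params_init(&rp, 7, 7)` it is false.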

