sunrpc: don't immediately retransmit on seqno miss

[ Upstream commit fadc0f3bb2de8c570ced6d9c1f97222213d93140 ]

RFC 2203 requires that retransmitted messages use a new GSS sequence
number, but the same XID. This means that if the server is just slow
(e.g. overloaded), the client might receive a response using an older
seqno than the one it has recorded.
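
As a rough illustration of the rule above, the following sketch models a
call whose XID stays fixed while every (re)transmission takes a fresh
sequence number. The names struct gss_call, gss_stamp_seqno() and
gss_reply_matches() are hypothetical and do not exist in the kernel; the
real client verifies a checksum computed over the sequence number rather
than comparing integers directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one RPC call protected by RPCSEC_GSS. */
struct gss_call {
	uint32_t xid;	/* stays the same across retransmissions */
	uint32_t seqno;	/* replaced on every (re)transmission */
};

/* Stamp the call with the next unused sequence number before sending. */
void gss_stamp_seqno(struct gss_call *call, uint32_t *next_seqno)
{
	call->seqno = (*next_seqno)++;
}

/*
 * Simplified reply check: the reply is only accepted if it answers the
 * most recent transmission. A reply to an older transmission carries
 * the right XID but a stale seqno, so it does not match.
 */
bool gss_reply_matches(const struct gss_call *call,
		       uint32_t reply_xid, uint32_t reply_seqno)
{
	return reply_xid == call->xid && reply_seqno == call->seqno;
}
```

After two transmissions the call carries seqno 2, so a late reply to the
first transmission (seqno 1) is rejected even though its XID is correct.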

Currently, Linux's client immediately retransmits in this case. However,
this leads to a lot of wasted retransmits until the server eventually
responds faster than the client can resend.

Client -> SEQ 1 -> Server
Client -> SEQ 2 -> Server
Client <- SEQ 1 <- Server (misses, expecting seqno = 2)
Client -> SEQ 3 -> Server (immediate retransmission on miss)
Client <- SEQ 2 <- Server (misses, expecting seqno = 3)
Client -> SEQ 4 -> Server (immediate retransmission on miss)
... and so on ...

Instead, ignore replies whose checksum fails because of a seqno
mismatch, and rely on the usual timeout behavior for retransmission
rather than retransmitting immediately.
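
The difference between the two behaviors can be sketched with a toy
discrete-time model. This is not kernel code: count_transmits(), enum
miss_policy and MAX_XMITS are names invented for the illustration, time
is measured in abstract ticks, and the client is assumed to accept only
a reply to its most recent transmission.

```c
#include <stddef.h>

#define MAX_XMITS 64	/* arbitrary cap for this toy model */

enum miss_policy { RETRANSMIT_ON_MISS, WAIT_FOR_TIMEOUT };

/*
 * delays[i] is the number of ticks the server needs to answer the
 * (i+1)-th transmission; n is the transmission budget. Returns the
 * number of transmissions made before a reply matched the client's
 * current seqno, or -1 on invalid input.
 */
int count_transmits(enum miss_policy policy, const int *delays,
		    size_t n, int timeout)
{
	int sent_at[MAX_XMITS];
	size_t sent = 0;
	int last_send = 0;

	if (n == 0 || n > MAX_XMITS || timeout < 1)
		return -1;
	for (size_t i = 0; i < n; i++)
		if (delays[i] < 1)
			return -1;

	sent_at[sent++] = 0;	/* first transmission, seqno 1, at tick 0 */

	for (int t = 1; ; t++) {
		size_t in_flight = sent;

		/* Handle every reply that arrives at tick t. */
		for (size_t i = 0; i < in_flight; i++) {
			if (sent_at[i] + delays[i] != t)
				continue;
			if (i + 1 == sent)
				return (int)sent;	/* seqno matches */
			/* Stale seqno: old behavior resends at once. */
			if (policy == RETRANSMIT_ON_MISS && sent < n) {
				sent_at[sent++] = t;
				last_send = t;
			}
		}

		/* Ordinary timeout-driven retransmission. */
		if (t - last_send >= timeout && sent < n) {
			sent_at[sent++] = t;
			last_send = t;
		}
	}
}
```

With a timeout of 5 ticks, one slow reply (7 ticks) followed by fast ones
(3 ticks each), and a budget of 8 transmissions, RETRANSMIT_ON_MISS uses
the entire budget, while WAIT_FOR_TIMEOUT finishes after 2 transmissions.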

Signed-off-by: Nikhil Jha <njha@janestreet.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit c3616dfddf1d484839ea853a25063ab7452e44e9)
Nikhil Jha 2025-03-19 13:02:40 -04:00 committed by Wentao Guan
parent e8d9bed1c3
commit 2eb8aa18d5
1 changed file with 7 additions and 2 deletions

@@ -2733,8 +2733,13 @@ out_verifier:
 	case -EPROTONOSUPPORT:
 		goto out_err;
 	case -EACCES:
-		/* Re-encode with a fresh cred */
-		fallthrough;
+		/* possible RPCSEC_GSS out-of-sequence event (RFC2203),
+		 * reset recv state and keep waiting, don't retransmit
+		 */
+		task->tk_rqstp->rq_reply_bytes_recvd = 0;
+		task->tk_status = xprt_request_enqueue_receive(task);
+		task->tk_action = call_transmit_status;
+		return -EBADMSG;
 	default:
 		goto out_garbage;
 	}