On Thu, Dec 04, 2025 at 08:45:39AM +0100, Stefano Brivio wrote:
> ...under two conditions:
>
> - the remote peer is advertising a bigger value to us, meaning that a
>   bigger sending buffer is likely to benefit throughput, AND

I think this condition is redundant: if the remote peer is advertising
less, we'll clamp new_wnd_to_tap to that value anyway.

> - this is not a short-lived connection, where the latency cost of
>   retransmissions would be otherwise unacceptable.
>
> By doing this, we can reliably trigger TCP buffer size auto-tuning (as
> long as it's available) on bulk data transfers.
>
> Signed-off-by: Stefano Brivio
> ---
>  tcp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tcp.c b/tcp.c
> index 2220059..454df69 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -353,6 +353,13 @@ enum {
>  #define LOW_RTT_TABLE_SIZE		8
>  #define LOW_RTT_THRESHOLD		10 /* us */
>
> +/* Try to avoid retransmissions to improve latency on short-lived connections */
> +#define SHORT_CONN_BYTES		(16ULL * 1024 * 1024)
> +
> +/* Temporarily exceed available sending buffer to force TCP auto-tuning */
> +#define SNDBUF_BOOST_FACTOR		150 /* % */
> +#define SNDBUF_BOOST(x)			((x) * SNDBUF_BOOST_FACTOR / 100)

For the short term, the fact this works empirically is enough. For
the longer term, it would be nice to have a better understanding of
what this "overcommit" amount is actually estimating.

I think what we're looking for is an estimate of the number of bytes
that will have left the buffer by the time the guest gets back to us.
So:

	*

Alas, I don't see a way to estimate either of those from the
information we already track - we'd need additional bookkeeping.

>  #define ACK_IF_NEEDED	0		/* See tcp_send_flag() */
>
>  #define CONN_IS_CLOSING(conn)						\
> @@ -1137,6 +1144,9 @@ int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
>
>  	if ((int)sendq > SNDBUF_GET(conn)) /* Due to memory pressure? */
>  		limit = 0;
> +	else if ((int)tinfo->tcpi_snd_wnd > SNDBUF_GET(conn) &&
> +		 tinfo->tcpi_bytes_acked > SHORT_CONN_BYTES)

This is pretty subtle, I think it would be worth having some rationale
in a comment, not just the commit message.

> +		limit = SNDBUF_BOOST(SNDBUF_GET(conn)) - (int)sendq;
>  	else
>  		limit = SNDBUF_GET(conn) - (int)sendq;
>
> --
> 2.43.0
>

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson
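
For reference, the limit selection in the quoted hunk can be sketched as a
standalone function. This is only an illustration under assumptions: the
helper name sndbuf_limit() and its plain-integer parameters are hypothetical
stand-ins for SNDBUF_GET(conn), sendq, tinfo->tcpi_snd_wnd and
tinfo->tcpi_bytes_acked, and are not part of tcp.c. The three macros are
copied from the patch.

```c
#include <stdint.h>

/* Copied from the quoted patch */
#define SHORT_CONN_BYTES	(16ULL * 1024 * 1024)
#define SNDBUF_BOOST_FACTOR	150 /* % */
#define SNDBUF_BOOST(x)		((x) * SNDBUF_BOOST_FACTOR / 100)

/*
 * Hypothetical helper (not in tcp.c): mirrors the if / else if / else
 * chain from the hunk, with the connection state passed in as integers.
 *
 * sndbuf:	stand-in for SNDBUF_GET(conn)
 * sendq:	bytes currently queued for sending
 * snd_wnd:	stand-in for tinfo->tcpi_snd_wnd (window advertised by peer)
 * bytes_acked:	stand-in for tinfo->tcpi_bytes_acked
 */
static int sndbuf_limit(int sndbuf, uint32_t sendq, uint32_t snd_wnd,
			uint64_t bytes_acked)
{
	if ((int)sendq > sndbuf)	/* Queue already exceeds the buffer */
		return 0;

	/* Peer advertises more than the buffer AND connection is not short */
	if ((int)snd_wnd > sndbuf && bytes_acked > SHORT_CONN_BYTES)
		return SNDBUF_BOOST(sndbuf) - (int)sendq;

	return sndbuf - (int)sendq;
}
```

With a 100000-byte sending buffer and 20000 bytes queued, this sketch returns
130000 once more than 16 MiB have been acknowledged and the peer advertises
more than 100000 bytes, and 80000 otherwise.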