On Mon, Nov 04, 2024 at 12:16:58PM +1100, David Gibson wrote: > On Fri, Nov 01, 2024 at 08:51:32PM -0400, Jon Maloy wrote: > > In order to reduce static memory and code footprint, we merge > > the array for l2 flag frames into the one for payload frames. > > > > This change also ensures that no flag message will be sent out > > over the l2 media bypassing already queued payload messages. > > > > Performance measurements with iperf3, where we force all > > traffic via the tap queue, show no significant difference: Spotted a couple of extra things I missed in my previous mail. [snip] > > @@ -107,13 +96,6 @@ void tcp_sock_iov_init(const struct ctx *c) > > tcp_payload[i].th.ack = 1; > > } > > > > - for (i = 0; i < ARRAY_SIZE(tcp_flags); i++) { > > - tcp6_flags_ip[i] = ip6; > > - tcp4_flags_ip[i] = iph; > > - tcp_flags[i].th.doff = sizeof(struct tcphdr) / 4; > > - tcp_flags[i].th.ack = 1; > > - } > > - > > for (i = 0; i < TCP_FRAMES_MEM; i++) { > > struct iovec *iov = tcp_l2_iov[i]; > > Because you're now setting all the flags fields for every frame, you should remove the initialisation of th.ack from iov_init. [snip] > > @@ -274,6 +237,9 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn, > > iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_payload_ip[tcp_payload_used]); > > iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src; > > } > > + payload = iov[TCP_IOV_PAYLOAD].iov_base; > > + flags = &payload->th.window - 1; > > + *(flags) = PAYLOAD_FLAGS; > > I think this is likely to run afoul of TBAA rules, which could cause > miscompiles because cc assumes *flags is not aliased with payload->th. > Although it's more bulky, I think it's better to explicitly assign > each of the fields in struct tcphdr. The compiler should be able to > boil it down to the same instructions in any case. > > Note that now we're using the netinet version of struct tcphdr, it has > an anonymous union so you can use th_flags, rather than the one bit > fields (doesn't include doffset, though). Actually, looking at this again, you shouldn't need to update doff in any case - it's the same for flag frames and data frames. Of course, it's plausible there's no point pre-filling the doff (or any other) header fields, but I don't think changing that should be in scope for this patch. I hope to post this afternoon a bunch of different buffer cleanups, along with vhost rebased on those and other changes. Unfortunately those will conflict with this patch, sorry. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson