public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH 0/7] Small improvements to IOV handling
@ 2024-04-29  7:09 David Gibson
  2024-04-29  7:09 ` [PATCH 1/7] checksum: Use proto_ipv6_header_psum() for ICMPv6 as well David Gibson
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

Laurent's changes to split TCP buffers into various components with
IOVs is now merged.  This series has a batch of small cleanups to make
the handling of this slightly nicer.  These are preliminaries to doing
something similar with the UDP buffers.

Note that patch 7/7 might interfere with the experiments to work out
what is going wrong with the odd batching / performance issues we've
seen.  We can leave it off for the time being if that's a problem.

David Gibson (7):
  checksum: Use proto_ipv6_header_psum() for ICMPv6 as well
  tap: Split tap specific and L2 (ethernet) headers
  treewide: Standardise variable names for various packet lengths
  tcp: Simplify packet length calculation when preparing headers
  tap, tcp: (Re-)abstract TAP specific header handling
  iov: Helper macro to construct iovs covering existing variables or
    fields
  tcp: Update tap specific header too in tcp_fill_headers[46]()

 checksum.c |  46 ++++++++---------
 checksum.h |  10 ++--
 icmp.c     |   8 +--
 iov.h      |   3 ++
 passt.h    |   4 +-
 tap.c      |  51 ++++++++++---------
 tap.h      |  46 +++++++++++++----
 tcp.c      | 144 +++++++++++++++++++++++------------------------------
 udp.c      |  50 ++++++++++---------
 9 files changed, 184 insertions(+), 178 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/7] checksum: Use proto_ipv6_header_psum() for ICMPv6 as well
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-29  7:09 ` [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers David Gibson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

7df624e79 ("checksum: introduce functions to compute the header part
checksum for TCP/UDP") introduced a helper to compute the partial checksum
for the IPv6 pseudo-header used in L4 protocol checksums.  It used it in
csum_udp6() for UDP packets, but not in csum_icmp6() for the identical
calculation in csum_icmp6() for ICMPv6 packets.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 checksum.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/checksum.c b/checksum.c
index f8a7b539..9cbe0b29 100644
--- a/checksum.c
+++ b/checksum.c
@@ -253,10 +253,8 @@ void csum_icmp6(struct icmp6hdr *icmp6hr,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
 		const void *payload, size_t len)
 {
-	/* Partial checksum for the pseudo-IPv6 header */
-	uint32_t psum = sum_16b(saddr, sizeof(*saddr)) +
-		        sum_16b(daddr, sizeof(*daddr)) +
-		        htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6);
+	uint32_t psum = proto_ipv6_header_psum(len + sizeof(*icmp6hr),
+					       IPPROTO_ICMPV6, saddr, daddr);
 
 	icmp6hr->icmp6_cksum = 0;
 	/* Add in partial checksum for the ICMPv6 header alone */
-- 
@@ -253,10 +253,8 @@ void csum_icmp6(struct icmp6hdr *icmp6hr,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
 		const void *payload, size_t len)
 {
-	/* Partial checksum for the pseudo-IPv6 header */
-	uint32_t psum = sum_16b(saddr, sizeof(*saddr)) +
-		        sum_16b(daddr, sizeof(*daddr)) +
-		        htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6);
+	uint32_t psum = proto_ipv6_header_psum(len + sizeof(*icmp6hr),
+					       IPPROTO_ICMPV6, saddr, daddr);
 
 	icmp6hr->icmp6_cksum = 0;
 	/* Add in partial checksum for the ICMPv6 header alone */
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
  2024-04-29  7:09 ` [PATCH 1/7] checksum: Use proto_ipv6_header_psum() for ICMPv6 as well David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-30 18:46   ` Stefano Brivio
  2024-04-29  7:09 ` [PATCH 3/7] treewide: Standardise variable names for various packet lengths David Gibson
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

In some places (well, actually only UDP now) we use struct tap_hdr to
represent both tap backend specific and L2 ethernet headers.  Handling
these together seemed like a good idea at the time, but Laurent's changes
in the TCP code working towards vhost-user support suggest that treating
them separately is more useful, more often.

Alter struct tap_hdr to represent only the TAP backend specific headers.
Updated related helpers and the UDP code to match.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tap.h | 21 +++++++++------------
 udp.c | 23 ++++++++++++++---------
 2 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/tap.h b/tap.h
index 2adc4e2b..dbc23b31 100644
--- a/tap.h
+++ b/tap.h
@@ -6,30 +6,28 @@
 #ifndef TAP_H
 #define TAP_H
 
+#define ETH_HDR_INIT(proto) { .h_proto = htons_constant(proto) }
+
 /**
- * struct tap_hdr - L2 and tap specific headers
+ * struct tap_hdr - tap backend specific headers
  * @vnet_len:	Frame length (for qemu socket transport)
- * @eh:		Ethernet header
  */
 struct tap_hdr {
 	uint32_t vnet_len;
-	struct ethhdr eh;
 } __attribute__((packed));
 
-#define TAP_HDR_INIT(proto) { .eh.h_proto = htons_constant(proto) }
-
 static inline size_t tap_hdr_len_(const struct ctx *c)
 {
 	if (c->mode == MODE_PASST)
 		return sizeof(struct tap_hdr);
 	else
-		return sizeof(struct ethhdr);
+		return 0;
 }
 
 /**
  * tap_frame_base() - Find start of tap frame
  * @c:		Execution context
- * @taph:	Pointer to L2 and tap specific header buffer
+ * @taph:	Pointer to tap specific header buffer
  *
  * Returns: pointer to the start of tap frame - suitable for an
  *          iov_base to be passed to tap_send_frames())
@@ -43,17 +41,16 @@ static inline void *tap_frame_base(const struct ctx *c, struct tap_hdr *taph)
  * tap_frame_len() - Finalize tap frame and return total length
  * @c:		Execution context
  * @taph:	Tap header to finalize
- * @plen:	L3 packet length (excludes L2 and tap specific headers)
+ * @plen:	L2 packet length (includes L2, excludes tap specific headers)
  *
- * Returns: length of the tap frame including L2 and tap specific
- *          headers - suitable for an iov_len to be passed to
- *          tap_send_frames()
+ * Returns: length of the tap frame including tap specific headers - suitable
+ *          for an iov_len to be passed to tap_send_frames()
  */
 static inline size_t tap_frame_len(const struct ctx *c, struct tap_hdr *taph,
 				   size_t plen)
 {
 	if (c->mode == MODE_PASST)
-		taph->vnet_len = htonl(plen + sizeof(taph->eh));
+		taph->vnet_len = htonl(plen);
 	return plen + tap_hdr_len_(c);
 }
 
diff --git a/udp.c b/udp.c
index 594ea191..c3e4f6b6 100644
--- a/udp.c
+++ b/udp.c
@@ -173,7 +173,8 @@ static uint8_t udp_act[IP_VERSIONS][UDP_ACT_TYPE_MAX][DIV_ROUND_UP(NUM_PORTS, 8)
 /**
  * udp4_l2_buf_t - Pre-cooked IPv4 packet buffers for tap connections
  * @s_in:	Source socket address, filled in by recvmmsg()
- * @taph:	Tap-level headers (partially pre-filled)
+ * @taph:	Tap backend specific header
+ * @eh:		Prefilled ethernet header
  * @iph:	Pre-filled IP header (except for tot_len and saddr)
  * @uh:		Headroom for UDP header
  * @data:	Storage for UDP payload
@@ -182,6 +183,7 @@ static struct udp4_l2_buf_t {
 	struct sockaddr_in s_in;
 
 	struct tap_hdr taph;
+	struct ethhdr eh;
 	struct iphdr iph;
 	struct udphdr uh;
 	uint8_t data[USHRT_MAX -
@@ -192,7 +194,8 @@ udp4_l2_buf[UDP_MAX_FRAMES];
 /**
  * udp6_l2_buf_t - Pre-cooked IPv6 packet buffers for tap connections
  * @s_in6:	Source socket address, filled in by recvmmsg()
- * @taph:	Tap-level headers (partially pre-filled)
+ * @taph:	Tap backend specific header
+ * @eh:		Pre-filled ethernet header
  * @ip6h:	Pre-filled IP header (except for payload_len and addresses)
  * @uh:		Headroom for UDP header
  * @data:	Storage for UDP payload
@@ -202,10 +205,11 @@ struct udp6_l2_buf_t {
 #ifdef __AVX2__
 	/* Align ip6h to 32-byte boundary. */
 	uint8_t pad[64 - (sizeof(struct sockaddr_in6) + sizeof(struct ethhdr) +
-			  sizeof(uint32_t))];
+			  sizeof(struct tap_hdr))];
 #endif
 
 	struct tap_hdr taph;
+	struct ethhdr eh;
 	struct ipv6hdr ip6h;
 	struct udphdr uh;
 	uint8_t data[USHRT_MAX -
@@ -289,8 +293,8 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s)
 		struct udp4_l2_buf_t *b4 = &udp4_l2_buf[i];
 		struct udp6_l2_buf_t *b6 = &udp6_l2_buf[i];
 
-		eth_update_mac(&b4->taph.eh, eth_d, eth_s);
-		eth_update_mac(&b6->taph.eh, eth_d, eth_s);
+		eth_update_mac(&b4->eh, eth_d, eth_s);
+		eth_update_mac(&b6->eh, eth_d, eth_s);
 	}
 }
 
@@ -307,7 +311,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
 	struct iovec *tiov = &udp4_l2_iov_tap[i];
 
 	*buf = (struct udp4_l2_buf_t) {
-		.taph = TAP_HDR_INIT(ETH_P_IP),
+		.eh  = ETH_HDR_INIT(ETH_P_IP),
 		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
 	};
 
@@ -335,7 +339,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
 	struct iovec *tiov = &udp6_l2_iov_tap[i];
 
 	*buf = (struct udp6_l2_buf_t) {
-		.taph = TAP_HDR_INIT(ETH_P_IPV6),
+		.eh   = ETH_HDR_INIT(ETH_P_IPV6),
 		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
 	};
 
@@ -608,7 +612,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 	b->uh.dest = htons(dstport);
 	b->uh.len = htons(datalen + sizeof(b->uh));
 
-	return tap_frame_len(c, &b->taph, ip_len);
+	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
 }
 
 /**
@@ -676,7 +680,8 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 	b->uh.len = b->ip6h.payload_len;
 	csum_udp6(&b->uh, src, dst, b->data, datalen);
 
-	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h));
+	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
+			     + sizeof(b->eh));
 }
 
 /**
-- 
@@ -173,7 +173,8 @@ static uint8_t udp_act[IP_VERSIONS][UDP_ACT_TYPE_MAX][DIV_ROUND_UP(NUM_PORTS, 8)
 /**
  * udp4_l2_buf_t - Pre-cooked IPv4 packet buffers for tap connections
  * @s_in:	Source socket address, filled in by recvmmsg()
- * @taph:	Tap-level headers (partially pre-filled)
+ * @taph:	Tap backend specific header
+ * @eh:		Prefilled ethernet header
  * @iph:	Pre-filled IP header (except for tot_len and saddr)
  * @uh:		Headroom for UDP header
  * @data:	Storage for UDP payload
@@ -182,6 +183,7 @@ static struct udp4_l2_buf_t {
 	struct sockaddr_in s_in;
 
 	struct tap_hdr taph;
+	struct ethhdr eh;
 	struct iphdr iph;
 	struct udphdr uh;
 	uint8_t data[USHRT_MAX -
@@ -192,7 +194,8 @@ udp4_l2_buf[UDP_MAX_FRAMES];
 /**
  * udp6_l2_buf_t - Pre-cooked IPv6 packet buffers for tap connections
  * @s_in6:	Source socket address, filled in by recvmmsg()
- * @taph:	Tap-level headers (partially pre-filled)
+ * @taph:	Tap backend specific header
+ * @eh:		Pre-filled ethernet header
  * @ip6h:	Pre-filled IP header (except for payload_len and addresses)
  * @uh:		Headroom for UDP header
  * @data:	Storage for UDP payload
@@ -202,10 +205,11 @@ struct udp6_l2_buf_t {
 #ifdef __AVX2__
 	/* Align ip6h to 32-byte boundary. */
 	uint8_t pad[64 - (sizeof(struct sockaddr_in6) + sizeof(struct ethhdr) +
-			  sizeof(uint32_t))];
+			  sizeof(struct tap_hdr))];
 #endif
 
 	struct tap_hdr taph;
+	struct ethhdr eh;
 	struct ipv6hdr ip6h;
 	struct udphdr uh;
 	uint8_t data[USHRT_MAX -
@@ -289,8 +293,8 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s)
 		struct udp4_l2_buf_t *b4 = &udp4_l2_buf[i];
 		struct udp6_l2_buf_t *b6 = &udp6_l2_buf[i];
 
-		eth_update_mac(&b4->taph.eh, eth_d, eth_s);
-		eth_update_mac(&b6->taph.eh, eth_d, eth_s);
+		eth_update_mac(&b4->eh, eth_d, eth_s);
+		eth_update_mac(&b6->eh, eth_d, eth_s);
 	}
 }
 
@@ -307,7 +311,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
 	struct iovec *tiov = &udp4_l2_iov_tap[i];
 
 	*buf = (struct udp4_l2_buf_t) {
-		.taph = TAP_HDR_INIT(ETH_P_IP),
+		.eh  = ETH_HDR_INIT(ETH_P_IP),
 		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
 	};
 
@@ -335,7 +339,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
 	struct iovec *tiov = &udp6_l2_iov_tap[i];
 
 	*buf = (struct udp6_l2_buf_t) {
-		.taph = TAP_HDR_INIT(ETH_P_IPV6),
+		.eh   = ETH_HDR_INIT(ETH_P_IPV6),
 		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
 	};
 
@@ -608,7 +612,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 	b->uh.dest = htons(dstport);
 	b->uh.len = htons(datalen + sizeof(b->uh));
 
-	return tap_frame_len(c, &b->taph, ip_len);
+	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
 }
 
 /**
@@ -676,7 +680,8 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 	b->uh.len = b->ip6h.payload_len;
 	csum_udp6(&b->uh, src, dst, b->data, datalen);
 
-	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h));
+	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
+			     + sizeof(b->eh));
 }
 
 /**
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/7] treewide: Standardise variable names for various packet lengths
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
  2024-04-29  7:09 ` [PATCH 1/7] checksum: Use proto_ipv6_header_psum() for ICMPv6 as well David Gibson
  2024-04-29  7:09 ` [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-30 18:46   ` Stefano Brivio
  2024-04-29  7:09 ` [PATCH 4/7] tcp: Simplify packet length calculation when preparing headers David Gibson
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

At various points we need to track the lengths of a packet including or
excluding various different sets of headers.  We don't always use the same
variable names for doing so.  Worse in some places we use the same name
for different things: e.g. tcp_fill_headers[46]() use ip_len for the
length including the IP headers, but then tcp_send_flag() which calls it
uses it to mean the IP payload length only.

To improve clarity, standardise on these names:
   plen:		L4 protocol payload length
   l4len:		plen + length of L4 protocol header
   l3len:		l4len + length of IPv4/IPv6 header
   l2len:		l3len + length of L2 (ethernet) header

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 checksum.c | 42 +++++++++++++++++-----------------
 checksum.h | 10 ++++-----
 icmp.c     |  8 +++----
 passt.h    |  4 ++--
 tap.c      | 48 +++++++++++++++++++--------------------
 tcp.c      | 66 +++++++++++++++++++++++++++---------------------------
 udp.c      | 24 ++++++++++----------
 7 files changed, 101 insertions(+), 101 deletions(-)

diff --git a/checksum.c b/checksum.c
index 9cbe0b29..3602215a 100644
--- a/checksum.c
+++ b/checksum.c
@@ -116,7 +116,7 @@ uint16_t csum_fold(uint32_t sum)
 
 /**
  * csum_ip4_header() - Calculate IPv4 header checksum
- * @tot_len:	IPv4 payload length (data + IP header, network order)
+ * @tot_len:	IPv4 packet length (data + IP header, network order)
  * @protocol:	Protocol number (network order)
  * @saddr:	IPv4 source address (network order)
  * @daddr:	IPv4 destination address (network order)
@@ -140,13 +140,13 @@ uint16_t csum_ip4_header(uint16_t tot_len, uint8_t protocol,
 /**
  * proto_ipv4_header_psum() - Calculates the partial checksum of an
  * 			      IPv4 header for UDP or TCP
- * @tot_len:	IPv4 Payload length (host order)
+ * @l4len:	IPv4 Payload length (host order)
  * @proto:	Protocol number (host order)
  * @saddr:	Source address  (network order)
  * @daddr:	Destination address (network order)
  * Returns:	Partial checksum of the IPv4 header
  */
-uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
+uint32_t proto_ipv4_header_psum(uint16_t l4len, uint8_t protocol,
 				struct in_addr saddr, struct in_addr daddr)
 {
 	uint32_t psum = htons(protocol);
@@ -155,7 +155,7 @@ uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
 	psum += saddr.s_addr & 0xffff;
 	psum += (daddr.s_addr >> 16) & 0xffff;
 	psum += daddr.s_addr & 0xffff;
-	psum += htons(tot_len);
+	psum += htons(l4len);
 
 	return psum;
 }
@@ -165,22 +165,22 @@ uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
  * @udp4hr:	UDP header, initialised apart from checksum
  * @saddr:	IPv4 source address
  * @daddr:	IPv4 destination address
- * @payload:	ICMPv4 packet payload
- * @len:	Length of @payload (not including UDP)
+ * @payload:	UDP packet payload
+ * @plen:	Length of @payload (not including UDP header)
  */
 void csum_udp4(struct udphdr *udp4hr,
 	       struct in_addr saddr, struct in_addr daddr,
-	       const void *payload, size_t len)
+	       const void *payload, size_t plen)
 {
 	/* UDP checksums are optional, so don't bother */
 	udp4hr->check = 0;
 
 	if (UDP4_REAL_CHECKSUMS) {
-		uint16_t tot_len = len + sizeof(struct udphdr);
-		uint32_t psum = proto_ipv4_header_psum(tot_len, IPPROTO_UDP,
+		uint16_t l4len = plen + sizeof(struct udphdr);
+		uint32_t psum = proto_ipv4_header_psum(l4len, IPPROTO_UDP,
 						       saddr, daddr);
 		psum = csum_unfolded(udp4hr, sizeof(struct udphdr), psum);
-		udp4hr->check = csum(payload, len, psum);
+		udp4hr->check = csum(payload, plen, psum);
 	}
 }
 
@@ -188,9 +188,9 @@ void csum_udp4(struct udphdr *udp4hr,
  * csum_icmp4() - Calculate and set checksum for an ICMP packet
  * @icmp4hr:	ICMP header, initialised apart from checksum
  * @payload:	ICMP packet payload
- * @len:	Length of @payload (not including ICMP header)
+ * @plen:	Length of @payload (not including ICMP header)
  */
-void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
+void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t plen)
 {
 	uint32_t psum;
 
@@ -199,7 +199,7 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
 	/* Partial checksum for ICMP header alone */
 	psum = sum_16b(icmp4hr, sizeof(*icmp4hr));
 
-	icmp4hr->checksum = csum(payload, len, psum);
+	icmp4hr->checksum = csum(payload, plen, psum);
 }
 
 /**
@@ -227,18 +227,18 @@ uint32_t proto_ipv6_header_psum(uint16_t payload_len, uint8_t protocol,
  * csum_udp6() - Calculate and set checksum for a UDP over IPv6 packet
  * @udp6hr:	UDP header, initialised apart from checksum
  * @payload:	UDP packet payload
- * @len:	Length of @payload (not including UDP header)
+ * @plen:	Length of @payload (not including UDP header)
  */
 void csum_udp6(struct udphdr *udp6hr,
 	       const struct in6_addr *saddr, const struct in6_addr *daddr,
-	       const void *payload, size_t len)
+	       const void *payload, size_t plen)
 {
-	uint32_t psum = proto_ipv6_header_psum(len + sizeof(struct udphdr),
+	uint32_t psum = proto_ipv6_header_psum(plen + sizeof(struct udphdr),
 					       IPPROTO_UDP, saddr, daddr);
 	udp6hr->check = 0;
 
 	psum = csum_unfolded(udp6hr, sizeof(struct udphdr), psum);
-	udp6hr->check = csum(payload, len, psum);
+	udp6hr->check = csum(payload, plen, psum);
 }
 
 /**
@@ -247,19 +247,19 @@ void csum_udp6(struct udphdr *udp6hr,
  * @saddr:	IPv6 source address
  * @daddr:	IPv6 destination address
  * @payload:	ICMP packet payload
- * @len:	Length of @payload (not including ICMPv6 header)
+ * @plen:	Length of @payload (not including ICMPv6 header)
  */
 void csum_icmp6(struct icmp6hdr *icmp6hr,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
-		const void *payload, size_t len)
+		const void *payload, size_t plen)
 {
-	uint32_t psum = proto_ipv6_header_psum(len + sizeof(*icmp6hr),
+	uint32_t psum = proto_ipv6_header_psum(plen + sizeof(*icmp6hr),
 					       IPPROTO_ICMPV6, saddr, daddr);
 
 	icmp6hr->icmp6_cksum = 0;
 	/* Add in partial checksum for the ICMPv6 header alone */
 	psum += sum_16b(icmp6hr, sizeof(*icmp6hr));
-	icmp6hr->icmp6_cksum = csum(payload, len, psum);
+	icmp6hr->icmp6_cksum = csum(payload, plen, psum);
 }
 
 #ifdef __AVX2__
diff --git a/checksum.h b/checksum.h
index 0f396767..726ed7b5 100644
--- a/checksum.h
+++ b/checksum.h
@@ -15,21 +15,21 @@ uint16_t csum_fold(uint32_t sum);
 uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
 uint16_t csum_ip4_header(uint16_t tot_len, uint8_t protocol,
 			 struct in_addr saddr, struct in_addr daddr);
-uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
+uint32_t proto_ipv4_header_psum(uint16_t l4len, uint8_t protocol,
 				struct in_addr saddr, struct in_addr daddr);
 void csum_udp4(struct udphdr *udp4hr,
 	       struct in_addr saddr, struct in_addr daddr,
-	       const void *payload, size_t len);
-void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len);
+	       const void *payload, size_t plen);
+void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t plen);
 uint32_t proto_ipv6_header_psum(uint16_t payload_len, uint8_t protocol,
 				const struct in6_addr *saddr,
 				const struct in6_addr *daddr);
 void csum_udp6(struct udphdr *udp6hr,
 	       const struct in6_addr *saddr, const struct in6_addr *daddr,
-	       const void *payload, size_t len);
+	       const void *payload, size_t plen);
 void csum_icmp6(struct icmp6hdr *icmp6hr,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
-		const void *payload, size_t len);
+		const void *payload, size_t plen);
 uint32_t csum_unfolded(const void *buf, size_t len, uint32_t init);
 uint16_t csum(const void *buf, size_t len, uint32_t init);
 uint16_t csum_iov(const struct iovec *iov, size_t n, uint32_t init);
diff --git a/icmp.c b/icmp.c
index 76bb9e9f..263cd9ca 100644
--- a/icmp.c
+++ b/icmp.c
@@ -224,8 +224,8 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
 	union sockaddr_inany sa = { .sa_family = af };
 	const socklen_t sl = af == AF_INET ? sizeof(sa.sa4) : sizeof(sa.sa6);
 	struct icmp_ping_flow *pingf, **id_sock;
+	size_t plen, l4len;
 	uint16_t id, seq;
-	size_t plen;
 	void *pkt;
 
 	(void)saddr;
@@ -238,7 +238,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
 			return 1;
 
 		ih =  (struct icmphdr *)pkt;
-		plen += sizeof(*ih);
+		l4len = plen + sizeof(*ih);
 
 		if (ih->type != ICMP_ECHO)
 			return 1;
@@ -254,7 +254,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
 			return 1;
 
 		ih = (struct icmp6hdr *)pkt;
-		plen += sizeof(*ih);
+		l4len = plen + sizeof(*ih);
 
 		if (ih->icmp6_type != ICMPV6_ECHO_REQUEST)
 			return 1;
@@ -274,7 +274,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
 
 	pingf->ts = now->tv_sec;
 
-	if (sendto(pingf->sock, pkt, plen, MSG_NOSIGNAL, &sa.sa, sl) < 0) {
+	if (sendto(pingf->sock, pkt, l4len, MSG_NOSIGNAL, &sa.sa, sl) < 0) {
 		flow_dbg(pingf, "failed to relay request to socket: %s",
 			 strerror(errno));
 	} else {
diff --git a/passt.h b/passt.h
index 76026f09..d1add470 100644
--- a/passt.h
+++ b/passt.h
@@ -22,11 +22,11 @@ struct tap_msg {
 /**
  * struct tap_l4_msg - Layer-4 message descriptor for protocol handlers
  * @pkt_buf_offset:	Offset of message from @pkt_buf
- * @l4_len:		Length of Layer-4 payload, host order
+ * @l4len:		Length of Layer-4 payload, host order
  */
 struct tap_l4_msg {
 	uint32_t pkt_buf_offset;
-	uint16_t l4_len;
+	uint16_t l4len;
 };
 
 union epoll_ref;
diff --git a/tap.c b/tap.c
index 13e4da79..6a08cca6 100644
--- a/tap.c
+++ b/tap.c
@@ -177,15 +177,15 @@ void tap_udp4_send(const struct ctx *c, struct in_addr src, in_port_t sport,
 		   struct in_addr dst, in_port_t dport,
 		   const void *in, size_t len)
 {
-	size_t udplen = len + sizeof(struct udphdr);
+	size_t l4len = len + sizeof(struct udphdr);
 	char buf[USHRT_MAX];
 	struct iphdr *ip4h = tap_push_l2h(c, buf, ETH_P_IP);
-	struct udphdr *uh = tap_push_ip4h(ip4h, src, dst, udplen, IPPROTO_UDP);
+	struct udphdr *uh = tap_push_ip4h(ip4h, src, dst, l4len, IPPROTO_UDP);
 	char *data = (char *)(uh + 1);
 
 	uh->source = htons(sport);
 	uh->dest = htons(dport);
-	uh->len = htons(udplen);
+	uh->len = htons(l4len);
 	csum_udp4(uh, src, dst, in, len);
 	memcpy(data, in, len);
 
@@ -259,16 +259,16 @@ void tap_udp6_send(const struct ctx *c,
 		   const struct in6_addr *dst, in_port_t dport,
 		   uint32_t flow, const void *in, size_t len)
 {
-	size_t udplen = len + sizeof(struct udphdr);
+	size_t l4len = len + sizeof(struct udphdr);
 	char buf[USHRT_MAX];
 	struct ipv6hdr *ip6h = tap_push_l2h(c, buf, ETH_P_IPV6);
 	struct udphdr *uh = tap_push_ip6h(ip6h, src, dst,
-					  udplen, IPPROTO_UDP, flow);
+					  l4len, IPPROTO_UDP, flow);
 	char *data = (char *)(uh + 1);
 
 	uh->source = htons(sport);
 	uh->dest = htons(dport);
-	uh->len = htons(udplen);
+	uh->len = htons(l4len);
 	csum_udp6(uh, src, dst, in, len);
 	memcpy(data, in, len);
 
@@ -589,21 +589,21 @@ static int tap4_handler(struct ctx *c, const struct pool *in,
 	i = 0;
 resume:
 	for (seq_count = 0, seq = NULL; i < in->count; i++) {
-		size_t l2_len, l3_len, hlen, l4_len;
+		size_t l2len, l3len, hlen, l4len;
 		const struct ethhdr *eh;
 		const struct udphdr *uh;
 		struct iphdr *iph;
 		const char *l4h;
 
-		packet_get(in, i, 0, 0, &l2_len);
+		packet_get(in, i, 0, 0, &l2len);
 
-		eh = packet_get(in, i, 0, sizeof(*eh), &l3_len);
+		eh = packet_get(in, i, 0, sizeof(*eh), &l3len);
 		if (!eh)
 			continue;
 		if (ntohs(eh->h_proto) == ETH_P_ARP) {
 			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
 
-			packet_add(pkt, l2_len, (char *)eh);
+			packet_add(pkt, l2len, (char *)eh);
 			arp(c, pkt);
 			continue;
 		}
@@ -613,15 +613,15 @@ resume:
 			continue;
 
 		hlen = iph->ihl * 4UL;
-		if (hlen < sizeof(*iph) || htons(iph->tot_len) > l3_len ||
-		    hlen > l3_len)
+		if (hlen < sizeof(*iph) || htons(iph->tot_len) > l3len ||
+		    hlen > l3len)
 			continue;
 
 		/* We don't handle IP fragments, drop them */
 		if (tap4_is_fragment(iph, now))
 			continue;
 
-		l4_len = htons(iph->tot_len) - hlen;
+		l4len = htons(iph->tot_len) - hlen;
 
 		if (IN4_IS_ADDR_LOOPBACK(&iph->saddr) ||
 		    IN4_IS_ADDR_LOOPBACK(&iph->daddr)) {
@@ -636,7 +636,7 @@ resume:
 		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr)
 			c->ip4.addr_seen.s_addr = iph->saddr;
 
-		l4h = packet_get(in, i, sizeof(*eh) + hlen, l4_len, NULL);
+		l4h = packet_get(in, i, sizeof(*eh) + hlen, l4len, NULL);
 		if (!l4h)
 			continue;
 
@@ -648,7 +648,7 @@ resume:
 
 			tap_packet_debug(iph, NULL, NULL, 0, NULL, 1);
 
-			packet_add(pkt, l4_len, l4h);
+			packet_add(pkt, l4len, l4h);
 			icmp_tap_handler(c, PIF_TAP, AF_INET,
 					 &iph->saddr, &iph->daddr,
 					 pkt, now);
@@ -662,7 +662,7 @@ resume:
 		if (iph->protocol == IPPROTO_UDP) {
 			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
 
-			packet_add(pkt, l2_len, (char *)eh);
+			packet_add(pkt, l2len, (char *)eh);
 			if (dhcp(c, pkt))
 				continue;
 		}
@@ -711,7 +711,7 @@ resume:
 #undef L4_SET
 
 append:
-		packet_add((struct pool *)&seq->p, l4_len, l4h);
+		packet_add((struct pool *)&seq->p, l4len, l4h);
 	}
 
 	for (j = 0, seq = tap4_l4; j < seq_count; j++, seq++) {
@@ -763,7 +763,7 @@ static int tap6_handler(struct ctx *c, const struct pool *in,
 	i = 0;
 resume:
 	for (seq_count = 0, seq = NULL; i < in->count; i++) {
-		size_t l4_len, plen, check;
+		size_t l4len, plen, check;
 		struct in6_addr *saddr, *daddr;
 		const struct ethhdr *eh;
 		const struct udphdr *uh;
@@ -786,7 +786,7 @@ resume:
 		if (plen != check)
 			continue;
 
-		if (!(l4h = ipv6_l4hdr(in, i, sizeof(*eh), &proto, &l4_len)))
+		if (!(l4h = ipv6_l4hdr(in, i, sizeof(*eh), &proto, &l4len)))
 			continue;
 
 		if (IN6_IS_ADDR_LOOPBACK(saddr) || IN6_IS_ADDR_LOOPBACK(daddr)) {
@@ -814,7 +814,7 @@ resume:
 			if (c->no_icmp)
 				continue;
 
-			if (l4_len < sizeof(struct icmp6hdr))
+			if (l4len < sizeof(struct icmp6hdr))
 				continue;
 
 			if (ndp(c, (struct icmp6hdr *)l4h, saddr))
@@ -822,20 +822,20 @@ resume:
 
 			tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1);
 
-			packet_add(pkt, l4_len, l4h);
+			packet_add(pkt, l4len, l4h);
 			icmp_tap_handler(c, PIF_TAP, AF_INET6,
 					 saddr, daddr, pkt, now);
 			continue;
 		}
 
-		if (l4_len < sizeof(*uh))
+		if (l4len < sizeof(*uh))
 			continue;
 		uh = (struct udphdr *)l4h;
 
 		if (proto == IPPROTO_UDP) {
 			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
 
-			packet_add(pkt, l4_len, l4h);
+			packet_add(pkt, l4len, l4h);
 
 			if (dhcpv6(c, pkt, saddr, daddr))
 				continue;
@@ -886,7 +886,7 @@ resume:
 #undef L4_SET
 
 append:
-		packet_add((struct pool *)&seq->p, l4_len, l4h);
+		packet_add((struct pool *)&seq->p, l4len, l4h);
 	}
 
 	for (j = 0, seq = tap6_l4; j < seq_count; j++, seq++) {
diff --git a/tcp.c b/tcp.c
index 24f99cdf..23d408f2 100644
--- a/tcp.c
+++ b/tcp.c
@@ -891,13 +891,13 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
  */
 static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th)
 {
-	uint16_t tlen = ntohs(iph->tot_len) - sizeof(struct iphdr);
+	uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
 	struct in_addr saddr = { .s_addr = iph->saddr };
 	struct in_addr daddr = { .s_addr = iph->daddr };
-	uint32_t sum = proto_ipv4_header_psum(tlen, IPPROTO_TCP, saddr, daddr);
+	uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
 
 	th->check = 0;
-	th->check = csum(th, tlen, sum);
+	th->check = csum(th, l4len, sum);
 }
 
 /**
@@ -907,12 +907,12 @@ static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th)
  */
 static void tcp_update_check_tcp6(struct ipv6hdr *ip6h, struct tcphdr *th)
 {
-	uint16_t payload_len = ntohs(ip6h->payload_len);
-	uint32_t sum = proto_ipv6_header_psum(payload_len, IPPROTO_TCP,
+	uint16_t l4len = ntohs(ip6h->payload_len);
+	uint32_t sum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
 					      &ip6h->saddr, &ip6h->daddr);
 
 	th->check = 0;
-	th->check = csum(th, payload_len, sum);
+	th->check = csum(th, l4len, sum);
 }
 
 /**
@@ -1349,12 +1349,13 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 				size_t plen, const uint16_t *check,
 				uint32_t seq)
 {
-	size_t ip_len = plen + sizeof(struct iphdr) + sizeof(struct tcphdr);
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
+	size_t l4len = plen + sizeof(*th);
+	size_t l3len = l4len + sizeof(*iph);
 
 	ASSERT(a4);
 
-	iph->tot_len = htons(ip_len);
+	iph->tot_len = htons(l3len);
 	iph->saddr = a4->s_addr;
 	iph->daddr = c->ip4.addr_seen.s_addr;
 
@@ -1366,7 +1367,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 
 	tcp_update_check_tcp4(iph, th);
 
-	return ip_len;
+	return l3len;
 }
 
 /**
@@ -1386,9 +1387,10 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 				struct ipv6hdr *ip6h, struct tcphdr *th,
 				size_t plen, uint32_t seq)
 {
-	size_t ip_len = plen + sizeof(struct ipv6hdr) + sizeof(struct tcphdr);
+	size_t l4len = plen + sizeof(*th);
+	size_t l3len = l4len + sizeof(*ip6h);
 
-	ip6h->payload_len = htons(plen + sizeof(struct tcphdr));
+	ip6h->payload_len = htons(l4len);
 	ip6h->saddr = conn->faddr.a6;
 	if (IN6_IS_ADDR_LINKLOCAL(&ip6h->saddr))
 		ip6h->daddr = c->ip6.addr_ll_seen;
@@ -1407,7 +1409,7 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 
 	tcp_update_check_tcp6(ip6h, th);
 
-	return ip_len;
+	return l3len;
 }
 
 /**
@@ -1427,21 +1429,21 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
 				      const uint16_t *check, uint32_t seq)
 {
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
-	size_t ip_len, tlen;
+	size_t l3len, l4len;
 
 	if (a4) {
-		ip_len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
+		l3len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
 					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
 					   check, seq);
-		tlen = ip_len - sizeof(struct iphdr);
+		l4len = l3len - sizeof(struct iphdr);
 	} else {
-		ip_len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
+		l3len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
 					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
 					   seq);
-		tlen = ip_len - sizeof(struct ipv6hdr);
+		l4len = l3len - sizeof(struct ipv6hdr);
 	}
 
-	return tlen;
+	return l4len;
 }
 
 /**
@@ -1578,7 +1580,7 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	size_t optlen = 0;
 	struct tcphdr *th;
 	struct iovec *iov;
-	size_t ip_len;
+	size_t l4len;
 	char *data;
 
 	if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap) &&
@@ -1658,11 +1660,11 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	th->syn = !!(flags & SYN);
 	th->fin = !!(flags & FIN);
 
-	ip_len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
-					 conn->seq_to_tap);
-	iov[TCP_IOV_PAYLOAD].iov_len = ip_len;
+	l4len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
+					conn->seq_to_tap);
+	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
 
-	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + ip_len);
+	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + l4len);
 
 	if (th->ack) {
 		if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap))
@@ -2145,8 +2147,8 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 {
 	uint32_t *seq_update = &conn->seq_to_tap;
 	struct iovec *iov;
-	size_t ip_len;
 	uint32_t vnet_len;
+	size_t l4len;
 
 	if (CONN_V4(conn)) {
 		struct iovec *iov_prev = tcp4_l2_iov[tcp4_payload_used - 1];
@@ -2161,11 +2163,9 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		tcp4_seq_update[tcp4_payload_used].len = plen;
 
 		iov = tcp4_l2_iov[tcp4_payload_used++];
-		ip_len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check,
-						 seq);
-		iov[TCP_IOV_PAYLOAD].iov_len = ip_len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr) +
-                           ip_len;
+		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check, seq);
+		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
+		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr) + l4len;
 		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
 		if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
@@ -2174,10 +2174,10 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		tcp6_seq_update[tcp6_payload_used].len = plen;
 
 		iov = tcp6_l2_iov[tcp6_payload_used++];
-		ip_len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
-		iov[TCP_IOV_PAYLOAD].iov_len = ip_len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr) +
-			   ip_len;
+		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
+		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
+		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr)
+			+ l4len;
 		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
 		if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
diff --git a/udp.c b/udp.c
index c3e4f6b6..1a4c02f4 100644
--- a/udp.c
+++ b/udp.c
@@ -570,16 +570,16 @@ static void udp_splice_sendfrom(const struct ctx *c, unsigned start, unsigned n,
  * @c:		Execution context
  * @b:		Pointer to udp4_l2_buf to update
  * @dstport:	Destination port number
- * @datalen:	Length of UDP payload
+ * @plen:	Length of UDP payload
  * @now:	Current timestamp
  *
  * Return: size of tap frame with headers
  */
 static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
-			      in_port_t dstport, size_t datalen,
+			      in_port_t dstport, size_t plen,
 			      const struct timespec *now)
 {
-	size_t ip_len = datalen + sizeof(b->iph) + sizeof(b->uh);
+	size_t l3len = plen + sizeof(b->iph) + sizeof(b->uh);
 	in_port_t srcport = ntohs(b->s_in.sin_port);
 	struct in_addr src = b->s_in.sin_addr;
 
@@ -602,7 +602,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 		src = c->ip4.gw;
 	}
 
-	b->iph.tot_len = htons(ip_len);
+	b->iph.tot_len = htons(l3len);
 	b->iph.daddr = c->ip4.addr_seen.s_addr;
 	b->iph.saddr = src.s_addr;
 	b->iph.check = csum_ip4_header(b->iph.tot_len, IPPROTO_UDP,
@@ -610,9 +610,9 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 
 	b->uh.source = b->s_in.sin_port;
 	b->uh.dest = htons(dstport);
-	b->uh.len = htons(datalen + sizeof(b->uh));
+	b->uh.len = htons(plen + sizeof(b->uh));
 
-	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
+	return tap_frame_len(c, &b->taph, l3len + sizeof(b->eh));
 }
 
 /**
@@ -620,19 +620,19 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
  * @c:		Execution context
  * @b:		Pointer to udp6_l2_buf to update
  * @dstport:	Destination port number
- * @datalen:	Length of UDP payload
+ * @plen:	Length of UDP payload
  * @now:	Current timestamp
  *
  * Return: size of tap frame with headers
  */
 static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
-			      in_port_t dstport, size_t datalen,
+			      in_port_t dstport, size_t plen,
 			      const struct timespec *now)
 {
 	const struct in6_addr *src = &b->s_in6.sin6_addr;
 	const struct in6_addr *dst = &c->ip6.addr_seen;
-	uint16_t payload_len = datalen + sizeof(b->uh);
 	in_port_t srcport = ntohs(b->s_in6.sin6_port);
+	uint16_t l4len = plen + sizeof(b->uh);
 
 	if (IN6_IS_ADDR_LINKLOCAL(src)) {
 		dst = &c->ip6.addr_ll_seen;
@@ -668,7 +668,7 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 
 	}
 
-	b->ip6h.payload_len = htons(payload_len);
+	b->ip6h.payload_len = htons(l4len);
 	b->ip6h.daddr = *dst;
 	b->ip6h.saddr = *src;
 	b->ip6h.version = 6;
@@ -678,9 +678,9 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 	b->uh.source = b->s_in6.sin6_port;
 	b->uh.dest = htons(dstport);
 	b->uh.len = b->ip6h.payload_len;
-	csum_udp6(&b->uh, src, dst, b->data, datalen);
+	csum_udp6(&b->uh, src, dst, b->data, plen);
 
-	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
+	return tap_frame_len(c, &b->taph, l4len + sizeof(b->ip6h)
 			     + sizeof(b->eh));
 }
 
-- 
@@ -570,16 +570,16 @@ static void udp_splice_sendfrom(const struct ctx *c, unsigned start, unsigned n,
  * @c:		Execution context
  * @b:		Pointer to udp4_l2_buf to update
  * @dstport:	Destination port number
- * @datalen:	Length of UDP payload
+ * @plen:	Length of UDP payload
  * @now:	Current timestamp
  *
  * Return: size of tap frame with headers
  */
 static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
-			      in_port_t dstport, size_t datalen,
+			      in_port_t dstport, size_t plen,
 			      const struct timespec *now)
 {
-	size_t ip_len = datalen + sizeof(b->iph) + sizeof(b->uh);
+	size_t l3len = plen + sizeof(b->iph) + sizeof(b->uh);
 	in_port_t srcport = ntohs(b->s_in.sin_port);
 	struct in_addr src = b->s_in.sin_addr;
 
@@ -602,7 +602,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 		src = c->ip4.gw;
 	}
 
-	b->iph.tot_len = htons(ip_len);
+	b->iph.tot_len = htons(l3len);
 	b->iph.daddr = c->ip4.addr_seen.s_addr;
 	b->iph.saddr = src.s_addr;
 	b->iph.check = csum_ip4_header(b->iph.tot_len, IPPROTO_UDP,
@@ -610,9 +610,9 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
 
 	b->uh.source = b->s_in.sin_port;
 	b->uh.dest = htons(dstport);
-	b->uh.len = htons(datalen + sizeof(b->uh));
+	b->uh.len = htons(plen + sizeof(b->uh));
 
-	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
+	return tap_frame_len(c, &b->taph, l3len + sizeof(b->eh));
 }
 
 /**
@@ -620,19 +620,19 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
  * @c:		Execution context
  * @b:		Pointer to udp6_l2_buf to update
  * @dstport:	Destination port number
- * @datalen:	Length of UDP payload
+ * @plen:	Length of UDP payload
  * @now:	Current timestamp
  *
  * Return: size of tap frame with headers
  */
 static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
-			      in_port_t dstport, size_t datalen,
+			      in_port_t dstport, size_t plen,
 			      const struct timespec *now)
 {
 	const struct in6_addr *src = &b->s_in6.sin6_addr;
 	const struct in6_addr *dst = &c->ip6.addr_seen;
-	uint16_t payload_len = datalen + sizeof(b->uh);
 	in_port_t srcport = ntohs(b->s_in6.sin6_port);
+	uint16_t l4len = plen + sizeof(b->uh);
 
 	if (IN6_IS_ADDR_LINKLOCAL(src)) {
 		dst = &c->ip6.addr_ll_seen;
@@ -668,7 +668,7 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 
 	}
 
-	b->ip6h.payload_len = htons(payload_len);
+	b->ip6h.payload_len = htons(l4len);
 	b->ip6h.daddr = *dst;
 	b->ip6h.saddr = *src;
 	b->ip6h.version = 6;
@@ -678,9 +678,9 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
 	b->uh.source = b->s_in6.sin6_port;
 	b->uh.dest = htons(dstport);
 	b->uh.len = b->ip6h.payload_len;
-	csum_udp6(&b->uh, src, dst, b->data, datalen);
+	csum_udp6(&b->uh, src, dst, b->data, plen);
 
-	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
+	return tap_frame_len(c, &b->taph, l4len + sizeof(b->ip6h)
 			     + sizeof(b->eh));
 }
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/7] tcp: Simplify packet length calculation when preparing headers
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
                   ` (2 preceding siblings ...)
  2024-04-29  7:09 ` [PATCH 3/7] treewide: Standardise variable names for various packet lengths David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-29  7:09 ` [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling David Gibson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

tcp_fill_headers[46]() compute the L3 packet length from the L4 packet
length, then their caller tcp_l2_buf_fill_headers() converts it back to the
L4 packet length.  We can just use the L4 length throughout.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>eewwee
---
 tcp.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/tcp.c b/tcp.c
index 23d408f2..08899367 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1341,7 +1341,7 @@ static void tcp_fill_header(struct tcphdr *th,
  * @check:	Checksum, if already known
  * @seq:	Sequence number for this segment
  *
- * Return: The total length of the IPv4 packet, host order
+ * Return: The IPv4 payload length, host order
  */
 static size_t tcp_fill_headers4(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
@@ -1367,7 +1367,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 
 	tcp_update_check_tcp4(iph, th);
 
-	return l3len;
+	return l4len;
 }
 
 /**
@@ -1380,7 +1380,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  * @check:	Checksum, if already known
  * @seq:	Sequence number for this segment
  *
- * Return: The total length of the IPv6 packet, host order
+ * Return: The IPv6 payload length, host order
  */
 static size_t tcp_fill_headers6(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
@@ -1388,7 +1388,6 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 				size_t plen, uint32_t seq)
 {
 	size_t l4len = plen + sizeof(*th);
-	size_t l3len = l4len + sizeof(*ip6h);
 
 	ip6h->payload_len = htons(l4len);
 	ip6h->saddr = conn->faddr.a6;
@@ -1409,7 +1408,7 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 
 	tcp_update_check_tcp6(ip6h, th);
 
-	return l3len;
+	return l4len;
 }
 
 /**
@@ -1429,21 +1428,16 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
 				      const uint16_t *check, uint32_t seq)
 {
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
-	size_t l3len, l4len;
 
 	if (a4) {
-		l3len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
-					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
-					   check, seq);
-		l4len = l3len - sizeof(struct iphdr);
-	} else {
-		l3len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
-					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
-					   seq);
-		l4len = l3len - sizeof(struct ipv6hdr);
+		return tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
+					 iov[TCP_IOV_PAYLOAD].iov_base, plen,
+					 check, seq);
 	}
 
-	return l4len;
+	return tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
+				 iov[TCP_IOV_PAYLOAD].iov_base, plen,
+				 seq);
 }
 
 /**
-- 
@@ -1341,7 +1341,7 @@ static void tcp_fill_header(struct tcphdr *th,
  * @check:	Checksum, if already known
  * @seq:	Sequence number for this segment
  *
- * Return: The total length of the IPv4 packet, host order
+ * Return: The IPv4 payload length, host order
  */
 static size_t tcp_fill_headers4(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
@@ -1367,7 +1367,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 
 	tcp_update_check_tcp4(iph, th);
 
-	return l3len;
+	return l4len;
 }
 
 /**
@@ -1380,7 +1380,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  * @check:	Checksum, if already known
  * @seq:	Sequence number for this segment
  *
- * Return: The total length of the IPv6 packet, host order
+ * Return: The IPv6 payload length, host order
  */
 static size_t tcp_fill_headers6(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
@@ -1388,7 +1388,6 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 				size_t plen, uint32_t seq)
 {
 	size_t l4len = plen + sizeof(*th);
-	size_t l3len = l4len + sizeof(*ip6h);
 
 	ip6h->payload_len = htons(l4len);
 	ip6h->saddr = conn->faddr.a6;
@@ -1409,7 +1408,7 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 
 	tcp_update_check_tcp6(ip6h, th);
 
-	return l3len;
+	return l4len;
 }
 
 /**
@@ -1429,21 +1428,16 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
 				      const uint16_t *check, uint32_t seq)
 {
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
-	size_t l3len, l4len;
 
 	if (a4) {
-		l3len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
-					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
-					   check, seq);
-		l4len = l3len - sizeof(struct iphdr);
-	} else {
-		l3len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
-					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
-					   seq);
-		l4len = l3len - sizeof(struct ipv6hdr);
+		return tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
+					 iov[TCP_IOV_PAYLOAD].iov_base, plen,
+					 check, seq);
 	}
 
-	return l4len;
+	return tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
+				 iov[TCP_IOV_PAYLOAD].iov_base, plen,
+				 seq);
 }
 
 /**
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
                   ` (3 preceding siblings ...)
  2024-04-29  7:09 ` [PATCH 4/7] tcp: Simplify packet length calculation when preparing headers David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-30 18:47   ` Stefano Brivio
  2024-04-29  7:09 ` [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields David Gibson
  2024-04-29  7:09 ` [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]() David Gibson
  6 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

Recent changes to the TCP code (reworking of the buffer handling) have
meant that it now (again) deals explicitly with the MODE_PASST specific
vnet_len field, instead of using the (partial) abstractions provided by the
tap layer.

The abstractions we had don't work for the new TCP structure, so make some
new ones that do: tap_hdr_iov() which constructs an iovec suitable for
containing (just) the TAP specific header and tap_hdr_update() which
updates it as necessary per-packet.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tap.h | 27 +++++++++++++++++++++++++++
 tcp.c | 40 +++++++++++++++-------------------------
 2 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/tap.h b/tap.h
index dbc23b31..75aa3f03 100644
--- a/tap.h
+++ b/tap.h
@@ -16,6 +16,33 @@ struct tap_hdr {
 	uint32_t vnet_len;
 } __attribute__((packed));
 
+/**
+ * tap_hdr_iov() - struct iovec for a tap header
+ * @c:		Execution context
+ * @taph:	Pointer to tap specific header buffer
+ *
+ * Returns: A struct iovec covering the correct portion of @taph to use as the
+ *          TAP specific header in the current configuration.
+ */
+static inline struct iovec tap_hdr_iov(const struct ctx *c,
+				       struct tap_hdr *thdr)
+{
+	return (struct iovec){
+		.iov_base = thdr,
+		.iov_len = c->mode == MODE_PASST ? sizeof(*thdr) : 0,
+	};
+}
+
+/**
+ * tap_hdr_update() - Update the tap specific header for a frame
+ * @taph:	Tap specific header buffer to update
+ * @l2len:	Frame length (including L2 headers)
+ */
+static inline void tap_hdr_update(struct tap_hdr *thdr, size_t l2len)
+{
+	thdr->vnet_len = htonl(l2len);
+}
+
 static inline size_t tap_hdr_len_(const struct ctx *c)
 {
 	if (c->mode == MODE_PASST)
diff --git a/tcp.c b/tcp.c
index 08899367..ad409fc5 100644
--- a/tcp.c
+++ b/tcp.c
@@ -452,7 +452,7 @@ struct tcp_flags_t {
 /* Ethernet header for IPv4 frames */
 static struct ethhdr		tcp4_eth_src;
 
-static uint32_t			tcp4_payload_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp4_payload_tap_hdr[TCP_FRAMES_MEM];
 /* IPv4 headers */
 static struct iphdr		tcp4_payload_ip[TCP_FRAMES_MEM];
 /* TCP segments with payload for IPv4 frames */
@@ -463,7 +463,7 @@ static_assert(MSS4 <= sizeof(tcp4_payload[0].data), "MSS4 is greater than 65516"
 static struct tcp_buf_seq_update tcp4_seq_update[TCP_FRAMES_MEM];
 static unsigned int tcp4_payload_used;
 
-static uint32_t			tcp4_flags_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp4_flags_tap_hdr[TCP_FRAMES_MEM];
 /* IPv4 headers for TCP segment without payload */
 static struct iphdr		tcp4_flags_ip[TCP_FRAMES_MEM];
 /* TCP segments without payload for IPv4 frames */
@@ -474,7 +474,7 @@ static unsigned int tcp4_flags_used;
 /* Ethernet header for IPv6 frames */
 static struct ethhdr		tcp6_eth_src;
 
-static uint32_t			tcp6_payload_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp6_payload_tap_hdr[TCP_FRAMES_MEM];
 /* IPv6 headers */
 static struct ipv6hdr		tcp6_payload_ip[TCP_FRAMES_MEM];
 /* TCP headers and data for IPv6 frames */
@@ -485,7 +485,7 @@ static_assert(MSS6 <= sizeof(tcp6_payload[0].data), "MSS6 is greater than 65516"
 static struct tcp_buf_seq_update tcp6_seq_update[TCP_FRAMES_MEM];
 static unsigned int tcp6_payload_used;
 
-static uint32_t			tcp6_flags_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp6_flags_tap_hdr[TCP_FRAMES_MEM];
 /* IPv6 headers for TCP segment without payload */
 static struct ipv6hdr		tcp6_flags_ip[TCP_FRAMES_MEM];
 /* TCP segment without payload for IPv6 frames */
@@ -499,14 +499,14 @@ static struct iovec	iov_sock		[TCP_FRAMES_MEM + 1];
 
 /*
  * enum tcp_iov_parts - I/O vector parts for one TCP frame
- * @TCP_IOV_VLEN	virtio net header
+ * @TCP_IOV_TAP		tap backend specific header
  * @TCP_IOV_ETH		Ethernet header
  * @TCP_IOV_IP		IP (v4/v6) header
  * @TCP_IOV_PAYLOAD	IP payload (TCP header + data)
  * @TCP_NUM_IOVS 	the number of entries in the iovec array
  */
 enum tcp_iov_parts {
-	TCP_IOV_VLEN	= 0,
+	TCP_IOV_TAP	= 0,
 	TCP_IOV_ETH	= 1,
 	TCP_IOV_IP	= 2,
 	TCP_IOV_PAYLOAD	= 3,
@@ -953,9 +953,7 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp4_l2_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp4_payload_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-				           sizeof(tcp4_payload_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_payload_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp4_payload_ip[i];
@@ -966,9 +964,7 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp4_l2_flags_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp4_flags_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					     sizeof(tcp4_flags_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_flags_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp4_flags_ip[i];
@@ -1004,9 +1000,7 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp6_l2_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp6_payload_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					   sizeof(tcp6_payload_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_payload_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp6_payload_ip[i];
@@ -1017,9 +1011,7 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp6_l2_flags_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp6_flags_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					     sizeof(tcp6_flags_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_flags_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp6_flags_ip[i];
@@ -1658,7 +1650,7 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 					conn->seq_to_tap);
 	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
 
-	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + l4len);
+	tap_hdr_update(iov[TCP_IOV_TAP].iov_base, vnet_len + l4len);
 
 	if (th->ack) {
 		if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap))
@@ -2141,7 +2133,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 {
 	uint32_t *seq_update = &conn->seq_to_tap;
 	struct iovec *iov;
-	uint32_t vnet_len;
 	size_t l4len;
 
 	if (CONN_V4(conn)) {
@@ -2159,8 +2150,8 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp4_l2_iov[tcp4_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr) + l4len;
-		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
+		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
+			       + sizeof(struct iphdr) + sizeof(struct ethhdr));
 		if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	} else if (CONN_V6(conn)) {
@@ -2170,9 +2161,8 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp6_l2_iov[tcp6_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr)
-			+ l4len;
-		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
+		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
+			       + sizeof(struct ipv6hdr) + sizeof(struct ethhdr));
 		if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	}
-- 
@@ -452,7 +452,7 @@ struct tcp_flags_t {
 /* Ethernet header for IPv4 frames */
 static struct ethhdr		tcp4_eth_src;
 
-static uint32_t			tcp4_payload_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp4_payload_tap_hdr[TCP_FRAMES_MEM];
 /* IPv4 headers */
 static struct iphdr		tcp4_payload_ip[TCP_FRAMES_MEM];
 /* TCP segments with payload for IPv4 frames */
@@ -463,7 +463,7 @@ static_assert(MSS4 <= sizeof(tcp4_payload[0].data), "MSS4 is greater than 65516"
 static struct tcp_buf_seq_update tcp4_seq_update[TCP_FRAMES_MEM];
 static unsigned int tcp4_payload_used;
 
-static uint32_t			tcp4_flags_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp4_flags_tap_hdr[TCP_FRAMES_MEM];
 /* IPv4 headers for TCP segment without payload */
 static struct iphdr		tcp4_flags_ip[TCP_FRAMES_MEM];
 /* TCP segments without payload for IPv4 frames */
@@ -474,7 +474,7 @@ static unsigned int tcp4_flags_used;
 /* Ethernet header for IPv6 frames */
 static struct ethhdr		tcp6_eth_src;
 
-static uint32_t			tcp6_payload_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp6_payload_tap_hdr[TCP_FRAMES_MEM];
 /* IPv6 headers */
 static struct ipv6hdr		tcp6_payload_ip[TCP_FRAMES_MEM];
 /* TCP headers and data for IPv6 frames */
@@ -485,7 +485,7 @@ static_assert(MSS6 <= sizeof(tcp6_payload[0].data), "MSS6 is greater than 65516"
 static struct tcp_buf_seq_update tcp6_seq_update[TCP_FRAMES_MEM];
 static unsigned int tcp6_payload_used;
 
-static uint32_t			tcp6_flags_vnet_len[TCP_FRAMES_MEM];
+static struct tap_hdr		tcp6_flags_tap_hdr[TCP_FRAMES_MEM];
 /* IPv6 headers for TCP segment without payload */
 static struct ipv6hdr		tcp6_flags_ip[TCP_FRAMES_MEM];
 /* TCP segment without payload for IPv6 frames */
@@ -499,14 +499,14 @@ static struct iovec	iov_sock		[TCP_FRAMES_MEM + 1];
 
 /*
  * enum tcp_iov_parts - I/O vector parts for one TCP frame
- * @TCP_IOV_VLEN	virtio net header
+ * @TCP_IOV_TAP		tap backend specific header
  * @TCP_IOV_ETH		Ethernet header
  * @TCP_IOV_IP		IP (v4/v6) header
  * @TCP_IOV_PAYLOAD	IP payload (TCP header + data)
  * @TCP_NUM_IOVS 	the number of entries in the iovec array
  */
 enum tcp_iov_parts {
-	TCP_IOV_VLEN	= 0,
+	TCP_IOV_TAP	= 0,
 	TCP_IOV_ETH	= 1,
 	TCP_IOV_IP	= 2,
 	TCP_IOV_PAYLOAD	= 3,
@@ -953,9 +953,7 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp4_l2_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp4_payload_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-				           sizeof(tcp4_payload_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_payload_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp4_payload_ip[i];
@@ -966,9 +964,7 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp4_l2_flags_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp4_flags_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					     sizeof(tcp4_flags_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_flags_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp4_flags_ip[i];
@@ -1004,9 +1000,7 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp6_l2_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp6_payload_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					   sizeof(tcp6_payload_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_payload_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp6_payload_ip[i];
@@ -1017,9 +1011,7 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 	for (i = 0; i < TCP_FRAMES_MEM; i++) {
 		iov = tcp6_l2_flags_iov[i];
 
-		iov[TCP_IOV_VLEN].iov_base = &tcp6_flags_vnet_len[i];
-		iov[TCP_IOV_VLEN].iov_len = c->mode == MODE_PASTA ? 0 :
-					     sizeof(tcp6_flags_vnet_len[i]);
+		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_flags_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
 		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
 		iov[TCP_IOV_IP].iov_base = &tcp6_flags_ip[i];
@@ -1658,7 +1650,7 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 					conn->seq_to_tap);
 	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
 
-	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + l4len);
+	tap_hdr_update(iov[TCP_IOV_TAP].iov_base, vnet_len + l4len);
 
 	if (th->ack) {
 		if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap))
@@ -2141,7 +2133,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 {
 	uint32_t *seq_update = &conn->seq_to_tap;
 	struct iovec *iov;
-	uint32_t vnet_len;
 	size_t l4len;
 
 	if (CONN_V4(conn)) {
@@ -2159,8 +2150,8 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp4_l2_iov[tcp4_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr) + l4len;
-		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
+		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
+			       + sizeof(struct iphdr) + sizeof(struct ethhdr));
 		if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	} else if (CONN_V6(conn)) {
@@ -2170,9 +2161,8 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp6_l2_iov[tcp6_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr)
-			+ l4len;
-		*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len);
+		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
+			       + sizeof(struct ipv6hdr) + sizeof(struct ethhdr));
 		if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	}
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
                   ` (4 preceding siblings ...)
  2024-04-29  7:09 ` [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-30 18:47   ` Stefano Brivio
  2024-04-29  7:09 ` [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]() David Gibson
  6 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

Laurent's recent changes mean we use IO vectors much more heavily in the
TCP code.  In many of those cases, and few others around the code base,
individual iovs of these vectors are constructed to exactly cover existing
variables or fields.  We can make initializing such iovs shorter and
clearer with a macro for the purpose.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 iov.h |  3 +++
 tap.c |  3 +--
 tcp.c | 24 +++++++++---------------
 udp.c |  7 +++----
 4 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/iov.h b/iov.h
index 6058af77..5668ca5f 100644
--- a/iov.h
+++ b/iov.h
@@ -18,6 +18,9 @@
 #include <unistd.h>
 #include <string.h>
 
+#define IOV_OF_LVALUE(lval) \
+	(struct iovec){ .iov_base = &(lval), .iov_len = sizeof(lval) }
+
 size_t iov_skip_bytes(const struct iovec *iov, size_t n,
 		      size_t skip, size_t *offset);
 size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt,
diff --git a/tap.c b/tap.c
index 6a08cca6..5db7b169 100644
--- a/tap.c
+++ b/tap.c
@@ -79,8 +79,7 @@ void tap_send_single(const struct ctx *c, const void *data, size_t len)
 	size_t iovcnt = 0;
 
 	if (c->mode == MODE_PASST) {
-		iov[iovcnt].iov_base = &vnet_len;
-		iov[iovcnt].iov_len = sizeof(vnet_len);
+		iov[iovcnt] = IOV_OF_LVALUE(vnet_len);
 		iovcnt++;
 	}
 
diff --git a/tcp.c b/tcp.c
index ad409fc5..27c06958 100644
--- a/tcp.c
+++ b/tcp.c
@@ -290,6 +290,7 @@
 
 #include "checksum.h"
 #include "util.h"
+#include "iov.h"
 #include "ip.h"
 #include "passt.h"
 #include "tap.h"
@@ -954,10 +955,8 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 		iov = tcp4_l2_iov[i];
 
 		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_payload_tap_hdr[i]);
-		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
-		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
-		iov[TCP_IOV_IP].iov_base = &tcp4_payload_ip[i];
-		iov[TCP_IOV_IP].iov_len = sizeof(tcp4_payload_ip[i]);
+		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp4_eth_src);
+		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp4_payload_ip[i]);
 		iov[TCP_IOV_PAYLOAD].iov_base = &tcp4_payload[i];
 	}
 
@@ -966,9 +965,8 @@ static void tcp_sock4_iov_init(const struct ctx *c)
 
 		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_flags_tap_hdr[i]);
 		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
-		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
-		iov[TCP_IOV_IP].iov_base = &tcp4_flags_ip[i];
-		iov[TCP_IOV_IP].iov_len = sizeof(tcp4_flags_ip[i]);
+		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp4_eth_src);
+		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp4_flags_ip[i]);
 		iov[TCP_IOV_PAYLOAD].iov_base = &tcp4_flags[i];
 	}
 }
@@ -1001,10 +999,8 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 		iov = tcp6_l2_iov[i];
 
 		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_payload_tap_hdr[i]);
-		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
-		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
-		iov[TCP_IOV_IP].iov_base = &tcp6_payload_ip[i];
-		iov[TCP_IOV_IP].iov_len = sizeof(tcp6_payload_ip[i]);
+		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp6_eth_src);
+		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_payload_ip[i]);
 		iov[TCP_IOV_PAYLOAD].iov_base = &tcp6_payload[i];
 	}
 
@@ -1012,10 +1008,8 @@ static void tcp_sock6_iov_init(const struct ctx *c)
 		iov = tcp6_l2_flags_iov[i];
 
 		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_flags_tap_hdr[i]);
-		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
-		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
-		iov[TCP_IOV_IP].iov_base = &tcp6_flags_ip[i];
-		iov[TCP_IOV_IP].iov_len = sizeof(tcp6_flags_ip[i]);
+		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp6_eth_src);
+		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_flags_ip[i]);
 		iov[TCP_IOV_PAYLOAD].iov_base = &tcp6_flags[i];
 	}
 }
diff --git a/udp.c b/udp.c
index 1a4c02f4..545212c5 100644
--- a/udp.c
+++ b/udp.c
@@ -113,6 +113,7 @@
 
 #include "checksum.h"
 #include "util.h"
+#include "iov.h"
 #include "ip.h"
 #include "siphash.h"
 #include "inany.h"
@@ -315,8 +316,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
 		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
 	};
 
-	siov->iov_base	= buf->data;
-	siov->iov_len	= sizeof(buf->data);
+	*siov		= IOV_OF_LVALUE(buf->data);
 
 	mh->msg_name	= &buf->s_in;
 	mh->msg_namelen	= sizeof(buf->s_in);
@@ -343,8 +343,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
 		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
 	};
 
-	siov->iov_base	= buf->data;
-	siov->iov_len	= sizeof(buf->data);
+	*siov		 = IOV_OF_LVALUE(buf->data);
 
 	mh->msg_name	= &buf->s_in6;
 	mh->msg_namelen	= sizeof(buf->s_in6);
-- 
@@ -113,6 +113,7 @@
 
 #include "checksum.h"
 #include "util.h"
+#include "iov.h"
 #include "ip.h"
 #include "siphash.h"
 #include "inany.h"
@@ -315,8 +316,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
 		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
 	};
 
-	siov->iov_base	= buf->data;
-	siov->iov_len	= sizeof(buf->data);
+	*siov		= IOV_OF_LVALUE(buf->data);
 
 	mh->msg_name	= &buf->s_in;
 	mh->msg_namelen	= sizeof(buf->s_in);
@@ -343,8 +343,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
 		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
 	};
 
-	siov->iov_base	= buf->data;
-	siov->iov_len	= sizeof(buf->data);
+	*siov		 = IOV_OF_LVALUE(buf->data);
 
 	mh->msg_name	= &buf->s_in6;
 	mh->msg_namelen	= sizeof(buf->s_in6);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]()
  2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
                   ` (5 preceding siblings ...)
  2024-04-29  7:09 ` [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields David Gibson
@ 2024-04-29  7:09 ` David Gibson
  2024-04-30 18:48   ` Stefano Brivio
  6 siblings, 1 reply; 18+ messages in thread
From: David Gibson @ 2024-04-29  7:09 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: Laurent Vivier, David Gibson

tcp_fill_headers[46]() fill most of the headers, but the tap specific
header (the frame length for qemu sockets) is filled in afterwards.
Filling this as well:
  * Removes a little redundancy between the tcp_send_flag() and
    tcp_data_to_tap() path
  * Makes calculation of the correct length a little easier
  * Removes the now misleadingly named 'vnet_len' variable in
    tcp_send_flag()

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tcp.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tcp.c b/tcp.c
index 27c06958..01987c04 100644
--- a/tcp.c
+++ b/tcp.c
@@ -1321,6 +1321,7 @@ static void tcp_fill_header(struct tcphdr *th,
  * tcp_fill_headers4() - Fill 802.3, IPv4, TCP headers in pre-cooked buffers
  * @c:		Execution context
  * @conn:	Connection pointer
+ * @taph:	TAP backend specific header
  * @iph:	Pointer to IPv4 header
  * @th:		Pointer to TCP header
  * @plen:	Payload length (including TCP header options)
@@ -1331,6 +1332,7 @@ static void tcp_fill_header(struct tcphdr *th,
  */
 static size_t tcp_fill_headers4(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
+				struct tap_hdr *taph,
 				struct iphdr *iph, struct tcphdr *th,
 				size_t plen, const uint16_t *check,
 				uint32_t seq)
@@ -1353,6 +1355,8 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 
 	tcp_update_check_tcp4(iph, th);
 
+	tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
+
 	return l4len;
 }
 
@@ -1360,6 +1364,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  * tcp_fill_headers6() - Fill 802.3, IPv6, TCP headers in pre-cooked buffers
  * @c:		Execution context
  * @conn:	Connection pointer
+ * @taph:	TAP backend specific header
  * @ip6h:	Pointer to IPv6 header
  * @th:		Pointer to TCP header
  * @plen:	Payload length (including TCP header options)
@@ -1370,6 +1375,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  */
 static size_t tcp_fill_headers6(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
+				struct tap_hdr *taph,
 				struct ipv6hdr *ip6h, struct tcphdr *th,
 				size_t plen, uint32_t seq)
 {
@@ -1394,6 +1400,8 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 
 	tcp_update_check_tcp6(ip6h, th);
 
+	tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
+
 	return l4len;
 }
 
@@ -1416,12 +1424,14 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
 
 	if (a4) {
-		return tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
+		return tcp_fill_headers4(c, conn, iov[TCP_IOV_TAP].iov_base,
+					 iov[TCP_IOV_IP].iov_base,
 					 iov[TCP_IOV_PAYLOAD].iov_base, plen,
 					 check, seq);
 	}
 
-	return tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
+	return tcp_fill_headers6(c, conn, iov[TCP_IOV_TAP].iov_base,
+				 iov[TCP_IOV_IP].iov_base,
 				 iov[TCP_IOV_PAYLOAD].iov_base, plen,
 				 seq);
 }
@@ -1556,7 +1566,6 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	struct tcp_info tinfo = { 0 };
 	socklen_t sl = sizeof(tinfo);
 	int s = conn->sock;
-	uint32_t vnet_len;
 	size_t optlen = 0;
 	struct tcphdr *th;
 	struct iovec *iov;
@@ -1583,13 +1592,10 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	if (!tcp_update_seqack_wnd(c, conn, flags, &tinfo) && !flags)
 		return 0;
 
-	if (CONN_V4(conn)) {
+	if (CONN_V4(conn))
 		iov = tcp4_l2_flags_iov[tcp4_flags_used++];
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr);
-	} else {
+	else
 		iov = tcp6_l2_flags_iov[tcp6_flags_used++];
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr);
-	}
 
 	payload = iov[TCP_IOV_PAYLOAD].iov_base;
 	th = &payload->th;
@@ -1644,8 +1650,6 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 					conn->seq_to_tap);
 	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
 
-	tap_hdr_update(iov[TCP_IOV_TAP].iov_base, vnet_len + l4len);
-
 	if (th->ack) {
 		if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap))
 			conn_flag(c, conn, ~ACK_TO_TAP_DUE);
@@ -2144,8 +2148,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp4_l2_iov[tcp4_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
-			       + sizeof(struct iphdr) + sizeof(struct ethhdr));
 		if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	} else if (CONN_V6(conn)) {
@@ -2155,8 +2157,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp6_l2_iov[tcp6_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
-			       + sizeof(struct ipv6hdr) + sizeof(struct ethhdr));
 		if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	}
-- 
@@ -1321,6 +1321,7 @@ static void tcp_fill_header(struct tcphdr *th,
  * tcp_fill_headers4() - Fill 802.3, IPv4, TCP headers in pre-cooked buffers
  * @c:		Execution context
  * @conn:	Connection pointer
+ * @taph:	TAP backend specific header
  * @iph:	Pointer to IPv4 header
  * @th:		Pointer to TCP header
  * @plen:	Payload length (including TCP header options)
@@ -1331,6 +1332,7 @@ static void tcp_fill_header(struct tcphdr *th,
  */
 static size_t tcp_fill_headers4(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
+				struct tap_hdr *taph,
 				struct iphdr *iph, struct tcphdr *th,
 				size_t plen, const uint16_t *check,
 				uint32_t seq)
@@ -1353,6 +1355,8 @@ static size_t tcp_fill_headers4(const struct ctx *c,
 
 	tcp_update_check_tcp4(iph, th);
 
+	tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
+
 	return l4len;
 }
 
@@ -1360,6 +1364,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  * tcp_fill_headers6() - Fill 802.3, IPv6, TCP headers in pre-cooked buffers
  * @c:		Execution context
  * @conn:	Connection pointer
+ * @taph:	TAP backend specific header
  * @ip6h:	Pointer to IPv6 header
  * @th:		Pointer to TCP header
  * @plen:	Payload length (including TCP header options)
@@ -1370,6 +1375,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
  */
 static size_t tcp_fill_headers6(const struct ctx *c,
 				const struct tcp_tap_conn *conn,
+				struct tap_hdr *taph,
 				struct ipv6hdr *ip6h, struct tcphdr *th,
 				size_t plen, uint32_t seq)
 {
@@ -1394,6 +1400,8 @@ static size_t tcp_fill_headers6(const struct ctx *c,
 
 	tcp_update_check_tcp6(ip6h, th);
 
+	tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
+
 	return l4len;
 }
 
@@ -1416,12 +1424,14 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
 	const struct in_addr *a4 = inany_v4(&conn->faddr);
 
 	if (a4) {
-		return tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
+		return tcp_fill_headers4(c, conn, iov[TCP_IOV_TAP].iov_base,
+					 iov[TCP_IOV_IP].iov_base,
 					 iov[TCP_IOV_PAYLOAD].iov_base, plen,
 					 check, seq);
 	}
 
-	return tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
+	return tcp_fill_headers6(c, conn, iov[TCP_IOV_TAP].iov_base,
+				 iov[TCP_IOV_IP].iov_base,
 				 iov[TCP_IOV_PAYLOAD].iov_base, plen,
 				 seq);
 }
@@ -1556,7 +1566,6 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	struct tcp_info tinfo = { 0 };
 	socklen_t sl = sizeof(tinfo);
 	int s = conn->sock;
-	uint32_t vnet_len;
 	size_t optlen = 0;
 	struct tcphdr *th;
 	struct iovec *iov;
@@ -1583,13 +1592,10 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 	if (!tcp_update_seqack_wnd(c, conn, flags, &tinfo) && !flags)
 		return 0;
 
-	if (CONN_V4(conn)) {
+	if (CONN_V4(conn))
 		iov = tcp4_l2_flags_iov[tcp4_flags_used++];
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct iphdr);
-	} else {
+	else
 		iov = tcp6_l2_flags_iov[tcp6_flags_used++];
-		vnet_len = sizeof(struct ethhdr) + sizeof(struct ipv6hdr);
-	}
 
 	payload = iov[TCP_IOV_PAYLOAD].iov_base;
 	th = &payload->th;
@@ -1644,8 +1650,6 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
 					conn->seq_to_tap);
 	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
 
-	tap_hdr_update(iov[TCP_IOV_TAP].iov_base, vnet_len + l4len);
-
 	if (th->ack) {
 		if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap))
 			conn_flag(c, conn, ~ACK_TO_TAP_DUE);
@@ -2144,8 +2148,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp4_l2_iov[tcp4_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, check, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
-			       + sizeof(struct iphdr) + sizeof(struct ethhdr));
 		if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	} else if (CONN_V6(conn)) {
@@ -2155,8 +2157,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
 		iov = tcp6_l2_iov[tcp6_payload_used++];
 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen, NULL, seq);
 		iov[TCP_IOV_PAYLOAD].iov_len = l4len;
-		tap_hdr_update(iov[TCP_IOV_TAP].iov_base, l4len
-			       + sizeof(struct ipv6hdr) + sizeof(struct ethhdr));
 		if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
 			tcp_payload_flush(c);
 	}
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers
  2024-04-29  7:09 ` [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers David Gibson
@ 2024-04-30 18:46   ` Stefano Brivio
  2024-04-30 23:53     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Brivio @ 2024-04-30 18:46 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Laurent Vivier

On Mon, 29 Apr 2024 17:09:28 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> In some places (well, actually only UDP now) we use struct tap_hdr to
> represent both tap backend specific and L2 ethernet headers.  Handling
> these together seemed like a good idea at the time, but Laurent's changes
> in the TCP code working towards vhost-user support suggest that treating
> them separately is more useful, more often.
> 
> Alter struct tap_hdr to represent only the TAP backend specific headers.
> Updated related helpers and the UDP code to match.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  tap.h | 21 +++++++++------------
>  udp.c | 23 ++++++++++++++---------
>  2 files changed, 23 insertions(+), 21 deletions(-)
> 
> diff --git a/tap.h b/tap.h
> index 2adc4e2b..dbc23b31 100644
> --- a/tap.h
> +++ b/tap.h
> @@ -6,30 +6,28 @@
>  #ifndef TAP_H
>  #define TAP_H
>  
> +#define ETH_HDR_INIT(proto) { .h_proto = htons_constant(proto) }
> +
>  /**
> - * struct tap_hdr - L2 and tap specific headers
> + * struct tap_hdr - tap backend specific headers
>   * @vnet_len:	Frame length (for qemu socket transport)
> - * @eh:		Ethernet header
>   */
>  struct tap_hdr {
>  	uint32_t vnet_len;
> -	struct ethhdr eh;
>  } __attribute__((packed));

No need to have it packed anymore.

> -#define TAP_HDR_INIT(proto) { .eh.h_proto = htons_constant(proto) }
> -
>  static inline size_t tap_hdr_len_(const struct ctx *c)
>  {
>  	if (c->mode == MODE_PASST)
>  		return sizeof(struct tap_hdr);
>  	else
> -		return sizeof(struct ethhdr);
> +		return 0;
>  }
>  
>  /**
>   * tap_frame_base() - Find start of tap frame
>   * @c:		Execution context
> - * @taph:	Pointer to L2 and tap specific header buffer
> + * @taph:	Pointer to tap specific header buffer
>   *
>   * Returns: pointer to the start of tap frame - suitable for an
>   *          iov_base to be passed to tap_send_frames())
> @@ -43,17 +41,16 @@ static inline void *tap_frame_base(const struct ctx *c, struct tap_hdr *taph)
>   * tap_frame_len() - Finalize tap frame and return total length
>   * @c:		Execution context
>   * @taph:	Tap header to finalize
> - * @plen:	L3 packet length (excludes L2 and tap specific headers)
> + * @plen:	L2 packet length (includes L2, excludes tap specific headers)
>   *
> - * Returns: length of the tap frame including L2 and tap specific
> - *          headers - suitable for an iov_len to be passed to
> - *          tap_send_frames()
> + * Returns: length of the tap frame including tap specific headers - suitable
> + *          for an iov_len to be passed to tap_send_frames()
>   */
>  static inline size_t tap_frame_len(const struct ctx *c, struct tap_hdr *taph,
>  				   size_t plen)
>  {
>  	if (c->mode == MODE_PASST)
> -		taph->vnet_len = htonl(plen + sizeof(taph->eh));
> +		taph->vnet_len = htonl(plen);
>  	return plen + tap_hdr_len_(c);
>  }
>  
> diff --git a/udp.c b/udp.c
> index 594ea191..c3e4f6b6 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -173,7 +173,8 @@ static uint8_t udp_act[IP_VERSIONS][UDP_ACT_TYPE_MAX][DIV_ROUND_UP(NUM_PORTS, 8)
>  /**
>   * udp4_l2_buf_t - Pre-cooked IPv4 packet buffers for tap connections
>   * @s_in:	Source socket address, filled in by recvmmsg()
> - * @taph:	Tap-level headers (partially pre-filled)
> + * @taph:	Tap backend specific header
> + * @eh:		Prefilled ethernet header
>   * @iph:	Pre-filled IP header (except for tot_len and saddr)
>   * @uh:		Headroom for UDP header
>   * @data:	Storage for UDP payload
> @@ -182,6 +183,7 @@ static struct udp4_l2_buf_t {
>  	struct sockaddr_in s_in;
>  
>  	struct tap_hdr taph;
> +	struct ethhdr eh;
>  	struct iphdr iph;
>  	struct udphdr uh;
>  	uint8_t data[USHRT_MAX -
> @@ -192,7 +194,8 @@ udp4_l2_buf[UDP_MAX_FRAMES];
>  /**
>   * udp6_l2_buf_t - Pre-cooked IPv6 packet buffers for tap connections
>   * @s_in6:	Source socket address, filled in by recvmmsg()
> - * @taph:	Tap-level headers (partially pre-filled)
> + * @taph:	Tap backend specific header
> + * @eh:		Pre-filled ethernet header
>   * @ip6h:	Pre-filled IP header (except for payload_len and addresses)
>   * @uh:		Headroom for UDP header
>   * @data:	Storage for UDP payload
> @@ -202,10 +205,11 @@ struct udp6_l2_buf_t {
>  #ifdef __AVX2__
>  	/* Align ip6h to 32-byte boundary. */
>  	uint8_t pad[64 - (sizeof(struct sockaddr_in6) + sizeof(struct ethhdr) +
> -			  sizeof(uint32_t))];
> +			  sizeof(struct tap_hdr))];
>  #endif
>  
>  	struct tap_hdr taph;
> +	struct ethhdr eh;
>  	struct ipv6hdr ip6h;
>  	struct udphdr uh;
>  	uint8_t data[USHRT_MAX -
> @@ -289,8 +293,8 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s)
>  		struct udp4_l2_buf_t *b4 = &udp4_l2_buf[i];
>  		struct udp6_l2_buf_t *b6 = &udp6_l2_buf[i];
>  
> -		eth_update_mac(&b4->taph.eh, eth_d, eth_s);
> -		eth_update_mac(&b6->taph.eh, eth_d, eth_s);
> +		eth_update_mac(&b4->eh, eth_d, eth_s);
> +		eth_update_mac(&b6->eh, eth_d, eth_s);
>  	}
>  }
>  
> @@ -307,7 +311,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
>  	struct iovec *tiov = &udp4_l2_iov_tap[i];
>  
>  	*buf = (struct udp4_l2_buf_t) {
> -		.taph = TAP_HDR_INIT(ETH_P_IP),
> +		.eh  = ETH_HDR_INIT(ETH_P_IP),
>  		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
>  	};
>  
> @@ -335,7 +339,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
>  	struct iovec *tiov = &udp6_l2_iov_tap[i];
>  
>  	*buf = (struct udp6_l2_buf_t) {
> -		.taph = TAP_HDR_INIT(ETH_P_IPV6),
> +		.eh   = ETH_HDR_INIT(ETH_P_IPV6),
>  		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
>  	};
>  
> @@ -608,7 +612,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
>  	b->uh.dest = htons(dstport);
>  	b->uh.len = htons(datalen + sizeof(b->uh));
>  
> -	return tap_frame_len(c, &b->taph, ip_len);
> +	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
>  }
>  
>  /**
> @@ -676,7 +680,8 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
>  	b->uh.len = b->ip6h.payload_len;
>  	csum_udp6(&b->uh, src, dst, b->data, datalen);
>  
> -	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h));
> +	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
> +			     + sizeof(b->eh));

Nit: the + operator should be on the first line for consistency, and
for readability I'd rather do:

	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h) +
							sizeof(b->eh));

-- 
Stefano


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/7] treewide: Standardise variable names for various packet lengths
  2024-04-29  7:09 ` [PATCH 3/7] treewide: Standardise variable names for various packet lengths David Gibson
@ 2024-04-30 18:46   ` Stefano Brivio
  2024-05-01  0:05     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Brivio @ 2024-04-30 18:46 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Laurent Vivier

On Mon, 29 Apr 2024 17:09:29 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> At various points we need to track the lengths of a packet including or
> excluding various different sets of headers.  We don't always use the same
> variable names for doing so.  Worse in some places we use the same name
> for different things: e.g. tcp_fill_headers[46]() use ip_len for the
> length including the IP headers, but then tcp_send_flag() which calls it
> uses it to mean the IP payload length only.
> 
> To improve clarity, standardise on these names:
>    plen:		L4 protocol payload length
>    l4len:		plen + length of L4 protocol header
>    l3len:		l4len + length of IPv4/IPv6 header

If we're dealing with IPv4, I guess tot_len is still a reasonable
synonym for this, as you left for csum_ip4_header().

>    l2len:		l3len + length of L2 (ethernet) header
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  checksum.c | 42 +++++++++++++++++-----------------
>  checksum.h | 10 ++++-----
>  icmp.c     |  8 +++----
>  passt.h    |  4 ++--
>  tap.c      | 48 +++++++++++++++++++--------------------
>  tcp.c      | 66 +++++++++++++++++++++++++++---------------------------
>  udp.c      | 24 ++++++++++----------
>  7 files changed, 101 insertions(+), 101 deletions(-)
> 
> diff --git a/checksum.c b/checksum.c
> index 9cbe0b29..3602215a 100644
> --- a/checksum.c
> +++ b/checksum.c
> @@ -116,7 +116,7 @@ uint16_t csum_fold(uint32_t sum)
>  
>  /**
>   * csum_ip4_header() - Calculate IPv4 header checksum
> - * @tot_len:	IPv4 payload length (data + IP header, network order)
> + * @tot_len:	IPv4 packet length (data + IP header, network order)
>   * @protocol:	Protocol number (network order)
>   * @saddr:	IPv4 source address (network order)
>   * @daddr:	IPv4 destination address (network order)
> @@ -140,13 +140,13 @@ uint16_t csum_ip4_header(uint16_t tot_len, uint8_t protocol,
>  /**
>   * proto_ipv4_header_psum() - Calculates the partial checksum of an
>   * 			      IPv4 header for UDP or TCP
> - * @tot_len:	IPv4 Payload length (host order)
> + * @l4len:	IPv4 Payload length (host order)

Ouch, how did this end up being called tot_len...

>   * @proto:	Protocol number (host order)
>   * @saddr:	Source address  (network order)
>   * @daddr:	Destination address (network order)
>   * Returns:	Partial checksum of the IPv4 header
>   */
> -uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
> +uint32_t proto_ipv4_header_psum(uint16_t l4len, uint8_t protocol,
>  				struct in_addr saddr, struct in_addr daddr)
>  {
>  	uint32_t psum = htons(protocol);
> @@ -155,7 +155,7 @@ uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
>  	psum += saddr.s_addr & 0xffff;
>  	psum += (daddr.s_addr >> 16) & 0xffff;
>  	psum += daddr.s_addr & 0xffff;
> -	psum += htons(tot_len);
> +	psum += htons(l4len);
>  
>  	return psum;
>  }
> @@ -165,22 +165,22 @@ uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
>   * @udp4hr:	UDP header, initialised apart from checksum
>   * @saddr:	IPv4 source address
>   * @daddr:	IPv4 destination address
> - * @payload:	ICMPv4 packet payload
> - * @len:	Length of @payload (not including UDP)
> + * @payload:	UDP packet payload
> + * @plen:	Length of @payload (not including UDP header)
>   */
>  void csum_udp4(struct udphdr *udp4hr,
>  	       struct in_addr saddr, struct in_addr daddr,
> -	       const void *payload, size_t len)
> +	       const void *payload, size_t plen)
>  {
>  	/* UDP checksums are optional, so don't bother */
>  	udp4hr->check = 0;
>  
>  	if (UDP4_REAL_CHECKSUMS) {
> -		uint16_t tot_len = len + sizeof(struct udphdr);
> -		uint32_t psum = proto_ipv4_header_psum(tot_len, IPPROTO_UDP,
> +		uint16_t l4len = plen + sizeof(struct udphdr);
> +		uint32_t psum = proto_ipv4_header_psum(l4len, IPPROTO_UDP,
>  						       saddr, daddr);
>  		psum = csum_unfolded(udp4hr, sizeof(struct udphdr), psum);
> -		udp4hr->check = csum(payload, len, psum);
> +		udp4hr->check = csum(payload, plen, psum);
>  	}
>  }
>  
> @@ -188,9 +188,9 @@ void csum_udp4(struct udphdr *udp4hr,
>   * csum_icmp4() - Calculate and set checksum for an ICMP packet
>   * @icmp4hr:	ICMP header, initialised apart from checksum
>   * @payload:	ICMP packet payload
> - * @len:	Length of @payload (not including ICMP header)
> + * @plen:	Length of @payload (not including ICMP header)
>   */
> -void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
> +void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t plen)
>  {
>  	uint32_t psum;
>  
> @@ -199,7 +199,7 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
>  	/* Partial checksum for ICMP header alone */
>  	psum = sum_16b(icmp4hr, sizeof(*icmp4hr));
>  
> -	icmp4hr->checksum = csum(payload, len, psum);
> +	icmp4hr->checksum = csum(payload, plen, psum);
>  }
>  
>  /**
> @@ -227,18 +227,18 @@ uint32_t proto_ipv6_header_psum(uint16_t payload_len, uint8_t protocol,
>   * csum_udp6() - Calculate and set checksum for a UDP over IPv6 packet
>   * @udp6hr:	UDP header, initialised apart from checksum
>   * @payload:	UDP packet payload
> - * @len:	Length of @payload (not including UDP header)
> + * @plen:	Length of @payload (not including UDP header)
>   */
>  void csum_udp6(struct udphdr *udp6hr,
>  	       const struct in6_addr *saddr, const struct in6_addr *daddr,
> -	       const void *payload, size_t len)
> +	       const void *payload, size_t plen)
>  {
> -	uint32_t psum = proto_ipv6_header_psum(len + sizeof(struct udphdr),
> +	uint32_t psum = proto_ipv6_header_psum(plen + sizeof(struct udphdr),
>  					       IPPROTO_UDP, saddr, daddr);
>  	udp6hr->check = 0;
>  
>  	psum = csum_unfolded(udp6hr, sizeof(struct udphdr), psum);
> -	udp6hr->check = csum(payload, len, psum);
> +	udp6hr->check = csum(payload, plen, psum);
>  }
>  
>  /**
> @@ -247,19 +247,19 @@ void csum_udp6(struct udphdr *udp6hr,
>   * @saddr:	IPv6 source address
>   * @daddr:	IPv6 destination address
>   * @payload:	ICMP packet payload
> - * @len:	Length of @payload (not including ICMPv6 header)
> + * @plen:	Length of @payload (not including ICMPv6 header)
>   */
>  void csum_icmp6(struct icmp6hdr *icmp6hr,
>  		const struct in6_addr *saddr, const struct in6_addr *daddr,
> -		const void *payload, size_t len)
> +		const void *payload, size_t plen)
>  {
> -	uint32_t psum = proto_ipv6_header_psum(len + sizeof(*icmp6hr),
> +	uint32_t psum = proto_ipv6_header_psum(plen + sizeof(*icmp6hr),
>  					       IPPROTO_ICMPV6, saddr, daddr);
>  
>  	icmp6hr->icmp6_cksum = 0;
>  	/* Add in partial checksum for the ICMPv6 header alone */
>  	psum += sum_16b(icmp6hr, sizeof(*icmp6hr));
> -	icmp6hr->icmp6_cksum = csum(payload, len, psum);
> +	icmp6hr->icmp6_cksum = csum(payload, plen, psum);
>  }
>  
>  #ifdef __AVX2__
> diff --git a/checksum.h b/checksum.h
> index 0f396767..726ed7b5 100644
> --- a/checksum.h
> +++ b/checksum.h
> @@ -15,21 +15,21 @@ uint16_t csum_fold(uint32_t sum);
>  uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
>  uint16_t csum_ip4_header(uint16_t tot_len, uint8_t protocol,
>  			 struct in_addr saddr, struct in_addr daddr);
> -uint32_t proto_ipv4_header_psum(uint16_t tot_len, uint8_t protocol,
> +uint32_t proto_ipv4_header_psum(uint16_t l4len, uint8_t protocol,
>  				struct in_addr saddr, struct in_addr daddr);
>  void csum_udp4(struct udphdr *udp4hr,
>  	       struct in_addr saddr, struct in_addr daddr,
> -	       const void *payload, size_t len);
> -void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len);
> +	       const void *payload, size_t plen);
> +void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t plen);
>  uint32_t proto_ipv6_header_psum(uint16_t payload_len, uint8_t protocol,
>  				const struct in6_addr *saddr,
>  				const struct in6_addr *daddr);
>  void csum_udp6(struct udphdr *udp6hr,
>  	       const struct in6_addr *saddr, const struct in6_addr *daddr,
> -	       const void *payload, size_t len);
> +	       const void *payload, size_t plen);
>  void csum_icmp6(struct icmp6hdr *icmp6hr,
>  		const struct in6_addr *saddr, const struct in6_addr *daddr,
> -		const void *payload, size_t len);
> +		const void *payload, size_t plen);
>  uint32_t csum_unfolded(const void *buf, size_t len, uint32_t init);
>  uint16_t csum(const void *buf, size_t len, uint32_t init);
>  uint16_t csum_iov(const struct iovec *iov, size_t n, uint32_t init);
> diff --git a/icmp.c b/icmp.c
> index 76bb9e9f..263cd9ca 100644
> --- a/icmp.c
> +++ b/icmp.c
> @@ -224,8 +224,8 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
>  	union sockaddr_inany sa = { .sa_family = af };
>  	const socklen_t sl = af == AF_INET ? sizeof(sa.sa4) : sizeof(sa.sa6);
>  	struct icmp_ping_flow *pingf, **id_sock;
> +	size_t plen, l4len;
>  	uint16_t id, seq;
> -	size_t plen;
>  	void *pkt;
>  
>  	(void)saddr;
> @@ -238,7 +238,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
>  			return 1;
>  
>  		ih =  (struct icmphdr *)pkt;
> -		plen += sizeof(*ih);
> +		l4len = plen + sizeof(*ih);
>  
>  		if (ih->type != ICMP_ECHO)
>  			return 1;
> @@ -254,7 +254,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
>  			return 1;
>  
>  		ih = (struct icmp6hdr *)pkt;
> -		plen += sizeof(*ih);
> +		l4len = plen + sizeof(*ih);
>  
>  		if (ih->icmp6_type != ICMPV6_ECHO_REQUEST)
>  			return 1;
> @@ -274,7 +274,7 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af,
>  
>  	pingf->ts = now->tv_sec;
>  
> -	if (sendto(pingf->sock, pkt, plen, MSG_NOSIGNAL, &sa.sa, sl) < 0) {
> +	if (sendto(pingf->sock, pkt, l4len, MSG_NOSIGNAL, &sa.sa, sl) < 0) {
>  		flow_dbg(pingf, "failed to relay request to socket: %s",
>  			 strerror(errno));
>  	} else {
> diff --git a/passt.h b/passt.h
> index 76026f09..d1add470 100644
> --- a/passt.h
> +++ b/passt.h
> @@ -22,11 +22,11 @@ struct tap_msg {
>  /**
>   * struct tap_l4_msg - Layer-4 message descriptor for protocol handlers
>   * @pkt_buf_offset:	Offset of message from @pkt_buf
> - * @l4_len:		Length of Layer-4 payload, host order
> + * @l4len:		Length of Layer-4 payload, host order

This is (and was) ambiguous, but in any case, we don't use the struct
anymore since commit bb708111833e ("treewide: Packet abstraction with
mandatory boundary checks"), so I'm fine either way, we don't necessarily
need to drop it here, I think.

>   */
>  struct tap_l4_msg {
>  	uint32_t pkt_buf_offset;
> -	uint16_t l4_len;
> +	uint16_t l4len;
>  };
>  
>  union epoll_ref;
> diff --git a/tap.c b/tap.c
> index 13e4da79..6a08cca6 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -177,15 +177,15 @@ void tap_udp4_send(const struct ctx *c, struct in_addr src, in_port_t sport,
>  		   struct in_addr dst, in_port_t dport,
>  		   const void *in, size_t len)
>  {
> -	size_t udplen = len + sizeof(struct udphdr);
> +	size_t l4len = len + sizeof(struct udphdr);
>  	char buf[USHRT_MAX];
>  	struct iphdr *ip4h = tap_push_l2h(c, buf, ETH_P_IP);
> -	struct udphdr *uh = tap_push_ip4h(ip4h, src, dst, udplen, IPPROTO_UDP);
> +	struct udphdr *uh = tap_push_ip4h(ip4h, src, dst, l4len, IPPROTO_UDP);
>  	char *data = (char *)(uh + 1);
>  
>  	uh->source = htons(sport);
>  	uh->dest = htons(dport);
> -	uh->len = htons(udplen);
> +	uh->len = htons(l4len);
>  	csum_udp4(uh, src, dst, in, len);
>  	memcpy(data, in, len);
>  
> @@ -259,16 +259,16 @@ void tap_udp6_send(const struct ctx *c,
>  		   const struct in6_addr *dst, in_port_t dport,
>  		   uint32_t flow, const void *in, size_t len)
>  {
> -	size_t udplen = len + sizeof(struct udphdr);
> +	size_t l4len = len + sizeof(struct udphdr);
>  	char buf[USHRT_MAX];
>  	struct ipv6hdr *ip6h = tap_push_l2h(c, buf, ETH_P_IPV6);
>  	struct udphdr *uh = tap_push_ip6h(ip6h, src, dst,
> -					  udplen, IPPROTO_UDP, flow);
> +					  l4len, IPPROTO_UDP, flow);
>  	char *data = (char *)(uh + 1);
>  
>  	uh->source = htons(sport);
>  	uh->dest = htons(dport);
> -	uh->len = htons(udplen);
> +	uh->len = htons(l4len);
>  	csum_udp6(uh, src, dst, in, len);
>  	memcpy(data, in, len);
>  
> @@ -589,21 +589,21 @@ static int tap4_handler(struct ctx *c, const struct pool *in,
>  	i = 0;
>  resume:
>  	for (seq_count = 0, seq = NULL; i < in->count; i++) {
> -		size_t l2_len, l3_len, hlen, l4_len;
> +		size_t l2len, l3len, hlen, l4len;
>  		const struct ethhdr *eh;
>  		const struct udphdr *uh;
>  		struct iphdr *iph;
>  		const char *l4h;
>  
> -		packet_get(in, i, 0, 0, &l2_len);
> +		packet_get(in, i, 0, 0, &l2len);
>  
> -		eh = packet_get(in, i, 0, sizeof(*eh), &l3_len);
> +		eh = packet_get(in, i, 0, sizeof(*eh), &l3len);
>  		if (!eh)
>  			continue;
>  		if (ntohs(eh->h_proto) == ETH_P_ARP) {
>  			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
>  
> -			packet_add(pkt, l2_len, (char *)eh);
> +			packet_add(pkt, l2len, (char *)eh);
>  			arp(c, pkt);
>  			continue;
>  		}
> @@ -613,15 +613,15 @@ resume:
>  			continue;
>  
>  		hlen = iph->ihl * 4UL;
> -		if (hlen < sizeof(*iph) || htons(iph->tot_len) > l3_len ||
> -		    hlen > l3_len)
> +		if (hlen < sizeof(*iph) || htons(iph->tot_len) > l3len ||
> +		    hlen > l3len)
>  			continue;
>  
>  		/* We don't handle IP fragments, drop them */
>  		if (tap4_is_fragment(iph, now))
>  			continue;
>  
> -		l4_len = htons(iph->tot_len) - hlen;
> +		l4len = htons(iph->tot_len) - hlen;
>  
>  		if (IN4_IS_ADDR_LOOPBACK(&iph->saddr) ||
>  		    IN4_IS_ADDR_LOOPBACK(&iph->daddr)) {
> @@ -636,7 +636,7 @@ resume:
>  		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr)
>  			c->ip4.addr_seen.s_addr = iph->saddr;
>  
> -		l4h = packet_get(in, i, sizeof(*eh) + hlen, l4_len, NULL);
> +		l4h = packet_get(in, i, sizeof(*eh) + hlen, l4len, NULL);
>  		if (!l4h)
>  			continue;
>  
> @@ -648,7 +648,7 @@ resume:
>  
>  			tap_packet_debug(iph, NULL, NULL, 0, NULL, 1);
>  
> -			packet_add(pkt, l4_len, l4h);
> +			packet_add(pkt, l4len, l4h);
>  			icmp_tap_handler(c, PIF_TAP, AF_INET,
>  					 &iph->saddr, &iph->daddr,
>  					 pkt, now);
> @@ -662,7 +662,7 @@ resume:
>  		if (iph->protocol == IPPROTO_UDP) {
>  			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
>  
> -			packet_add(pkt, l2_len, (char *)eh);
> +			packet_add(pkt, l2len, (char *)eh);
>  			if (dhcp(c, pkt))
>  				continue;
>  		}
> @@ -711,7 +711,7 @@ resume:
>  #undef L4_SET
>  
>  append:
> -		packet_add((struct pool *)&seq->p, l4_len, l4h);
> +		packet_add((struct pool *)&seq->p, l4len, l4h);
>  	}
>  
>  	for (j = 0, seq = tap4_l4; j < seq_count; j++, seq++) {
> @@ -763,7 +763,7 @@ static int tap6_handler(struct ctx *c, const struct pool *in,
>  	i = 0;
>  resume:
>  	for (seq_count = 0, seq = NULL; i < in->count; i++) {
> -		size_t l4_len, plen, check;
> +		size_t l4len, plen, check;
>  		struct in6_addr *saddr, *daddr;
>  		const struct ethhdr *eh;
>  		const struct udphdr *uh;
> @@ -786,7 +786,7 @@ resume:
>  		if (plen != check)
>  			continue;
>  
> -		if (!(l4h = ipv6_l4hdr(in, i, sizeof(*eh), &proto, &l4_len)))
> +		if (!(l4h = ipv6_l4hdr(in, i, sizeof(*eh), &proto, &l4len)))
>  			continue;
>  
>  		if (IN6_IS_ADDR_LOOPBACK(saddr) || IN6_IS_ADDR_LOOPBACK(daddr)) {
> @@ -814,7 +814,7 @@ resume:
>  			if (c->no_icmp)
>  				continue;
>  
> -			if (l4_len < sizeof(struct icmp6hdr))
> +			if (l4len < sizeof(struct icmp6hdr))
>  				continue;
>  
>  			if (ndp(c, (struct icmp6hdr *)l4h, saddr))
> @@ -822,20 +822,20 @@ resume:
>  
>  			tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1);
>  
> -			packet_add(pkt, l4_len, l4h);
> +			packet_add(pkt, l4len, l4h);
>  			icmp_tap_handler(c, PIF_TAP, AF_INET6,
>  					 saddr, daddr, pkt, now);
>  			continue;
>  		}
>  
> -		if (l4_len < sizeof(*uh))
> +		if (l4len < sizeof(*uh))
>  			continue;
>  		uh = (struct udphdr *)l4h;
>  
>  		if (proto == IPPROTO_UDP) {
>  			PACKET_POOL_P(pkt, 1, in->buf, sizeof(pkt_buf));
>  
> -			packet_add(pkt, l4_len, l4h);
> +			packet_add(pkt, l4len, l4h);
>  
>  			if (dhcpv6(c, pkt, saddr, daddr))
>  				continue;
> @@ -886,7 +886,7 @@ resume:
>  #undef L4_SET
>  
>  append:
> -		packet_add((struct pool *)&seq->p, l4_len, l4h);
> +		packet_add((struct pool *)&seq->p, l4len, l4h);
>  	}
>  
>  	for (j = 0, seq = tap6_l4; j < seq_count; j++, seq++) {
> diff --git a/tcp.c b/tcp.c
> index 24f99cdf..23d408f2 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -891,13 +891,13 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
>   */
>  static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th)
>  {
> -	uint16_t tlen = ntohs(iph->tot_len) - sizeof(struct iphdr);
> +	uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
>  	struct in_addr saddr = { .s_addr = iph->saddr };
>  	struct in_addr daddr = { .s_addr = iph->daddr };
> -	uint32_t sum = proto_ipv4_header_psum(tlen, IPPROTO_TCP, saddr, daddr);
> +	uint32_t sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
>  
>  	th->check = 0;
> -	th->check = csum(th, tlen, sum);
> +	th->check = csum(th, l4len, sum);
>  }
>  
>  /**
> @@ -907,12 +907,12 @@ static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th)
>   */
>  static void tcp_update_check_tcp6(struct ipv6hdr *ip6h, struct tcphdr *th)
>  {
> -	uint16_t payload_len = ntohs(ip6h->payload_len);
> -	uint32_t sum = proto_ipv6_header_psum(payload_len, IPPROTO_TCP,
> +	uint16_t l4len = ntohs(ip6h->payload_len);
> +	uint32_t sum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
>  					      &ip6h->saddr, &ip6h->daddr);
>  
>  	th->check = 0;
> -	th->check = csum(th, payload_len, sum);
> +	th->check = csum(th, l4len, sum);
>  }
>  
>  /**
> @@ -1349,12 +1349,13 @@ static size_t tcp_fill_headers4(const struct ctx *c,
>  				size_t plen, const uint16_t *check,
>  				uint32_t seq)
>  {
> -	size_t ip_len = plen + sizeof(struct iphdr) + sizeof(struct tcphdr);
>  	const struct in_addr *a4 = inany_v4(&conn->faddr);
> +	size_t l4len = plen + sizeof(*th);
> +	size_t l3len = l4len + sizeof(*iph);
>  
>  	ASSERT(a4);
>  
> -	iph->tot_len = htons(ip_len);
> +	iph->tot_len = htons(l3len);
>  	iph->saddr = a4->s_addr;
>  	iph->daddr = c->ip4.addr_seen.s_addr;
>  
> @@ -1366,7 +1367,7 @@ static size_t tcp_fill_headers4(const struct ctx *c,
>  
>  	tcp_update_check_tcp4(iph, th);
>  
> -	return ip_len;
> +	return l3len;
>  }
>  
>  /**
> @@ -1386,9 +1387,10 @@ static size_t tcp_fill_headers6(const struct ctx *c,
>  				struct ipv6hdr *ip6h, struct tcphdr *th,
>  				size_t plen, uint32_t seq)
>  {
> -	size_t ip_len = plen + sizeof(struct ipv6hdr) + sizeof(struct tcphdr);
> +	size_t l4len = plen + sizeof(*th);
> +	size_t l3len = l4len + sizeof(*ip6h);
>  
> -	ip6h->payload_len = htons(plen + sizeof(struct tcphdr));
> +	ip6h->payload_len = htons(l4len);
>  	ip6h->saddr = conn->faddr.a6;
>  	if (IN6_IS_ADDR_LINKLOCAL(&ip6h->saddr))
>  		ip6h->daddr = c->ip6.addr_ll_seen;
> @@ -1407,7 +1409,7 @@ static size_t tcp_fill_headers6(const struct ctx *c,
>  
>  	tcp_update_check_tcp6(ip6h, th);
>  
> -	return ip_len;
> +	return l3len;
>  }
>  
>  /**
> @@ -1427,21 +1429,21 @@ static size_t tcp_l2_buf_fill_headers(const struct ctx *c,
>  				      const uint16_t *check, uint32_t seq)
>  {
>  	const struct in_addr *a4 = inany_v4(&conn->faddr);
> -	size_t ip_len, tlen;
> +	size_t l3len, l4len;
>  
>  	if (a4) {
> -		ip_len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
> +		l3len = tcp_fill_headers4(c, conn, iov[TCP_IOV_IP].iov_base,
>  					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
>  					   check, seq);
> -		tlen = ip_len - sizeof(struct iphdr);
> +		l4len = l3len - sizeof(struct iphdr);
>  	} else {
> -		ip_len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
> +		l3len = tcp_fill_headers6(c, conn, iov[TCP_IOV_IP].iov_base,
>  					   iov[TCP_IOV_PAYLOAD].iov_base, plen,
>  					   seq);
> -		tlen = ip_len - sizeof(struct ipv6hdr);
> +		l4len = l3len - sizeof(struct ipv6hdr);
>  	}
>  
> -	return tlen;
> +	return l4len;
>  }
>  
>  /**
> @@ -1578,7 +1580,7 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
>  	size_t optlen = 0;
>  	struct tcphdr *th;
>  	struct iovec *iov;
> -	size_t ip_len;
> +	size_t l4len;
>  	char *data;
>  
>  	if (SEQ_GE(conn->seq_ack_to_tap, conn->seq_from_tap) &&
> @@ -1658,11 +1660,11 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
>  	th->syn = !!(flags & SYN);
>  	th->fin = !!(flags & FIN);
>  
> -	ip_len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
> -					 conn->seq_to_tap);
> -	iov[TCP_IOV_PAYLOAD].iov_len = ip_len;
> +	l4len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
> +					conn->seq_to_tap);
> +	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
>  
> -	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + ip_len);
> +	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + l4len);

I was completely puzzled by this until I reached 7/7. Okay, this works
because vnet_len includes everything that's not in l4len.

There's one bit of confusion left after this series, though. This series
actually highlights that. I'm not sure if it's convenient to fix that
here as well:

		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen,
		check, seq);

This fills in Layer-2 buffers, and returns Layer-3 payload length (i.e.
payload plus size of Layer-4 headers). I didn't really think about it
when the variable was ip_len, but now it's apparent.

Further on:

                iov[TCP_IOV_PAYLOAD].iov_len = l4len;

...so l4len is the length of the payload plus Layer-4 header, but we
use that to set the buffer for TCP_IOV_PAYLOAD...

-- 
Stefano


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling
  2024-04-29  7:09 ` [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling David Gibson
@ 2024-04-30 18:47   ` Stefano Brivio
  2024-05-01  0:06     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Brivio @ 2024-04-30 18:47 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Laurent Vivier

On Mon, 29 Apr 2024 17:09:31 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Recent changes to the TCP code (reworking of the buffer handling) have
> meant that it now (again) deals explicitly with the MODE_PASST specific
> vnet_len field, instead of using the (partial) abstractions provided by the
> tap layer.
> 
> The abstractions we had don't work for the new TCP structure, so make some
> new ones that do: tap_hdr_iov() which constructs an iovec suitable for
> containing (just) the TAP specific header and tap_hdr_update() which
> updates it as necessary per-packet.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  tap.h | 27 +++++++++++++++++++++++++++
>  tcp.c | 40 +++++++++++++++-------------------------
>  2 files changed, 42 insertions(+), 25 deletions(-)
> 
> diff --git a/tap.h b/tap.h
> index dbc23b31..75aa3f03 100644
> --- a/tap.h
> +++ b/tap.h
> @@ -16,6 +16,33 @@ struct tap_hdr {
>  	uint32_t vnet_len;
>  } __attribute__((packed));
>  
> +/**
> + * tap_hdr_iov() - struct iovec for a tap header
> + * @c:		Execution context
> + * @taph:	Pointer to tap specific header buffer
> + *
> + * Returns: A struct iovec covering the correct portion of @taph to use as the
> + *          TAP specific header in the current configuration.

Nit: s/TAP/tap/ (not an acronym).

-- 
Stefano


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields
  2024-04-29  7:09 ` [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields David Gibson
@ 2024-04-30 18:47   ` Stefano Brivio
  2024-05-01  0:09     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Brivio @ 2024-04-30 18:47 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Laurent Vivier

On Mon, 29 Apr 2024 17:09:32 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> Laurent's recent changes mean we use IO vectors much more heavily in the
> TCP code.  In many of those cases, and few others around the code base,
> individual iovs of these vectors are constructed to exactly cover existing
> variables or fields.  We can make initializing such iovs shorter and
> clearer with a macro for the purpose.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  iov.h |  3 +++
>  tap.c |  3 +--
>  tcp.c | 24 +++++++++---------------
>  udp.c |  7 +++----
>  4 files changed, 16 insertions(+), 21 deletions(-)
> 
> diff --git a/iov.h b/iov.h
> index 6058af77..5668ca5f 100644
> --- a/iov.h
> +++ b/iov.h
> @@ -18,6 +18,9 @@
>  #include <unistd.h>
>  #include <string.h>
>  
> +#define IOV_OF_LVALUE(lval) \
> +	(struct iovec){ .iov_base = &(lval), .iov_len = sizeof(lval) }

I'm not sure if IOV_OF_LVALUE() is much clearer than IOV_FROM() or
IOV_OF(). No strong preference though.

> +
>  size_t iov_skip_bytes(const struct iovec *iov, size_t n,
>  		      size_t skip, size_t *offset);
>  size_t iov_from_buf(const struct iovec *iov, size_t iov_cnt,
> diff --git a/tap.c b/tap.c
> index 6a08cca6..5db7b169 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -79,8 +79,7 @@ void tap_send_single(const struct ctx *c, const void *data, size_t len)
>  	size_t iovcnt = 0;
>  
>  	if (c->mode == MODE_PASST) {
> -		iov[iovcnt].iov_base = &vnet_len;
> -		iov[iovcnt].iov_len = sizeof(vnet_len);
> +		iov[iovcnt] = IOV_OF_LVALUE(vnet_len);
>  		iovcnt++;
>  	}
>  
> diff --git a/tcp.c b/tcp.c
> index ad409fc5..27c06958 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -290,6 +290,7 @@
>  
>  #include "checksum.h"
>  #include "util.h"
> +#include "iov.h"
>  #include "ip.h"
>  #include "passt.h"
>  #include "tap.h"
> @@ -954,10 +955,8 @@ static void tcp_sock4_iov_init(const struct ctx *c)
>  		iov = tcp4_l2_iov[i];
>  
>  		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_payload_tap_hdr[i]);
> -		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
> -		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
> -		iov[TCP_IOV_IP].iov_base = &tcp4_payload_ip[i];
> -		iov[TCP_IOV_IP].iov_len = sizeof(tcp4_payload_ip[i]);
> +		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp4_eth_src);
> +		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp4_payload_ip[i]);
>  		iov[TCP_IOV_PAYLOAD].iov_base = &tcp4_payload[i];
>  	}
>  
> @@ -966,9 +965,8 @@ static void tcp_sock4_iov_init(const struct ctx *c)
>  
>  		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp4_flags_tap_hdr[i]);
>  		iov[TCP_IOV_ETH].iov_base = &tcp4_eth_src;
> -		iov[TCP_IOV_ETH].iov_len = sizeof(tcp4_eth_src);
> -		iov[TCP_IOV_IP].iov_base = &tcp4_flags_ip[i];
> -		iov[TCP_IOV_IP].iov_len = sizeof(tcp4_flags_ip[i]);
> +		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp4_eth_src);
> +		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp4_flags_ip[i]);
>  		iov[TCP_IOV_PAYLOAD].iov_base = &tcp4_flags[i];
>  	}
>  }
> @@ -1001,10 +999,8 @@ static void tcp_sock6_iov_init(const struct ctx *c)
>  		iov = tcp6_l2_iov[i];
>  
>  		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_payload_tap_hdr[i]);
> -		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
> -		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
> -		iov[TCP_IOV_IP].iov_base = &tcp6_payload_ip[i];
> -		iov[TCP_IOV_IP].iov_len = sizeof(tcp6_payload_ip[i]);
> +		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp6_eth_src);
> +		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_payload_ip[i]);
>  		iov[TCP_IOV_PAYLOAD].iov_base = &tcp6_payload[i];
>  	}
>  
> @@ -1012,10 +1008,8 @@ static void tcp_sock6_iov_init(const struct ctx *c)
>  		iov = tcp6_l2_flags_iov[i];
>  
>  		iov[TCP_IOV_TAP] = tap_hdr_iov(c, &tcp6_flags_tap_hdr[i]);
> -		iov[TCP_IOV_ETH].iov_base = &tcp6_eth_src;
> -		iov[TCP_IOV_ETH].iov_len = sizeof(tcp6_eth_src);
> -		iov[TCP_IOV_IP].iov_base = &tcp6_flags_ip[i];
> -		iov[TCP_IOV_IP].iov_len = sizeof(tcp6_flags_ip[i]);
> +		iov[TCP_IOV_ETH] = IOV_OF_LVALUE(tcp6_eth_src);
> +		iov[TCP_IOV_IP] = IOV_OF_LVALUE(tcp6_flags_ip[i]);
>  		iov[TCP_IOV_PAYLOAD].iov_base = &tcp6_flags[i];
>  	}
>  }
> diff --git a/udp.c b/udp.c
> index 1a4c02f4..545212c5 100644
> --- a/udp.c
> +++ b/udp.c
> @@ -113,6 +113,7 @@
>  
>  #include "checksum.h"
>  #include "util.h"
> +#include "iov.h"
>  #include "ip.h"
>  #include "siphash.h"
>  #include "inany.h"
> @@ -315,8 +316,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
>  		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
>  	};
>  
> -	siov->iov_base	= buf->data;
> -	siov->iov_len	= sizeof(buf->data);
> +	*siov		= IOV_OF_LVALUE(buf->data);
>  
>  	mh->msg_name	= &buf->s_in;
>  	mh->msg_namelen	= sizeof(buf->s_in);
> @@ -343,8 +343,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
>  		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
>  	};
>  
> -	siov->iov_base	= buf->data;
> -	siov->iov_len	= sizeof(buf->data);
> +	*siov		 = IOV_OF_LVALUE(buf->data);

Extra whitespace between tabs and =.

-- 
Stefano


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]()
  2024-04-29  7:09 ` [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]() David Gibson
@ 2024-04-30 18:48   ` Stefano Brivio
  2024-05-01  0:10     ` David Gibson
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Brivio @ 2024-04-30 18:48 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev, Laurent Vivier

On Mon, 29 Apr 2024 17:09:33 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> tcp_fill_headers[46]() fill most of the headers, but the tap specific
> header (the frame length for qemu sockets) is filled in afterwards.
> Filling this as well:
>   * Removes a little redundancy between the tcp_send_flag() and
>     tcp_data_to_tap() path
>   * Makes calculation of the correct length a little easier
>   * Removes the now misleadingly named 'vnet_len' variable in
>     tcp_send_flag()
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  tcp.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/tcp.c b/tcp.c
> index 27c06958..01987c04 100644
> --- a/tcp.c
> +++ b/tcp.c
> @@ -1321,6 +1321,7 @@ static void tcp_fill_header(struct tcphdr *th,
>   * tcp_fill_headers4() - Fill 802.3, IPv4, TCP headers in pre-cooked buffers
>   * @c:		Execution context
>   * @conn:	Connection pointer
> + * @taph:	TAP backend specific header

Here and below: s/TAP/tap/. Other than the minor comments I shared the
series looks good to me (and looks like a big improvement in terms of
possible sources of confusion).

-- 
Stefano


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers
  2024-04-30 18:46   ` Stefano Brivio
@ 2024-04-30 23:53     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-04-30 23:53 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 7540 bytes --]

On Tue, Apr 30, 2024 at 08:46:29PM +0200, Stefano Brivio wrote:
> On Mon, 29 Apr 2024 17:09:28 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > In some places (well, actually only UDP now) we use struct tap_hdr to
> > represent both tap backend specific and L2 ethernet headers.  Handling
> > these together seemed like a good idea at the time, but Laurent's changes
> > in the TCP code working towards vhost-user support suggest that treating
> > them separately is more useful, more often.
> > 
> > Alter struct tap_hdr to represent only the TAP backend specific headers.
> > Updated related helpers and the UDP code to match.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  tap.h | 21 +++++++++------------
> >  udp.c | 23 ++++++++++++++---------
> >  2 files changed, 23 insertions(+), 21 deletions(-)
> > 
> > diff --git a/tap.h b/tap.h
> > index 2adc4e2b..dbc23b31 100644
> > --- a/tap.h
> > +++ b/tap.h
> > @@ -6,30 +6,28 @@
> >  #ifndef TAP_H
> >  #define TAP_H
> >  
> > +#define ETH_HDR_INIT(proto) { .h_proto = htons_constant(proto) }
> > +
> >  /**
> > - * struct tap_hdr - L2 and tap specific headers
> > + * struct tap_hdr - tap backend specific headers
> >   * @vnet_len:	Frame length (for qemu socket transport)
> > - * @eh:		Ethernet header
> >   */
> >  struct tap_hdr {
> >  	uint32_t vnet_len;
> > -	struct ethhdr eh;
> >  } __attribute__((packed));
> 
> No need to have it packed anymore.

I'd arguably that conceptually it should still be packed.  This is
still expected to be a contiguous block of bytes, laid out in a
specific way.  Of course, since it's just a single u32, ((packed))
doesn't have any effect, but that could change if we need any
different tap-specific headers in future.

> > -#define TAP_HDR_INIT(proto) { .eh.h_proto = htons_constant(proto) }
> > -
> >  static inline size_t tap_hdr_len_(const struct ctx *c)
> >  {
> >  	if (c->mode == MODE_PASST)
> >  		return sizeof(struct tap_hdr);
> >  	else
> > -		return sizeof(struct ethhdr);
> > +		return 0;
> >  }
> >  
> >  /**
> >   * tap_frame_base() - Find start of tap frame
> >   * @c:		Execution context
> > - * @taph:	Pointer to L2 and tap specific header buffer
> > + * @taph:	Pointer to tap specific header buffer
> >   *
> >   * Returns: pointer to the start of tap frame - suitable for an
> >   *          iov_base to be passed to tap_send_frames())
> > @@ -43,17 +41,16 @@ static inline void *tap_frame_base(const struct ctx *c, struct tap_hdr *taph)
> >   * tap_frame_len() - Finalize tap frame and return total length
> >   * @c:		Execution context
> >   * @taph:	Tap header to finalize
> > - * @plen:	L3 packet length (excludes L2 and tap specific headers)
> > + * @plen:	L2 packet length (includes L2, excludes tap specific headers)
> >   *
> > - * Returns: length of the tap frame including L2 and tap specific
> > - *          headers - suitable for an iov_len to be passed to
> > - *          tap_send_frames()
> > + * Returns: length of the tap frame including tap specific headers - suitable
> > + *          for an iov_len to be passed to tap_send_frames()
> >   */
> >  static inline size_t tap_frame_len(const struct ctx *c, struct tap_hdr *taph,
> >  				   size_t plen)
> >  {
> >  	if (c->mode == MODE_PASST)
> > -		taph->vnet_len = htonl(plen + sizeof(taph->eh));
> > +		taph->vnet_len = htonl(plen);
> >  	return plen + tap_hdr_len_(c);
> >  }
> >  
> > diff --git a/udp.c b/udp.c
> > index 594ea191..c3e4f6b6 100644
> > --- a/udp.c
> > +++ b/udp.c
> > @@ -173,7 +173,8 @@ static uint8_t udp_act[IP_VERSIONS][UDP_ACT_TYPE_MAX][DIV_ROUND_UP(NUM_PORTS, 8)
> >  /**
> >   * udp4_l2_buf_t - Pre-cooked IPv4 packet buffers for tap connections
> >   * @s_in:	Source socket address, filled in by recvmmsg()
> > - * @taph:	Tap-level headers (partially pre-filled)
> > + * @taph:	Tap backend specific header
> > + * @eh:		Prefilled ethernet header
> >   * @iph:	Pre-filled IP header (except for tot_len and saddr)
> >   * @uh:		Headroom for UDP header
> >   * @data:	Storage for UDP payload
> > @@ -182,6 +183,7 @@ static struct udp4_l2_buf_t {
> >  	struct sockaddr_in s_in;
> >  
> >  	struct tap_hdr taph;
> > +	struct ethhdr eh;
> >  	struct iphdr iph;
> >  	struct udphdr uh;
> >  	uint8_t data[USHRT_MAX -
> > @@ -192,7 +194,8 @@ udp4_l2_buf[UDP_MAX_FRAMES];
> >  /**
> >   * udp6_l2_buf_t - Pre-cooked IPv6 packet buffers for tap connections
> >   * @s_in6:	Source socket address, filled in by recvmmsg()
> > - * @taph:	Tap-level headers (partially pre-filled)
> > + * @taph:	Tap backend specific header
> > + * @eh:		Pre-filled ethernet header
> >   * @ip6h:	Pre-filled IP header (except for payload_len and addresses)
> >   * @uh:		Headroom for UDP header
> >   * @data:	Storage for UDP payload
> > @@ -202,10 +205,11 @@ struct udp6_l2_buf_t {
> >  #ifdef __AVX2__
> >  	/* Align ip6h to 32-byte boundary. */
> >  	uint8_t pad[64 - (sizeof(struct sockaddr_in6) + sizeof(struct ethhdr) +
> > -			  sizeof(uint32_t))];
> > +			  sizeof(struct tap_hdr))];
> >  #endif
> >  
> >  	struct tap_hdr taph;
> > +	struct ethhdr eh;
> >  	struct ipv6hdr ip6h;
> >  	struct udphdr uh;
> >  	uint8_t data[USHRT_MAX -
> > @@ -289,8 +293,8 @@ void udp_update_l2_buf(const unsigned char *eth_d, const unsigned char *eth_s)
> >  		struct udp4_l2_buf_t *b4 = &udp4_l2_buf[i];
> >  		struct udp6_l2_buf_t *b6 = &udp6_l2_buf[i];
> >  
> > -		eth_update_mac(&b4->taph.eh, eth_d, eth_s);
> > -		eth_update_mac(&b6->taph.eh, eth_d, eth_s);
> > +		eth_update_mac(&b4->eh, eth_d, eth_s);
> > +		eth_update_mac(&b6->eh, eth_d, eth_s);
> >  	}
> >  }
> >  
> > @@ -307,7 +311,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
> >  	struct iovec *tiov = &udp4_l2_iov_tap[i];
> >  
> >  	*buf = (struct udp4_l2_buf_t) {
> > -		.taph = TAP_HDR_INIT(ETH_P_IP),
> > +		.eh  = ETH_HDR_INIT(ETH_P_IP),
> >  		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
> >  	};
> >  
> > @@ -335,7 +339,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
> >  	struct iovec *tiov = &udp6_l2_iov_tap[i];
> >  
> >  	*buf = (struct udp6_l2_buf_t) {
> > -		.taph = TAP_HDR_INIT(ETH_P_IPV6),
> > +		.eh   = ETH_HDR_INIT(ETH_P_IPV6),
> >  		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
> >  	};
> >  
> > @@ -608,7 +612,7 @@ static size_t udp_update_hdr4(const struct ctx *c, struct udp4_l2_buf_t *b,
> >  	b->uh.dest = htons(dstport);
> >  	b->uh.len = htons(datalen + sizeof(b->uh));
> >  
> > -	return tap_frame_len(c, &b->taph, ip_len);
> > +	return tap_frame_len(c, &b->taph, ip_len + sizeof(b->eh));
> >  }
> >  
> >  /**
> > @@ -676,7 +680,8 @@ static size_t udp_update_hdr6(const struct ctx *c, struct udp6_l2_buf_t *b,
> >  	b->uh.len = b->ip6h.payload_len;
> >  	csum_udp6(&b->uh, src, dst, b->data, datalen);
> >  
> > -	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h));
> > +	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h)
> > +			     + sizeof(b->eh));
> 
> Nit: the + operator should be on the first line for consistency, and
> for readability I'd rather do:
> 
> 	return tap_frame_len(c, &b->taph, payload_len + sizeof(b->ip6h) +
> 							sizeof(b->eh));

Tweaked, thanks.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/7] treewide: Standardise variable names for various packet lengths
  2024-04-30 18:46   ` Stefano Brivio
@ 2024-05-01  0:05     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-05-01  0:05 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 5829 bytes --]

On Tue, Apr 30, 2024 at 08:46:54PM +0200, Stefano Brivio wrote:
> On Mon, 29 Apr 2024 17:09:29 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > At various points we need to track the lengths of a packet including or
> > excluding various different sets of headers.  We don't always use the same
> > variable names for doing so.  Worse in some places we use the same name
> > for different things: e.g. tcp_fill_headers[46]() use ip_len for the
> > length including the IP headers, but then tcp_send_flag() which calls it
> > uses it to mean the IP payload length only.
> > 
> > To improve clarity, standardise on these names:
> >    plen:		L4 protocol payload length
> >    l4len:		plen + length of L4 protocol header
> >    l3len:		l4len + length of IPv4/IPv6 header
> 
> If we're dealing with IPv4, I guess tot_len is still a reasonable
> synonym for this, as you left for csum_ip4_header().

I'd prefer to stick with l3len, so we're consistent across both IPv4
and IPv6.  csum_ip4_header() is a special case, becase..

> 
> >    l2len:		l3len + length of L2 (ethernet) header
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  checksum.c | 42 +++++++++++++++++-----------------
> >  checksum.h | 10 ++++-----
> >  icmp.c     |  8 +++----
> >  passt.h    |  4 ++--
> >  tap.c      | 48 +++++++++++++++++++--------------------
> >  tcp.c      | 66 +++++++++++++++++++++++++++---------------------------
> >  udp.c      | 24 ++++++++++----------
> >  7 files changed, 101 insertions(+), 101 deletions(-)
> > 
> > diff --git a/checksum.c b/checksum.c
> > index 9cbe0b29..3602215a 100644
> > --- a/checksum.c
> > +++ b/checksum.c
> > @@ -116,7 +116,7 @@ uint16_t csum_fold(uint32_t sum)
> >  
> >  /**
> >   * csum_ip4_header() - Calculate IPv4 header checksum
> > - * @tot_len:	IPv4 payload length (data + IP header, network order)
> > + * @tot_len:	IPv4 packet length (data + IP header, network order)

..the value here is network order.  I'm sticking with "tot_len" to
emphasise that this is exactly the same bytes as the header field,
rather than a logical integer value.  I don't love that - I never like
passing integers in non-native endianness - but I prefer it to using
the standard name for a value that doesn't have standard encoding.

I'm considering a separate cleanup for the endianness here, but that's
out of scope for this patch.

> >   * @protocol:	Protocol number (network order)
> >   * @saddr:	IPv4 source address (network order)
> >   * @daddr:	IPv4 destination address (network order)
> > @@ -140,13 +140,13 @@ uint16_t csum_ip4_header(uint16_t tot_len, uint8_t protocol,
> >  /**
> >   * proto_ipv4_header_psum() - Calculates the partial checksum of an
> >   * 			      IPv4 header for UDP or TCP
> > - * @tot_len:	IPv4 Payload length (host order)
> > + * @l4len:	IPv4 Payload length (host order)
> 
> Ouch, how did this end up being called tot_len...

Yeah :/

[snip]
> > diff --git a/passt.h b/passt.h
> > index 76026f09..d1add470 100644
> > --- a/passt.h
> > +++ b/passt.h
> > @@ -22,11 +22,11 @@ struct tap_msg {
> >  /**
> >   * struct tap_l4_msg - Layer-4 message descriptor for protocol handlers
> >   * @pkt_buf_offset:	Offset of message from @pkt_buf
> > - * @l4_len:		Length of Layer-4 payload, host order
> > + * @l4len:		Length of Layer-4 payload, host order
> 
> This is (and was) ambiguous, but in any case, we don't use the struct
> anymore since commit bb708111833e ("treewide: Packet abstraction with
> mandatory boundary checks"), so I'm fine either way, we don't necessarily
> need to drop it here, I think.

Huh, hadn't noticed that.  I'll insert a patch removing the unused
structure before this one.

[snip]
> > @@ -1658,11 +1660,11 @@ static int tcp_send_flag(struct ctx *c, struct tcp_tap_conn *conn, int flags)
> >  	th->syn = !!(flags & SYN);
> >  	th->fin = !!(flags & FIN);
> >  
> > -	ip_len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
> > -					 conn->seq_to_tap);
> > -	iov[TCP_IOV_PAYLOAD].iov_len = ip_len;
> > +	l4len = tcp_l2_buf_fill_headers(c, conn, iov, optlen, NULL,
> > +					conn->seq_to_tap);
> > +	iov[TCP_IOV_PAYLOAD].iov_len = l4len;
> >  
> > -	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + ip_len);
> > +	*(uint32_t *)iov[TCP_IOV_VLEN].iov_base = htonl(vnet_len + l4len);
> 
> I was completely puzzled by this until I reached 7/7. Okay, this works
> because vnet_len includes everything that's not in l4len.

Yeah, 'vnet_len' is a bad name here , but fixing that is not within
this patch's scope.

> There's one bit of confusion left after this series, though. This series
> actually highlights that. I'm not sure if it's convenient to fix that
> here as well:
> 
> 		l4len = tcp_l2_buf_fill_headers(c, conn, iov, plen,
> 		check, seq);
> 
> This fills in Layer-2 buffers, and returns Layer-3 payload length (i.e.
> payload plus size of Layer-4 headers). I didn't really think about it
> when the variable was ip_len, but now it's apparent.

I'm not entirely clear what fix you're envisaging, and I wonder if
it's covered in my subsequent patches.

> Further on:
> 
>                 iov[TCP_IOV_PAYLOAD].iov_len = l4len;
> 
> ...so l4len is the length of the payload plus Layer-4 header, but we
> use that to set the buffer for TCP_IOV_PAYLOAD...

Yeah.. there's some ambiguity in the meaning of "payload".  In
TCP_IOV_PAYLOAD it means the IP payload, but in 'plen' it means the L4
payload.  I'm open to improved names for either "plen" or
"TCP_IOV_PAYLOAD".

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling
  2024-04-30 18:47   ` Stefano Brivio
@ 2024-05-01  0:06     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-05-01  0:06 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 1709 bytes --]

On Tue, Apr 30, 2024 at 08:47:18PM +0200, Stefano Brivio wrote:
> On Mon, 29 Apr 2024 17:09:31 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Recent changes to the TCP code (reworking of the buffer handling) have
> > meant that it now (again) deals explicitly with the MODE_PASST specific
> > vnet_len field, instead of using the (partial) abstractions provided by the
> > tap layer.
> > 
> > The abstractions we had don't work for the new TCP structure, so make some
> > new ones that do: tap_hdr_iov() which constructs an iovec suitable for
> > containing (just) the TAP specific header and tap_hdr_update() which
> > updates it as necessary per-packet.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  tap.h | 27 +++++++++++++++++++++++++++
> >  tcp.c | 40 +++++++++++++++-------------------------
> >  2 files changed, 42 insertions(+), 25 deletions(-)
> > 
> > diff --git a/tap.h b/tap.h
> > index dbc23b31..75aa3f03 100644
> > --- a/tap.h
> > +++ b/tap.h
> > @@ -16,6 +16,33 @@ struct tap_hdr {
> >  	uint32_t vnet_len;
> >  } __attribute__((packed));
> >  
> > +/**
> > + * tap_hdr_iov() - struct iovec for a tap header
> > + * @c:		Execution context
> > + * @taph:	Pointer to tap specific header buffer
> > + *
> > + * Returns: A struct iovec covering the correct portion of @taph to use as the
> > + *          TAP specific header in the current configuration.
> 
> Nit: s/TAP/tap/ (not an acronym).

Fixed, thanks.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields
  2024-04-30 18:47   ` Stefano Brivio
@ 2024-05-01  0:09     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-05-01  0:09 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 2253 bytes --]

On Tue, Apr 30, 2024 at 08:47:46PM +0200, Stefano Brivio wrote:
> On Mon, 29 Apr 2024 17:09:32 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > Laurent's recent changes mean we use IO vectors much more heavily in the
> > TCP code.  In many of those cases, and few others around the code base,
> > individual iovs of these vectors are constructed to exactly cover existing
> > variables or fields.  We can make initializing such iovs shorter and
> > clearer with a macro for the purpose.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  iov.h |  3 +++
> >  tap.c |  3 +--
> >  tcp.c | 24 +++++++++---------------
> >  udp.c |  7 +++----
> >  4 files changed, 16 insertions(+), 21 deletions(-)
> > 
> > diff --git a/iov.h b/iov.h
> > index 6058af77..5668ca5f 100644
> > --- a/iov.h
> > +++ b/iov.h
> > @@ -18,6 +18,9 @@
> >  #include <unistd.h>
> >  #include <string.h>
> >  
> > +#define IOV_OF_LVALUE(lval) \
> > +	(struct iovec){ .iov_base = &(lval), .iov_len = sizeof(lval) }
> 
> I'm not sure if IOV_OF_LVALUE() is much clearer than IOV_FROM() or
> IOV_OF(). No strong preference though.

Yeah, I don't love the name but it was the best I came up with.  I
wanted to emphasise the fact that it must be given an lvalue - this
has to be a macro, not a function (even inline).

[snip]
> > @@ -315,8 +316,7 @@ static void udp_sock4_iov_init_one(const struct ctx *c, size_t i)
> >  		.iph = L2_BUF_IP4_INIT(IPPROTO_UDP)
> >  	};
> >  
> > -	siov->iov_base	= buf->data;
> > -	siov->iov_len	= sizeof(buf->data);
> > +	*siov		= IOV_OF_LVALUE(buf->data);
> >  
> >  	mh->msg_name	= &buf->s_in;
> >  	mh->msg_namelen	= sizeof(buf->s_in);
> > @@ -343,8 +343,7 @@ static void udp_sock6_iov_init_one(const struct ctx *c, size_t i)
> >  		.ip6h = L2_BUF_IP6_INIT(IPPROTO_UDP)
> >  	};
> >  
> > -	siov->iov_base	= buf->data;
> > -	siov->iov_len	= sizeof(buf->data);
> > +	*siov		 = IOV_OF_LVALUE(buf->data);
> 
> Extra whitespace between tabs and =.

Oops, fixed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]()
  2024-04-30 18:48   ` Stefano Brivio
@ 2024-05-01  0:10     ` David Gibson
  0 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2024-05-01  0:10 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 1555 bytes --]

On Tue, Apr 30, 2024 at 08:48:40PM +0200, Stefano Brivio wrote:
> On Mon, 29 Apr 2024 17:09:33 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > tcp_fill_headers[46]() fill most of the headers, but the tap specific
> > header (the frame length for qemu sockets) is filled in afterwards.
> > Filling this as well:
> >   * Removes a little redundancy between the tcp_send_flag() and
> >     tcp_data_to_tap() path
> >   * Makes calculation of the correct length a little easier
> >   * Removes the now misleadingly named 'vnet_len' variable in
> >     tcp_send_flag()
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  tcp.c | 28 ++++++++++++++--------------
> >  1 file changed, 14 insertions(+), 14 deletions(-)
> > 
> > diff --git a/tcp.c b/tcp.c
> > index 27c06958..01987c04 100644
> > --- a/tcp.c
> > +++ b/tcp.c
> > @@ -1321,6 +1321,7 @@ static void tcp_fill_header(struct tcphdr *th,
> >   * tcp_fill_headers4() - Fill 802.3, IPv4, TCP headers in pre-cooked buffers
> >   * @c:		Execution context
> >   * @conn:	Connection pointer
> > + * @taph:	TAP backend specific header
> 
> Here and below: s/TAP/tap/. Other than the minor comments I shared the

Adjusted, thanks.

> series looks good to me (and looks like a big improvement in terms of
> possible sources of confusion).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-05-01  0:10 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-29  7:09 [PATCH 0/7] Small improvements to IOV handling David Gibson
2024-04-29  7:09 ` [PATCH 1/7] checksum: Use proto_ipv6_header_psum() for ICMPv6 as well David Gibson
2024-04-29  7:09 ` [PATCH 2/7] tap: Split tap specific and L2 (ethernet) headers David Gibson
2024-04-30 18:46   ` Stefano Brivio
2024-04-30 23:53     ` David Gibson
2024-04-29  7:09 ` [PATCH 3/7] treewide: Standardise variable names for various packet lengths David Gibson
2024-04-30 18:46   ` Stefano Brivio
2024-05-01  0:05     ` David Gibson
2024-04-29  7:09 ` [PATCH 4/7] tcp: Simplify packet length calculation when preparing headers David Gibson
2024-04-29  7:09 ` [PATCH 5/7] tap, tcp: (Re-)abstract TAP specific header handling David Gibson
2024-04-30 18:47   ` Stefano Brivio
2024-05-01  0:06     ` David Gibson
2024-04-29  7:09 ` [PATCH 6/7] iov: Helper macro to construct iovs covering existing variables or fields David Gibson
2024-04-30 18:47   ` Stefano Brivio
2024-05-01  0:09     ` David Gibson
2024-04-29  7:09 ` [PATCH 7/7] tcp: Update tap specific header too in tcp_fill_headers[46]() David Gibson
2024-04-30 18:48   ` Stefano Brivio
2024-05-01  0:10     ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).