* [PATCH 0/7] Rework some IOV handling in TCP code
@ 2024-10-28 9:40 David Gibson
2024-10-28 9:40 ` [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]() David Gibson
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
These reworks are largely aimed at making the vhost-user integration
easier, and with luck allowing more logic to be shared between it and
the existing "buffer" paths. Of course, in the short term, these will
probably conflict with the patches... I hope it ends up as a net
positive; Laurent, let me know.
I think a number of similar changes should be possible for UDP, but I
haven't tackled that yet.
David Gibson (7):
tcp: Pass TCP header and payload separately to
tcp_update_check_tcp[46]()
tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c
tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame()
tcp: Don't use return value from tcp_fill_headers[46] to adjust
iov_len
tcp: Pass TCP header and payload separately to tcp_fill_headers[46]()
tcp: Merge tcp_update_check_tcp[46]()
tcp: Fold tcp_update_csum() into tcp_fill_header()
tcp.c | 232 +++++++++++--------------------------------------
tcp_buf.c | 48 +++++++---
tcp_internal.h | 15 +++-
3 files changed, 100 insertions(+), 195 deletions(-)
--
2.47.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 18:42 ` Stefano Brivio
2024-10-28 9:40 ` [PATCH 2/7] tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c David Gibson
` (5 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
Currently these expect both the TCP header and payload in a single IOV,
and go to some trouble to locate the checksum field within it. In the
current caller we already know where the TCP header is, so we might as
well just pass it in. This will need to work a bit differently for
vhost-user, but that code already needs to locate the TCP header for other
reasons, so again we can just pass it in.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
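For illustration only (not part of the patch): a minimal, standalone sketch
of the idea above, i.e. checksumming a TCP header that is passed as its own
pointer and then continuing over a possibly scattered payload IOV. All names
here (sk_csum_state, sk_csum_add(), sk_csum_fold(), sk_tcp_csum()) are
hypothetical stand-ins, not passt functions, and the byte-at-a-time
accumulation is chosen for clarity rather than speed. The pseudo-header sum
is assumed to be computed elsewhere and passed in as @psum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Running ones' complement sum plus byte position, so that odd-sized
 * buffers can be chained without losing 16-bit word alignment.
 */
struct sk_csum_state {
	uint64_t sum;
	size_t pos;
};

/* Add @len bytes at @buf to the running sum as big-endian 16-bit words */
static void sk_csum_add(struct sk_csum_state *st, const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < len; i++, st->pos++)
		st->sum += (st->pos & 1) ? p[i] : (uint64_t)p[i] << 8;
}

/* Fold the carries back in and return the complemented 16-bit checksum */
static uint16_t sk_csum_fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* Checksum a TCP header given by pointer, then the payload in @iov starting
 * @doffset bytes in. The check field (bytes 16-17 of the header) is zeroed
 * first and updated in place, in network byte order.
 */
static uint16_t sk_tcp_csum(uint32_t psum, uint8_t *th, size_t thlen,
			    const struct iovec *iov, int iov_cnt,
			    size_t doffset)
{
	struct sk_csum_state st = { .sum = psum, .pos = 0 };
	uint16_t csum;
	int i;

	assert(thlen >= 20);	/* minimal TCP header */

	th[16] = th[17] = 0;
	sk_csum_add(&st, th, thlen);

	for (i = 0; i < iov_cnt; i++) {
		const uint8_t *base = iov[i].iov_base;
		size_t len = iov[i].iov_len;

		if (doffset >= len) {	/* entry entirely before the payload */
			doffset -= len;
			continue;
		}

		sk_csum_add(&st, base + doffset, len - doffset);
		doffset = 0;
	}

	csum = sk_csum_fold(st.sum);
	th[16] = csum >> 8;
	th[17] = csum & 0xff;

	return csum;
}
```

Because the header pointer is explicit, nothing here needs to search the IOV
for the check field or worry about whether it is contiguous or aligned,
which is what most of the removed code was doing.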
tcp.c | 104 ++++++++++++++--------------------------------------------
1 file changed, 24 insertions(+), 80 deletions(-)
diff --git a/tcp.c b/tcp.c
index 0569dc6..f2898ff 100644
--- a/tcp.c
+++ b/tcp.c
@@ -753,104 +753,48 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
/**
* tcp_update_check_tcp4() - Calculate TCP checksum for IPv4
* @iph: IPv4 header
- * @iov: Pointer to the array of IO vectors
- * @iov_cnt: Length of the array
- * @l4offset: IPv4 payload offset in the iovec array
+ * @th: TCP header (updated)
+ * @iov: IO vector containing the TCP payload
+ * @iov_cnt: Length of @iov
+ * @doffset: TCP payload offset in @iov
*/
-static void tcp_update_check_tcp4(const struct iphdr *iph,
+static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th,
const struct iovec *iov, int iov_cnt,
- size_t l4offset)
+ size_t doffset)
{
uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
struct in_addr saddr = { .s_addr = iph->saddr };
struct in_addr daddr = { .s_addr = iph->daddr };
- size_t check_ofs;
- __sum16 *check;
- int check_idx;
uint32_t sum;
- char *ptr;
sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
- check_idx = iov_skip_bytes(iov, iov_cnt,
- l4offset + offsetof(struct tcphdr, check),
- &check_ofs);
-
- if (check_idx >= iov_cnt) {
- err("TCP4 buffer is too small, iov size %zd, check offset %zd",
- iov_size(iov, iov_cnt),
- l4offset + offsetof(struct tcphdr, check));
- return;
- }
-
- if (check_ofs + sizeof(*check) > iov[check_idx].iov_len) {
- err("TCP4 checksum field memory is not contiguous "
- "check_ofs %zd check_idx %d iov_len %zd",
- check_ofs, check_idx, iov[check_idx].iov_len);
- return;
- }
-
- ptr = (char *)iov[check_idx].iov_base + check_ofs;
- if ((uintptr_t)ptr & (__alignof__(*check) - 1)) {
- err("TCP4 checksum field is not correctly aligned in memory");
- return;
- }
-
- check = (__sum16 *)ptr;
-
- *check = 0;
- *check = csum_iov(iov, iov_cnt, l4offset, sum);
+ th->check = 0;
+ sum = csum_unfolded(th, sizeof(*th), sum);
+ th->check = csum_iov(iov, iov_cnt, doffset, sum);
}
/**
* tcp_update_check_tcp6() - Calculate TCP checksum for IPv6
* @ip6h: IPv6 header
- * @iov: Pointer to the array of IO vectors
- * @iov_cnt: Length of the array
- * @l4offset: IPv6 payload offset in the iovec array
+ * @th: TCP header (updated)
+ * @iov: IO vector containing the TCP payload
+ * @iov_cnt: Length of @iov
+ * @doffset: TCP payload offset in @iov
*/
-static void tcp_update_check_tcp6(const struct ipv6hdr *ip6h,
+static void tcp_update_check_tcp6(const struct ipv6hdr *ip6h, struct tcphdr *th,
const struct iovec *iov, int iov_cnt,
- size_t l4offset)
+ size_t doffset)
{
uint16_t l4len = ntohs(ip6h->payload_len);
- size_t check_ofs;
- __sum16 *check;
- int check_idx;
uint32_t sum;
- char *ptr;
sum = proto_ipv6_header_psum(l4len, IPPROTO_TCP, &ip6h->saddr,
&ip6h->daddr);
- check_idx = iov_skip_bytes(iov, iov_cnt,
- l4offset + offsetof(struct tcphdr, check),
- &check_ofs);
-
- if (check_idx >= iov_cnt) {
- err("TCP6 buffer is too small, iov size %zd, check offset %zd",
- iov_size(iov, iov_cnt),
- l4offset + offsetof(struct tcphdr, check));
- return;
- }
-
- if (check_ofs + sizeof(*check) > iov[check_idx].iov_len) {
- err("TCP6 checksum field memory is not contiguous "
- "check_ofs %zd check_idx %d iov_len %zd",
- check_ofs, check_idx, iov[check_idx].iov_len);
- return;
- }
-
- ptr = (char *)iov[check_idx].iov_base + check_ofs;
- if ((uintptr_t)ptr & (__alignof__(*check) - 1)) {
- err("TCP6 checksum field is not correctly aligned in memory");
- return;
- }
-
- check = (__sum16 *)ptr;
-
- *check = 0;
- *check = csum_iov(iov, iov_cnt, l4offset, sum);
+ th->check = 0;
+ sum = csum_unfolded(th, sizeof(*th), sum);
+ th->check = csum_iov(iov, iov_cnt, doffset, sum);
}
/**
@@ -1005,11 +949,11 @@ static size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
bp->th.check = 0;
} else {
const struct iovec iov = {
- .iov_base = bp,
- .iov_len = ntohs(iph->tot_len) - sizeof(struct iphdr),
+ .iov_base = bp->data,
+ .iov_len = dlen,
};
- tcp_update_check_tcp4(iph, &iov, 1, 0);
+ tcp_update_check_tcp4(iph, &bp->th, &iov, 1, 0);
}
tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
@@ -1056,11 +1000,11 @@ static size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
bp->th.check = 0;
} else {
const struct iovec iov = {
- .iov_base = bp,
- .iov_len = ntohs(ip6h->payload_len)
+ .iov_base = bp->data,
+ .iov_len = dlen,
};
- tcp_update_check_tcp6(ip6h, &iov, 1, 0);
+ tcp_update_check_tcp6(ip6h, &bp->th, &iov, 1, 0);
}
tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/7] tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
2024-10-28 9:40 ` [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]() David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 9:40 ` [PATCH 3/7] tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame() David Gibson
` (4 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
This function only has callers in tcp_buf.c. More importantly, it's
inherently tied to the "buf" path, because it uses internal knowledge of
how we lay out the various headers across our locally allocated buffers.
Therefore, move it to tcp_buf.c.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp.c | 49 ++++++++-----------------------------------------
tcp_buf.c | 32 ++++++++++++++++++++++++++++++++
tcp_internal.h | 12 ++++++++----
3 files changed, 48 insertions(+), 45 deletions(-)
diff --git a/tcp.c b/tcp.c
index f2898ff..7c6f51a 100644
--- a/tcp.c
+++ b/tcp.c
@@ -922,11 +922,10 @@ static void tcp_fill_header(struct tcphdr *th,
*
* Return: The IPv4 payload length, host order
*/
-static size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph,
- struct iphdr *iph, struct tcp_payload_t *bp,
- size_t dlen, const uint16_t *check,
- uint32_t seq, bool no_tcp_csum)
+size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct iphdr *iph,
+ struct tcp_payload_t *bp, size_t dlen,
+ const uint16_t *check, uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *src4 = inany_v4(&tapside->oaddr);
@@ -974,10 +973,10 @@ static size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
*
* Return: The IPv6 payload length, host order
*/
-static size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph,
- struct ipv6hdr *ip6h, struct tcp_payload_t *bp,
- size_t dlen, uint32_t seq, bool no_tcp_csum)
+size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct ipv6hdr *ip6h,
+ struct tcp_payload_t *bp, size_t dlen,
+ uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
size_t l4len = dlen + sizeof(bp->th);
@@ -1012,38 +1011,6 @@ static size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
return l4len;
}
-/**
- * tcp_l2_buf_fill_headers() - Fill 802.3, IP, TCP headers in pre-cooked buffers
- * @conn: Connection pointer
- * @iov: Pointer to an array of iovec of TCP pre-cooked buffers
- * @dlen: TCP payload length
- * @check: Checksum, if already known
- * @seq: Sequence number for this segment
- * @no_tcp_csum: Do not set TCP checksum
- *
- * Return: IP payload length, host order
- */
-size_t tcp_l2_buf_fill_headers(const struct tcp_tap_conn *conn,
- struct iovec *iov, size_t dlen,
- const uint16_t *check, uint32_t seq,
- bool no_tcp_csum)
-{
- const struct flowside *tapside = TAPFLOW(conn);
- const struct in_addr *a4 = inany_v4(&tapside->oaddr);
-
- if (a4) {
- return tcp_fill_headers4(conn, iov[TCP_IOV_TAP].iov_base,
- iov[TCP_IOV_IP].iov_base,
- iov[TCP_IOV_PAYLOAD].iov_base, dlen,
- check, seq, no_tcp_csum);
- }
-
- return tcp_fill_headers6(conn, iov[TCP_IOV_TAP].iov_base,
- iov[TCP_IOV_IP].iov_base,
- iov[TCP_IOV_PAYLOAD].iov_base, dlen,
- seq, no_tcp_csum);
-}
-
/**
* tcp_update_seqack_wnd() - Update ACK sequence and window to guest/tap
* @c: Execution context
diff --git a/tcp_buf.c b/tcp_buf.c
index cb6742c..dbe565c 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -256,6 +256,38 @@ void tcp_payload_flush(const struct ctx *c)
tcp4_payload_used = 0;
}
+/**
+ * tcp_buf_fill_headers() - Fill 802.3, IP, TCP headers in pre-cooked buffers
+ * @conn: Connection pointer
+ * @iov: Pointer to an array of iovec of TCP pre-cooked buffers
+ * @dlen: TCP payload length
+ * @check: Checksum, if already known
+ * @seq: Sequence number for this segment
+ * @no_tcp_csum: Do not set TCP checksum
+ *
+ * Return: IP payload length, host order
+ */
+static size_t tcp_l2_buf_fill_headers(const struct tcp_tap_conn *conn,
+ struct iovec *iov, size_t dlen,
+ const uint16_t *check, uint32_t seq,
+ bool no_tcp_csum)
+{
+ const struct flowside *tapside = TAPFLOW(conn);
+ const struct in_addr *a4 = inany_v4(&tapside->oaddr);
+
+ if (a4) {
+ return tcp_fill_headers4(conn, iov[TCP_IOV_TAP].iov_base,
+ iov[TCP_IOV_IP].iov_base,
+ iov[TCP_IOV_PAYLOAD].iov_base, dlen,
+ check, seq, no_tcp_csum);
+ }
+
+ return tcp_fill_headers6(conn, iov[TCP_IOV_TAP].iov_base,
+ iov[TCP_IOV_IP].iov_base,
+ iov[TCP_IOV_PAYLOAD].iov_base, dlen,
+ seq, no_tcp_csum);
+}
+
/**
* tcp_buf_send_flag() - Send segment with flags to tap (no payload)
* @c: Execution context
diff --git a/tcp_internal.h b/tcp_internal.h
index a5a47df..0034b22 100644
--- a/tcp_internal.h
+++ b/tcp_internal.h
@@ -177,10 +177,14 @@ void tcp_rst_do(const struct ctx *c, struct tcp_tap_conn *conn);
struct tcp_info_linux;
-size_t tcp_l2_buf_fill_headers(const struct tcp_tap_conn *conn,
- struct iovec *iov, size_t dlen,
- const uint16_t *check, uint32_t seq,
- bool no_tcp_csum);
+size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct iphdr *iph,
+ struct tcp_payload_t *bp, size_t dlen,
+ const uint16_t *check, uint32_t seq, bool no_tcp_csum);
+size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct ipv6hdr *ip6h,
+ struct tcp_payload_t *bp, size_t dlen,
+ uint32_t seq, bool no_tcp_csum);
int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
bool force_seq, struct tcp_info_linux *tinfo);
int tcp_prepare_flags(const struct ctx *c, struct tcp_tap_conn *conn,
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/7] tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame()
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
2024-10-28 9:40 ` [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]() David Gibson
2024-10-28 9:40 ` [PATCH 2/7] tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 9:40 ` [PATCH 4/7] tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len David Gibson
` (3 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
tcp_l2_buf_fill_headers() is always followed by updating the payload IOV
entry to the correct length of the frame. It already needs knowledge of
the frame/IOV layout, so we might as well perform that update inside the
function. Rename it to tcp_buf_make_frame() to reflect its expanded
duties.
While we're there use some temporaries to make our dissection of the IOV a
bit clearer.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp_buf.c | 46 +++++++++++++++++++++-------------------------
1 file changed, 21 insertions(+), 25 deletions(-)
diff --git a/tcp_buf.c b/tcp_buf.c
index dbe565c..deb7be4 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -257,35 +257,38 @@ void tcp_payload_flush(const struct ctx *c)
}
/**
- * tcp_buf_fill_headers() - Fill 802.3, IP, TCP headers in pre-cooked buffers
+ * tcp_buf_make_frame() - Adjust IOV, complete headers to build a TCP frame
* @conn: Connection pointer
* @iov: Pointer to an array of iovec of TCP pre-cooked buffers
* @dlen: TCP payload length
* @check: Checksum, if already known
* @seq: Sequence number for this segment
* @no_tcp_csum: Do not set TCP checksum
- *
- * Return: IP payload length, host order
*/
-static size_t tcp_l2_buf_fill_headers(const struct tcp_tap_conn *conn,
- struct iovec *iov, size_t dlen,
- const uint16_t *check, uint32_t seq,
- bool no_tcp_csum)
+static void tcp_buf_make_frame(const struct tcp_tap_conn *conn,
+ struct iovec *iov, size_t dlen,
+ const uint16_t *check, uint32_t seq,
+ bool no_tcp_csum)
{
+ struct tcp_payload_t *payload = iov[TCP_IOV_PAYLOAD].iov_base;
+ struct tap_hdr *taph = iov[TCP_IOV_TAP].iov_base;
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *a4 = inany_v4(&tapside->oaddr);
+ size_t l4len;
if (a4) {
- return tcp_fill_headers4(conn, iov[TCP_IOV_TAP].iov_base,
- iov[TCP_IOV_IP].iov_base,
- iov[TCP_IOV_PAYLOAD].iov_base, dlen,
- check, seq, no_tcp_csum);
+ struct iphdr *iph = iov[TCP_IOV_IP].iov_base;
+
+ l4len = tcp_fill_headers4(conn, taph, iph, payload, dlen,
+ check, seq, no_tcp_csum);
+ } else {
+ struct ipv6hdr *ip6h = iov[TCP_IOV_IP].iov_base;
+
+ l4len = tcp_fill_headers6(conn, taph, ip6h, payload, dlen,
+ seq, no_tcp_csum);
}
- return tcp_fill_headers6(conn, iov[TCP_IOV_TAP].iov_base,
- iov[TCP_IOV_IP].iov_base,
- iov[TCP_IOV_PAYLOAD].iov_base, dlen,
- seq, no_tcp_csum);
+ iov[TCP_IOV_PAYLOAD].iov_len = l4len;
}
/**
@@ -301,7 +304,6 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
struct tcp_flags_t *payload;
struct iovec *iov;
size_t optlen;
- size_t l4len;
uint32_t seq;
int ret;
@@ -323,8 +325,7 @@ int tcp_buf_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
return ret;
}
- l4len = tcp_l2_buf_fill_headers(conn, iov, optlen, NULL, seq, false);
- iov[TCP_IOV_PAYLOAD].iov_len = l4len;
+ tcp_buf_make_frame(conn, iov, optlen, NULL, seq, false);
if (flags & DUP_ACK) {
struct iovec *dup_iov;
@@ -368,7 +369,6 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
ssize_t dlen, int no_csum, uint32_t seq)
{
struct iovec *iov;
- size_t l4len;
conn->seq_to_tap = seq + dlen;
@@ -384,18 +384,14 @@ static void tcp_data_to_tap(const struct ctx *c, struct tcp_tap_conn *conn,
tcp4_frame_conns[tcp4_payload_used] = conn;
iov = tcp4_l2_iov[tcp4_payload_used++];
- l4len = tcp_l2_buf_fill_headers(conn, iov, dlen, check, seq,
- false);
- iov[TCP_IOV_PAYLOAD].iov_len = l4len;
+ tcp_buf_make_frame(conn, iov, dlen, check, seq, false);
if (tcp4_payload_used > TCP_FRAMES_MEM - 1)
tcp_payload_flush(c);
} else if (CONN_V6(conn)) {
tcp6_frame_conns[tcp6_payload_used] = conn;
iov = tcp6_l2_iov[tcp6_payload_used++];
- l4len = tcp_l2_buf_fill_headers(conn, iov, dlen, NULL, seq,
- false);
- iov[TCP_IOV_PAYLOAD].iov_len = l4len;
+ tcp_buf_make_frame(conn, iov, dlen, NULL, seq, false);
if (tcp6_payload_used > TCP_FRAMES_MEM - 1)
tcp_payload_flush(c);
}
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/7] tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
` (2 preceding siblings ...)
2024-10-28 9:40 ` [PATCH 3/7] tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame() David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 9:40 ` [PATCH 5/7] tcp: Pass TCP header and payload separately to tcp_fill_headers[46]() David Gibson
` (2 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
Currently tcp_fill_headers[46] return the size of the IP payload, which
we use to adjust the size of the last IOV entry for the frame, so that it
only includes the expected data. This was originally done to isolate
knowledge of the header layout to the header building functions. However,
we have since reorganized from a single buffer for the frame to an IO vector of
pieces, which means we already know something about the layout in the
caller.
Use that knowledge to adjust iov_len *before* we call tcp_fill_headers*().
This means that the header building functions are called with the IOV
containing the frame and only the frame, which will be useful later on.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
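For illustration only (not part of the patch): a small standalone sketch of
the ordering described above, where the caller trims the last IOV entry to
"TCP header plus data" before any header-building code runs. The names
(SK_IOV_*, SK_TCP_HDR_LEN, sk_frame_trim()) are hypothetical and only loosely
mirror the TCP_IOV_* indices used in tcp_buf.c; a fixed 20-byte TCP header
without options is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical per-frame IOV layout: tap header, IP header, TCP header+data */
enum { SK_IOV_TAP, SK_IOV_IP, SK_IOV_PAYLOAD, SK_IOV_NUM };

#define SK_TCP_HDR_LEN	20	/* assumed: TCP header without options */

/* Trim the payload entry so the IOV describes the frame and only the frame,
 * then return the total frame length seen by whoever fills in the headers.
 * @payload_cap is the size of the buffer backing the payload entry.
 */
static size_t sk_frame_trim(struct iovec *iov, size_t payload_cap, size_t dlen)
{
	size_t total = 0;
	int i;

	assert(dlen + SK_TCP_HDR_LEN <= payload_cap);

	iov[SK_IOV_PAYLOAD].iov_len = dlen + SK_TCP_HDR_LEN;

	for (i = 0; i < SK_IOV_NUM; i++)
		total += iov[i].iov_len;

	return total;
}
```

With the length set up front, a header-building function can take lengths
from the IOV itself instead of returning one for the caller to patch in
afterwards.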
tcp.c | 24 ++++++++----------------
tcp_buf.c | 13 ++++++-------
tcp_internal.h | 17 +++++++++--------
3 files changed, 23 insertions(+), 31 deletions(-)
diff --git a/tcp.c b/tcp.c
index 7c6f51a..d1e71ec 100644
--- a/tcp.c
+++ b/tcp.c
@@ -919,13 +919,11 @@ static void tcp_fill_header(struct tcphdr *th,
* @check: Checksum, if already known
* @seq: Sequence number for this segment
* @no_tcp_csum: Do not set TCP checksum
- *
- * Return: The IPv4 payload length, host order
*/
-size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph, struct iphdr *iph,
- struct tcp_payload_t *bp, size_t dlen,
- const uint16_t *check, uint32_t seq, bool no_tcp_csum)
+void tcp_fill_headers4(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct iphdr *iph,
+ struct tcp_payload_t *bp, size_t dlen,
+ const uint16_t *check, uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *src4 = inany_v4(&tapside->oaddr);
@@ -956,8 +954,6 @@ size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
}
tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
-
- return l4len;
}
/**
@@ -970,13 +966,11 @@ size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
* @check: Checksum, if already known
* @seq: Sequence number for this segment
* @no_tcp_csum: Do not set TCP checksum
- *
- * Return: The IPv6 payload length, host order
*/
-size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph, struct ipv6hdr *ip6h,
- struct tcp_payload_t *bp, size_t dlen,
- uint32_t seq, bool no_tcp_csum)
+void tcp_fill_headers6(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct ipv6hdr *ip6h,
+ struct tcp_payload_t *bp, size_t dlen,
+ uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
size_t l4len = dlen + sizeof(bp->th);
@@ -1007,8 +1001,6 @@ size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
}
tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
-
- return l4len;
}
/**
diff --git a/tcp_buf.c b/tcp_buf.c
index deb7be4..b2d806a 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -274,21 +274,20 @@ static void tcp_buf_make_frame(const struct tcp_tap_conn *conn,
struct tap_hdr *taph = iov[TCP_IOV_TAP].iov_base;
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *a4 = inany_v4(&tapside->oaddr);
- size_t l4len;
+
+ iov[TCP_IOV_PAYLOAD].iov_len = dlen + sizeof(struct tcphdr);
if (a4) {
struct iphdr *iph = iov[TCP_IOV_IP].iov_base;
- l4len = tcp_fill_headers4(conn, taph, iph, payload, dlen,
- check, seq, no_tcp_csum);
+ tcp_fill_headers4(conn, taph, iph, payload, dlen,
+ check, seq, no_tcp_csum);
} else {
struct ipv6hdr *ip6h = iov[TCP_IOV_IP].iov_base;
- l4len = tcp_fill_headers6(conn, taph, ip6h, payload, dlen,
- seq, no_tcp_csum);
+ tcp_fill_headers6(conn, taph, ip6h, payload, dlen,
+ seq, no_tcp_csum);
}
-
- iov[TCP_IOV_PAYLOAD].iov_len = l4len;
}
/**
diff --git a/tcp_internal.h b/tcp_internal.h
index 0034b22..6cf5ad6 100644
--- a/tcp_internal.h
+++ b/tcp_internal.h
@@ -177,14 +177,15 @@ void tcp_rst_do(const struct ctx *c, struct tcp_tap_conn *conn);
struct tcp_info_linux;
-size_t tcp_fill_headers4(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph, struct iphdr *iph,
- struct tcp_payload_t *bp, size_t dlen,
- const uint16_t *check, uint32_t seq, bool no_tcp_csum);
-size_t tcp_fill_headers6(const struct tcp_tap_conn *conn,
- struct tap_hdr *taph, struct ipv6hdr *ip6h,
- struct tcp_payload_t *bp, size_t dlen,
- uint32_t seq, bool no_tcp_csum);
+void tcp_fill_headers4(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct iphdr *iph,
+ struct tcp_payload_t *bp, size_t dlen,
+ const uint16_t *check, uint32_t seq, bool no_tcp_csum);
+void tcp_fill_headers6(const struct tcp_tap_conn *conn,
+ struct tap_hdr *taph, struct ipv6hdr *ip6h,
+ struct tcp_payload_t *bp, size_t dlen,
+ uint32_t seq, bool no_tcp_csum);
+
int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
bool force_seq, struct tcp_info_linux *tinfo);
int tcp_prepare_flags(const struct ctx *c, struct tcp_tap_conn *conn,
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/7] tcp: Pass TCP header and payload separately to tcp_fill_headers[46]()
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
` (3 preceding siblings ...)
2024-10-28 9:40 ` [PATCH 4/7] tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 9:40 ` [PATCH 6/7] tcp: Merge tcp_update_check_tcp[46]() David Gibson
2024-10-28 9:40 ` [PATCH 7/7] tcp: Fold tcp_update_csum() into tcp_fill_header() David Gibson
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
At the moment these take separate pointers to the tap specific and IP
headers, but expect the TCP header and payload as a single tcp_payload_t.
As well as being slightly inconsistent, this involves some slightly iffy
pointer shenanigans when called on the flags path with a tcp_flags_t
instead of a tcp_payload_t.
More importantly, it's inconvenient for the upcoming vhost-user case, where
the TCP header and payload might not be contiguous. Furthermore, the
payload itself might not be contiguous.
So, pass the TCP header as its own pointer, and the TCP payload as an IO
vector.
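
As an illustration only (not part of the patch), here is a minimal sketch
of how the payload length can be derived once the payload is described by
an IO vector plus an offset. The names example_iov_size() and
example_payload_len() are hypothetical stand-ins; the patch itself relies
on the existing iov_size() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper: total number of bytes described by an IO vector */
static size_t example_iov_size(const struct iovec *iov, size_t iov_cnt)
{
	size_t i, len = 0;

	for (i = 0; i < iov_cnt; i++)
		len += iov[i].iov_len;

	return len;
}

/* Payload length when the payload starts @doffset bytes into the vector.
 * The entries need not be contiguous in memory, only their lengths matter.
 */
static size_t example_payload_len(const struct iovec *iov, size_t iov_cnt,
				  size_t doffset)
{
	size_t total = example_iov_size(iov, iov_cnt);

	assert(doffset <= total);	/* offset must lie inside the vector */
	return total - doffset;
}
```

With a single entry holding a TCP header followed by data, doffset would
be the header size, which is how the buffer path uses it in this patch.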
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp.c | 56 +++++++++++++++++++++++---------------------------
tcp_buf.c | 7 ++++---
tcp_internal.h | 6 ++++--
3 files changed, 34 insertions(+), 35 deletions(-)
diff --git a/tcp.c b/tcp.c
index d1e71ec..787bc19 100644
--- a/tcp.c
+++ b/tcp.c
@@ -914,21 +914,25 @@ static void tcp_fill_header(struct tcphdr *th,
* @conn: Connection pointer
* @taph: tap backend specific header
* @iph: Pointer to IPv4 header
- * @bp: Pointer to TCP header followed by TCP payload
- * @dlen: TCP payload length
+ * @th: Pointer to TCP header
+ * @iov: IO vector containing payload
+ * @iov_cnt: Number of entries in @iov
+ * @doffset: Offset of the TCP payload within @iov
* @check: Checksum, if already known
* @seq: Sequence number for this segment
* @no_tcp_csum: Do not set TCP checksum
*/
void tcp_fill_headers4(const struct tcp_tap_conn *conn,
struct tap_hdr *taph, struct iphdr *iph,
- struct tcp_payload_t *bp, size_t dlen,
+ struct tcphdr *th,
+ const struct iovec *iov, size_t iov_cnt, size_t doffset,
const uint16_t *check, uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *src4 = inany_v4(&tapside->oaddr);
const struct in_addr *dst4 = inany_v4(&tapside->eaddr);
- size_t l4len = dlen + sizeof(bp->th);
+ size_t dlen = iov_size(iov, iov_cnt) - doffset;
+ size_t l4len = dlen + sizeof(*th);
size_t l3len = l4len + sizeof(*iph);
ASSERT(src4 && dst4);
@@ -940,18 +944,12 @@ void tcp_fill_headers4(const struct tcp_tap_conn *conn,
iph->check = check ? *check :
csum_ip4_header(l3len, IPPROTO_TCP, *src4, *dst4);
- tcp_fill_header(&bp->th, conn, seq);
+ tcp_fill_header(th, conn, seq);
- if (no_tcp_csum) {
- bp->th.check = 0;
- } else {
- const struct iovec iov = {
- .iov_base = bp->data,
- .iov_len = dlen,
- };
-
- tcp_update_check_tcp4(iph, &bp->th, &iov, 1, 0);
- }
+ if (no_tcp_csum)
+ th->check = 0;
+ else
+ tcp_update_check_tcp4(iph, th, iov, iov_cnt, doffset);
tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
}
@@ -961,19 +959,23 @@ void tcp_fill_headers4(const struct tcp_tap_conn *conn,
* @conn: Connection pointer
* @taph: tap backend specific header
* @ip6h: Pointer to IPv6 header
- * @bp: Pointer to TCP header followed by TCP payload
- * @dlen: TCP payload length
+ * @th: Pointer to TCP header
+ * @iov: IO vector containing payload
+ * @iov_cnt: Number of entries in @iov
+ * @doffset: Offset of the TCP payload within @iov
* @check: Checksum, if already known
* @seq: Sequence number for this segment
* @no_tcp_csum: Do not set TCP checksum
*/
void tcp_fill_headers6(const struct tcp_tap_conn *conn,
struct tap_hdr *taph, struct ipv6hdr *ip6h,
- struct tcp_payload_t *bp, size_t dlen,
+ struct tcphdr *th,
+ const struct iovec *iov, size_t iov_cnt, size_t doffset,
uint32_t seq, bool no_tcp_csum)
{
const struct flowside *tapside = TAPFLOW(conn);
- size_t l4len = dlen + sizeof(bp->th);
+ size_t dlen = iov_size(iov, iov_cnt) - doffset;
+ size_t l4len = dlen + sizeof(*th);
ip6h->payload_len = htons(l4len);
ip6h->saddr = tapside->oaddr.a6;
@@ -987,18 +989,12 @@ void tcp_fill_headers6(const struct tcp_tap_conn *conn,
ip6h->flow_lbl[1] = (conn->sock >> 8) & 0xff;
ip6h->flow_lbl[2] = (conn->sock >> 0) & 0xff;
- tcp_fill_header(&bp->th, conn, seq);
+ tcp_fill_header(th, conn, seq);
- if (no_tcp_csum) {
- bp->th.check = 0;
- } else {
- const struct iovec iov = {
- .iov_base = bp->data,
- .iov_len = dlen,
- };
-
- tcp_update_check_tcp6(ip6h, &bp->th, &iov, 1, 0);
- }
+ if (no_tcp_csum)
+ th->check = 0;
+ else
+ tcp_update_check_tcp6(ip6h, th, iov, iov_cnt, doffset);
tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
}
diff --git a/tcp_buf.c b/tcp_buf.c
index b2d806a..c03dd33 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -270,8 +270,9 @@ static void tcp_buf_make_frame(const struct tcp_tap_conn *conn,
const uint16_t *check, uint32_t seq,
bool no_tcp_csum)
{
- struct tcp_payload_t *payload = iov[TCP_IOV_PAYLOAD].iov_base;
+ struct tcphdr *th = iov[TCP_IOV_PAYLOAD].iov_base;
struct tap_hdr *taph = iov[TCP_IOV_TAP].iov_base;
+ const struct iovec *tail = &iov[TCP_IOV_PAYLOAD];
const struct flowside *tapside = TAPFLOW(conn);
const struct in_addr *a4 = inany_v4(&tapside->oaddr);
@@ -280,12 +281,12 @@ static void tcp_buf_make_frame(const struct tcp_tap_conn *conn,
if (a4) {
struct iphdr *iph = iov[TCP_IOV_IP].iov_base;
- tcp_fill_headers4(conn, taph, iph, payload, dlen,
+ tcp_fill_headers4(conn, taph, iph, th, tail, 1, sizeof(*th),
check, seq, no_tcp_csum);
} else {
struct ipv6hdr *ip6h = iov[TCP_IOV_IP].iov_base;
- tcp_fill_headers6(conn, taph, ip6h, payload, dlen,
+ tcp_fill_headers6(conn, taph, ip6h, th, tail, 1, sizeof(*th),
seq, no_tcp_csum);
}
}
diff --git a/tcp_internal.h b/tcp_internal.h
index 6cf5ad6..8bdfc77 100644
--- a/tcp_internal.h
+++ b/tcp_internal.h
@@ -179,11 +179,13 @@ struct tcp_info_linux;
void tcp_fill_headers4(const struct tcp_tap_conn *conn,
struct tap_hdr *taph, struct iphdr *iph,
- struct tcp_payload_t *bp, size_t dlen,
+ struct tcphdr *th,
+ const struct iovec *iov, size_t iov_cnt, size_t doffset,
const uint16_t *check, uint32_t seq, bool no_tcp_csum);
void tcp_fill_headers6(const struct tcp_tap_conn *conn,
struct tap_hdr *taph, struct ipv6hdr *ip6h,
- struct tcp_payload_t *bp, size_t dlen,
+ struct tcphdr *th,
+ const struct iovec *iov, size_t iov_cnt, size_t doffset,
uint32_t seq, bool no_tcp_csum);
int tcp_update_seqack_wnd(const struct ctx *c, struct tcp_tap_conn *conn,
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/7] tcp: Merge tcp_update_check_tcp[46]()
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
` (4 preceding siblings ...)
2024-10-28 9:40 ` [PATCH 5/7] tcp: Pass TCP header and payload separately to tcp_fill_headers[46]() David Gibson
@ 2024-10-28 9:40 ` David Gibson
2024-10-28 9:40 ` [PATCH 7/7] tcp: Fold tcp_update_csum() into tcp_fill_header() David Gibson
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
The only reason we need separate functions for the IPv4 and IPv6 case is
to calculate the checksum of the IP pseudo-header, which is different for
the two cases. However, the caller already knows which path it's on and
can access the values needed for the pseudo-header partial sum more easily
than tcp_update_check_tcp[46]() can.
So, merge these functions into a single tcp_update_csum() function that
just takes the pseudo-header partial sum, calculated in the caller.
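
For readers less familiar with the checksum layout, a rough standalone
sketch of the idea follows. It is not the passt code: every example_*()
name is hypothetical, the header is a plain byte array, and the payload
is a single buffer rather than an IO vector. It only shows that, once the
caller has summed the pseudo-header, the rest of the calculation is the
same for IPv4 and IPv6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add @len bytes at @buf to an unfolded sum, as big-endian 16-bit words.
 * An odd trailing byte is padded with zero, so only the last buffer of a
 * packet may have odd length.
 */
static uint32_t example_csum_unfolded(const void *buf, size_t len,
				      uint32_t sum)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (len & 1)
		sum += (uint32_t)p[len - 1] << 8;

	return sum;
}

/* Fold carries back into 16 bits and take the one's complement */
static uint16_t example_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* IPv4 pseudo-header: source, destination, protocol, L4 length */
static uint32_t example_psum4(uint16_t l4len, uint8_t proto,
			      const uint8_t src[4], const uint8_t dst[4])
{
	uint32_t sum = example_csum_unfolded(src, 4, 0);

	sum = example_csum_unfolded(dst, 4, sum);
	return sum + proto + l4len;
}

/* IPv6 pseudo-header: source, destination, 32-bit length, next header */
static uint32_t example_psum6(uint32_t l4len, uint8_t proto,
			      const uint8_t src[16], const uint8_t dst[16])
{
	uint32_t sum = example_csum_unfolded(src, 16, 0);

	sum = example_csum_unfolded(dst, 16, sum);
	return sum + (l4len >> 16) + (l4len & 0xffff) + proto;
}

/* Address-family independent part: header plus payload on top of @psum.
 * The caller zeroes the checksum field in @th before calling this.
 */
static uint16_t example_tcp_csum(uint32_t psum, const void *th, size_t hlen,
				 const void *data, size_t dlen)
{
	assert(hlen % 2 == 0);	/* header must end on a 16-bit boundary */
	psum = example_csum_unfolded(th, hlen, psum);
	psum = example_csum_unfolded(data, dlen, psum);

	return example_csum_fold(psum);
}
```

The value returned is the checksum as a host-order number; writing it to
the header as two big-endian bytes and summing again yields zero, which
is the usual way to verify a received segment.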
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp.c | 65 ++++++++++++++++++++---------------------------------------
1 file changed, 22 insertions(+), 43 deletions(-)
diff --git a/tcp.c b/tcp.c
index 787bc19..e9f62a4 100644
--- a/tcp.c
+++ b/tcp.c
@@ -751,50 +751,20 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
}
/**
- * tcp_update_check_tcp4() - Calculate TCP checksum for IPv4
- * @iph: IPv4 header
+ * tcp_update_csum() - Calculate TCP checksum
+ * @psum: Unfolded partial checksum of the IPv4 or IPv6 pseudo-header
* @th: TCP header (updated)
* @iov: IO vector containing the TCP payload
* @iov_cnt: Length of @iov
* @doffset: TCP payload offset in @iov
*/
-static void tcp_update_check_tcp4(const struct iphdr *iph, struct tcphdr *th,
- const struct iovec *iov, int iov_cnt,
- size_t doffset)
+static void tcp_update_csum(uint32_t psum, struct tcphdr *th,
+ const struct iovec *iov, int iov_cnt,
+ size_t doffset)
{
- uint16_t l4len = ntohs(iph->tot_len) - sizeof(struct iphdr);
- struct in_addr saddr = { .s_addr = iph->saddr };
- struct in_addr daddr = { .s_addr = iph->daddr };
- uint32_t sum;
-
- sum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, saddr, daddr);
-
- th->check = 0;
- sum = csum_unfolded(th, sizeof(*th), sum);
- th->check = csum_iov(iov, iov_cnt, doffset, sum);
-}
-
-/**
- * tcp_update_check_tcp6() - Calculate TCP checksum for IPv6
- * @ip6h: IPv6 header
- * @th: TCP header (updated)
- * @iov: IO vector containing the TCP payload
- * @iov_cnt: Length of @iov
- * @doffset: TCP payload offset in @iov
- */
-static void tcp_update_check_tcp6(const struct ipv6hdr *ip6h, struct tcphdr *th,
- const struct iovec *iov, int iov_cnt,
- size_t doffset)
-{
- uint16_t l4len = ntohs(ip6h->payload_len);
- uint32_t sum;
-
- sum = proto_ipv6_header_psum(l4len, IPPROTO_TCP, &ip6h->saddr,
- &ip6h->daddr);
-
th->check = 0;
- sum = csum_unfolded(th, sizeof(*th), sum);
- th->check = csum_iov(iov, iov_cnt, doffset, sum);
+ psum = csum_unfolded(th, sizeof(*th), psum);
+ th->check = csum_iov(iov, iov_cnt, doffset, psum);
}
/**
@@ -946,10 +916,14 @@ void tcp_fill_headers4(const struct tcp_tap_conn *conn,
tcp_fill_header(th, conn, seq);
- if (no_tcp_csum)
+ if (no_tcp_csum) {
th->check = 0;
- else
- tcp_update_check_tcp4(iph, th, iov, iov_cnt, doffset);
+ } else {
+ uint32_t psum = proto_ipv4_header_psum(l4len, IPPROTO_TCP,
+ *src4, *dst4);
+
+ tcp_update_csum(psum, th, iov, iov_cnt, doffset);
+ }
tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
}
@@ -991,10 +965,15 @@ void tcp_fill_headers6(const struct tcp_tap_conn *conn,
tcp_fill_header(th, conn, seq);
- if (no_tcp_csum)
+ if (no_tcp_csum) {
th->check = 0;
- else
- tcp_update_check_tcp6(ip6h, th, iov, iov_cnt, doffset);
+ } else {
+ uint32_t psum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
+ &ip6h->saddr,
+ &ip6h->daddr);
+
+ tcp_update_csum(psum, th, iov, iov_cnt, doffset);
+ }
tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
}
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 7/7] tcp: Fold tcp_update_csum() into tcp_fill_header()
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
` (5 preceding siblings ...)
2024-10-28 9:40 ` [PATCH 6/7] tcp: Merge tcp_update_check_tcp[46]() David Gibson
@ 2024-10-28 9:40 ` David Gibson
6 siblings, 0 replies; 14+ messages in thread
From: David Gibson @ 2024-10-28 9:40 UTC (permalink / raw)
To: Stefano Brivio, Laurent Vivier, passt-dev; +Cc: David Gibson
tcp_update_csum() is now simple enough that it makes sense to just fold it
into tcp_fill_header(), meaning the latter now really does fill all the
header fields.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tcp.c | 66 ++++++++++++++++++++++++++---------------------------------
1 file changed, 29 insertions(+), 37 deletions(-)
diff --git a/tcp.c b/tcp.c
index e9f62a4..88a285b 100644
--- a/tcp.c
+++ b/tcp.c
@@ -750,23 +750,6 @@ static void tcp_sock_set_bufsize(const struct ctx *c, int s)
trace("TCP: failed to set SO_SNDBUF to %i", v);
}
-/**
- * tcp_update_csum() - Calculate TCP checksum
- * @psum: Unfolded partial checksum of the IPv4 or IPv6 pseudo-header
- * @th: TCP header (updated)
- * @iov: IO vector containing the TCP payload
- * @iov_cnt: Length of @iov
- * @doffset: TCP payload offset in @iov
- */
-static void tcp_update_csum(uint32_t psum, struct tcphdr *th,
- const struct iovec *iov, int iov_cnt,
- size_t doffset)
-{
- th->check = 0;
- psum = csum_unfolded(th, sizeof(*th), psum);
- th->check = csum_iov(iov, iov_cnt, doffset, psum);
-}
-
/**
* tcp_opt_get() - Get option, and value if any, from TCP header
* @opts: Pointer to start of TCP options in header
@@ -860,9 +843,16 @@ void tcp_defer_handler(struct ctx *c)
* @th: Pointer to the TCP header structure
* @conn: Pointer to the TCP connection structure
* @seq: Sequence number
+ * @ppsum: Pointer to pseudo-header checksum, or NULL to omit checksum
+ * @iov: IO vector containing the TCP payload
+ * @iov_cnt: Length of @iov
+ * @doffset: TCP payload offset in @iov
*/
static void tcp_fill_header(struct tcphdr *th,
- const struct tcp_tap_conn *conn, uint32_t seq)
+ const struct tcp_tap_conn *conn, uint32_t seq,
+ const uint32_t *ppsum,
+ const struct iovec *iov, size_t iov_cnt,
+ size_t doffset)
{
const struct flowside *tapside = TAPFLOW(conn);
@@ -877,6 +867,11 @@ static void tcp_fill_header(struct tcphdr *th,
th->window = htons(MIN(wnd, USHRT_MAX));
}
+ th->check = 0;
+ if (ppsum) {
+ uint32_t sum = csum_unfolded(th, sizeof(*th), *ppsum);
+ th->check = csum_iov(iov, iov_cnt, doffset, sum);
+ }
}
/**
@@ -904,6 +899,8 @@ void tcp_fill_headers4(const struct tcp_tap_conn *conn,
size_t dlen = iov_size(iov, iov_cnt) - doffset;
size_t l4len = dlen + sizeof(*th);
size_t l3len = l4len + sizeof(*iph);
+ const uint32_t *ppsum = NULL;
+ uint32_t psum;
ASSERT(src4 && dst4);
@@ -914,17 +911,13 @@ void tcp_fill_headers4(const struct tcp_tap_conn *conn,
iph->check = check ? *check :
csum_ip4_header(l3len, IPPROTO_TCP, *src4, *dst4);
- tcp_fill_header(th, conn, seq);
-
- if (no_tcp_csum) {
- th->check = 0;
- } else {
- uint32_t psum = proto_ipv4_header_psum(l4len, IPPROTO_TCP,
- *src4, *dst4);
-
- tcp_update_csum(psum, th, iov, iov_cnt, doffset);
+ if (!no_tcp_csum) {
+ psum = proto_ipv4_header_psum(l4len, IPPROTO_TCP, *src4, *dst4);
+ ppsum = &psum;
}
+ tcp_fill_header(th, conn, seq, ppsum, iov, iov_cnt, doffset);
+
tap_hdr_update(taph, l3len + sizeof(struct ethhdr));
}
@@ -950,6 +943,8 @@ void tcp_fill_headers6(const struct tcp_tap_conn *conn,
const struct flowside *tapside = TAPFLOW(conn);
size_t dlen = iov_size(iov, iov_cnt) - doffset;
size_t l4len = dlen + sizeof(*th);
+ const uint32_t *ppsum = NULL;
+ uint32_t psum;
ip6h->payload_len = htons(l4len);
ip6h->saddr = tapside->oaddr.a6;
@@ -963,18 +958,15 @@ void tcp_fill_headers6(const struct tcp_tap_conn *conn,
ip6h->flow_lbl[1] = (conn->sock >> 8) & 0xff;
ip6h->flow_lbl[2] = (conn->sock >> 0) & 0xff;
- tcp_fill_header(th, conn, seq);
-
- if (no_tcp_csum) {
- th->check = 0;
- } else {
- uint32_t psum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
- &ip6h->saddr,
- &ip6h->daddr);
-
- tcp_update_csum(psum, th, iov, iov_cnt, doffset);
+ if (!no_tcp_csum) {
+ psum = proto_ipv6_header_psum(l4len, IPPROTO_TCP,
+ &ip6h->saddr,
+ &ip6h->daddr);
+ ppsum = &psum;
}
+ tcp_fill_header(th, conn, seq, ppsum, iov, iov_cnt, doffset);
+
tap_hdr_update(taph, l4len + sizeof(*ip6h) + sizeof(struct ethhdr));
}
--
2.47.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-28 9:40 ` [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]() David Gibson
@ 2024-10-28 18:42 ` Stefano Brivio
2024-10-29 3:02 ` David Gibson
0 siblings, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2024-10-28 18:42 UTC (permalink / raw)
To: David Gibson; +Cc: Laurent Vivier, passt-dev
On Mon, 28 Oct 2024 20:40:44 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> Currently these expect both the TCP header and payload in a single IOV,
> and go to some trouble to locate the checksum field within it. In the
> current caller we already know where the TCP header is, so we might as
> well just pass it in. This will need to work a bit differently for
> vhost-user, but that code already needs to locate the TCP header for other
> reasons, so again we can just pass it in.
We couldn't do this, and also what you're now doing in 5/7, because
with vhost-user the TCP header is not aligned, so we can't pass it
around as a pointer, see:
<ZeUpxEY-sn64NLE5@zatzit>
https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
and following. That one is about IP headers, but the same applies to
TCP and UDP headers.
Of course the current solution is not elegant and it would be nice to
find another way to deal with it, but we couldn't come up with anything
better back then.
The rest of the series looks good to me, but I'm afraid that without
this one and 5/7 the other changes will be a bit more complicated to
implement (if at all possible).
--
Stefano
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-28 18:42 ` Stefano Brivio
@ 2024-10-29 3:02 ` David Gibson
2024-10-29 4:07 ` David Gibson
0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-10-29 3:02 UTC (permalink / raw)
To: Stefano Brivio; +Cc: Laurent Vivier, passt-dev
[-- Attachment #1: Type: text/plain, Size: 2134 bytes --]
On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> On Mon, 28 Oct 2024 20:40:44 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > Currently these expect both the TCP header and payload in a single IOV,
> > and go to some trouble to locate the checksum field within it. In the
> > current caller we already know where the TCP header is, so we might as
> > well just pass it in. This will need to work a bit differently for
> > vhost-user, but that code already needs to locate the TCP header for other
> > reasons, so again we can just pass it in.
>
> We couldn't do this, and also what you're now doing in 5/7, because
> with vhost-user the TCP header is not aligned, so we can't pass it
> around as a pointer, see:
>
> <ZeUpxEY-sn64NLE5@zatzit>
> https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
>
> and following. That one is about IP headers, but the same applies to
> TCP and UDP headers.
Hrm. I'm aware it theoretically need not be aligned, but I thought it
was in practice.. and that we were already relying on that.
In fact, I'm pretty sure the second part is true, although more subtly
than here. v8 of the vhost-user patches calls tcp_fill_headers[46]()
with the bp parameter set to the offset of the TCP header. If
creating a tcphdr * there is a problem, then creating a tcp_payload_t
* can't be any better.
> Of course the current solution is not elegant and it would be nice to
> find another way to deal with it, but we couldn't come up with anything
> better back then.
>
> The rest of the series looks good to me, but I'm afraid that without
> this one and 5/7 the other changes will be a bit more complicated to
> implement (if at all possible).
Definitely. I have some ideas for approaches more robust to
misalignment, but they're substantially more complicated. I was
hoping we could avoid it at least for now.
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-29 3:02 ` David Gibson
@ 2024-10-29 4:07 ` David Gibson
2024-10-29 9:09 ` Stefano Brivio
0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-10-29 4:07 UTC (permalink / raw)
To: Stefano Brivio; +Cc: Laurent Vivier, passt-dev
[-- Attachment #1: Type: text/plain, Size: 2598 bytes --]
On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > On Mon, 28 Oct 2024 20:40:44 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > Currently these expect both the TCP header and payload in a single IOV,
> > > and go to some trouble to locate the checksum field within it. In the
> > > current caller we already know where the TCP header is, so we might as
> > > well just pass it in. This will need to work a bit differently for
> > > vhost-user, but that code already needs to locate the TCP header for other
> > > reasons, so again we can just pass it in.
> >
> > We couldn't do this, and also what you're now doing in 5/7, because
> > with vhost-user the TCP header is not aligned, so we can't pass it
> > around as a pointer, see:
> >
> > <ZeUpxEY-sn64NLE5@zatzit>
> > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> >
> > and following. That one is about IP headers, but the same applies to
> > TCP and UDP headers.
>
> Hrm. I'm aware it theoretically need not be aligned, but I thought it
> was in practice.. and that we were already relying on that.
>
> In fact, I'm pretty sure the second part is true, although more subtly
> than here. v8 of the vhost-user patches calls tcp_fill_headers[46]()
> with the bp parameter set to the offset of the TCP header. If
> creating a tcphdr * there is a problem, then creating a tcp_payload_t
> * can't be any better.
>
> > Of course the current solution is not elegant and it would be nice to
> > find another way to deal with it, but we couldn't come up with anything
> > better back then.
> >
> > The rest of the series looks good to me, but I'm afraid that without
> > this one and 5/7 the other changes will be a bit more complicated to
> > implement (if at all possible).
>
> Definitely. I have some ideas for approaches more robust to
> misalignment, but they're substantially more complicated. I was
> hoping we could avoid it at least for now.
I had a closer look at that earlier message now. I believe at the
time I was aiming for fully robust handling of misaligned user
buffers. AIUI, we've given up on that for the time being: instead
we'll just *test* for suitable alignment and we can do the hard work
of handling it if it ever arises in practice.
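
To make "test for suitable alignment" concrete, something along these
lines is what I have in mind. This is only a sketch under my own
assumptions: example_is_aligned() and struct example_hdr are made-up
names, not anything in the tree or in the vhost-user series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a header type with 32-bit fields */
struct example_hdr {
	uint16_t source;
	uint16_t dest;
	uint32_t seq;
};

/* Check that @p is a multiple of @align before casting it to a type
 * with that alignment requirement.
 */
static bool example_is_aligned(const void *p, size_t align)
{
	assert(align != 0);
	return ((uintptr_t)p % align) == 0;
}

/* Return @buf as a header pointer, or NULL if it is not suitably aligned,
 * leaving it to the caller to drop the frame or take a slower path.
 */
static struct example_hdr *example_hdr_at(void *buf)
{
	if (!example_is_aligned(buf, _Alignof(struct example_hdr)))
		return NULL;

	return buf;
}
```

The point is just that the check is cheap and happens once, at the place
where we first turn a guest buffer address into a typed pointer.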
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-29 4:07 ` David Gibson
@ 2024-10-29 9:09 ` Stefano Brivio
2024-10-29 9:26 ` David Gibson
0 siblings, 1 reply; 14+ messages in thread
From: Stefano Brivio @ 2024-10-29 9:09 UTC (permalink / raw)
To: David Gibson; +Cc: Laurent Vivier, passt-dev
On Tue, 29 Oct 2024 15:07:56 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > Currently these expect both the TCP header and payload in a single IOV,
> > > > and go to some trouble to locate the checksum field within it. In the
> > > > current caller we already know where the TCP header is, so we might as
> > > > well just pass it in. This will need to work a bit differently for
> > > > vhost-user, but that code already needs to locate the TCP header for other
> > > > reasons, so again we can just pass it in.
> > >
> > > We couldn't do this, and also what you're now doing in 5/7, because
> > > with vhost-user the TCP header is not aligned, so we can't pass it
> > > around as a pointer, see:
> > >
> > > <ZeUpxEY-sn64NLE5@zatzit>
> > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > >
> > > and following. That one is about IP headers, but the same applies to
> > > TCP and UDP headers.
> >
> > Hrm. I'm aware it theoretically need not be aligned, but I thought it
> > was in practice.. and that we were already relying on that.
> >
> > In fact, I'm pretty sure the second part is true, although more subtly
> > than here. v8 of the vhost-user patches calls tcp_fill_headers[46]()
> > with the bp parameter set to the offset of the TCP header. If
> > creating a tcphdr * there is a problem, then creating a tcp_payload_t
> > * can't be any better.
Ah, okay, I missed that. Still, I think we should ask gcc for an
opinion (with the vhost-user series on top of this series), because
those build-time pointer alignment checks are pretty reliable.
> > > Of course the current solution is not elegant and it would be nice to
> > > find another way to deal with it, but we couldn't come up with anything
> > > better back then.
> > >
> > > The rest of the series looks good to me, but I'm afraid that without
> > > this one and 5/7 the other changes will be a bit more complicated to
> > > implement (if at all possible).
> >
> > Definitely. I have so ideas for approaches more robust to
> > misalignment, but they're substantially more complicated. I was
> > hoping we could avoid it at least for now.
>
> I had a closer look at that earlier message now. I believe at the
> time I was aiming for fully robust handling of misaligned user
> buffers. AIUI, we've given up on that for the time being: instead
> we'll just *test* for suitable alignment and we can do the hard work
> of handling it if it ever arises in practice.
Right, and we can use the compiler to test for suitable alignment.
--
Stefano
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-29 9:09 ` Stefano Brivio
@ 2024-10-29 9:26 ` David Gibson
2024-10-29 10:32 ` Stefano Brivio
0 siblings, 1 reply; 14+ messages in thread
From: David Gibson @ 2024-10-29 9:26 UTC (permalink / raw)
To: Stefano Brivio; +Cc: Laurent Vivier, passt-dev
On Tue, Oct 29, 2024 at 10:09:54AM +0100, Stefano Brivio wrote:
> On Tue, 29 Oct 2024 15:07:56 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >
> > > > > Currently these expects both the TCP header and payload in a single IOV,
> > > > > and goes to some trouble to locate the checksum field within it. In the
> > > > > current caller we've already know where the TCP header is, so we might as
> > > > > well just pass it in. This will need to work a bit differently for
> > > > > vhost-user, but that code already needs to locate the TCP header for other
> > > > > reasons, so again we can just pass it in.
> > > >
> > > > We couldn't do this, and also what you're now doing in 5/7, because
> > > > with vhost-user the TCP header is not aligned, so we can't pass it
> > > > around as a pointer, see:
> > > >
> > > > <ZeUpxEY-sn64NLE5@zatzit>
> > > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > > >
> > > > and following. That one is about IP headers, but the same applies to
> > > > TCP and UDP headers.
> > >
> > > Hrm. I'm aware it theoretically need not be aligned, but I thought it
> > > was in practice.. and that we were already relying on that.
> > >
> > > In fact, I'm pretty sure the second part is true, although more subtly
> > > than here. v8 of the vhost-user patches calls tcp_fill_headers[46]()
> > > with the bp parameter set to the offset of the TCP header. If
> > > creating a tcphdr * there is a problem, then creating a tcp_payload_t
> > > * can't be any better.
>
> Ah, okay, I missed that. Still, I think we should ask gcc for an
> opinion (with the vhost-user series on top of this series), because
> those build-time pointer alignment checks are pretty reliable.
I'm not exactly sure what you're suggesting with this. I don't think
the compiler will catch it in this case, because we're constructing
the (possibly) misaligned pointer as a (void *), then implicitly
casting it by passing it to a (tcp_payload_t *) argument. (void *) is
explicitly allowed to be cast to any pointer type, so I think the
compiler will take this as asserting we know what we're doing. More
fool it.
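
[Editorial illustration, not part of the original message.] A minimal
sketch of the point above, assuming a made-up header type: converting a
(void *) to a typed pointer needs no cast, so cast-alignment warnings
have nothing to flag, and any alignment check has to be done at run
time. `struct hdr` and `hdr_at()` are hypothetical names, not passt
code.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a protocol header (not a passt type) */
struct hdr {
	uint32_t seq;
	uint16_t check;
	uint16_t pad;
};

static_assert(alignof(struct hdr) > 1,
	      "example needs a type with an alignment requirement");

/* Return a typed pointer @off bytes into @base, or NULL if misaligned.
 *
 * The pointer is built as (void *) and converted implicitly on return:
 * no cast appears, so the compiler is not asked to judge the alignment.
 * The explicit test below is the only thing protecting the caller.
 */
static struct hdr *hdr_at(void *base, size_t off)
{
	void *p = (char *)base + off;

	if ((uintptr_t)p % alignof(struct hdr))
		return NULL;

	return p;
}
```

Without the explicit test, an odd offset would yield a misaligned
struct hdr * with no diagnostic at build time.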
> > > > Of course the current solution is not elegant and it would be nice to
> > > > find another way to deal with it, but we couldn't come up with anything
> > > > better back then.
> > > >
> > > > The rest of the series looks good to me, but I'm afraid that without
> > > > this one and 5/7 the other changes will be a bit more complicated to
> > > > implement (if at all possible).
> > >
> > > Definitely. I have so ideas for approaches more robust to
> > > misalignment, but they're substantially more complicated. I was
> > > hoping we could avoid it at least for now.
> >
> > I had a closer look at that earlier message now. I believe at the
> > time I was aiming for fully robust handling of misaligned user
> > buffers. AIUI, we've given up on that for the time being: instead
> > we'll just *test* for suitable alignment and we can do the hard work
> > of handling it if it ever arises in practice.
>
> Right, and we can use the compiler to test for suitable alignment.
I do see allowing the compiler to check this in more cases as an
advantage of using explicitly typed pointers where we can.
Btw, I didn't find a use for it just yet, but I also have a draft
patch which adds a function+macro that extracts a typed pointer from a
given offset into an IO vector, verifying that it's contiguous and
properly aligned.
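
[Editorial illustration, not part of the original message.] The draft
patch itself is not shown in this thread, so the following is only a
guess at what such a helper could look like. `iov_typed_ptr_()` and
`IOV_TYPED_PTR()` are hypothetical names chosen for this sketch; it
returns NULL if the object would run past the end of the vector, span
two entries, or be misaligned for the requested type.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Find @len contiguous bytes at byte offset @off into an IO vector.
 * Returns a pointer aligned to @align, or NULL if the range is out of
 * bounds, crosses an entry boundary, or is not suitably aligned.
 */
static void *iov_typed_ptr_(const struct iovec *iov, size_t cnt,
			    size_t off, size_t len, size_t align)
{
	char *p;
	size_t i;

	assert(align && !(align & (align - 1)));	/* power of two */

	/* Skip entries that lie entirely before the offset */
	for (i = 0; i < cnt && off >= iov[i].iov_len; i++)
		off -= iov[i].iov_len;

	if (i == cnt)
		return NULL;		/* offset is past the end */

	if (len > iov[i].iov_len - off)
		return NULL;		/* not contiguous in one entry */

	p = (char *)iov[i].iov_base + off;
	if ((uintptr_t)p % align)
		return NULL;		/* misaligned for the type */

	return p;
}

/* Typed wrapper: size and alignment come from the requested type */
#define IOV_TYPED_PTR(iov, cnt, off, type)				\
	((type *)iov_typed_ptr_((iov), (cnt), (off),			\
				sizeof(type), alignof(type)))
```

A caller would write something like
`struct tcphdr *th = IOV_TYPED_PTR(iov, cnt, off, struct tcphdr);` and
treat NULL as "fall back or drop", which keeps the alignment check in
one place instead of relying on the compiler.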
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
* Re: [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]()
2024-10-29 9:26 ` David Gibson
@ 2024-10-29 10:32 ` Stefano Brivio
0 siblings, 0 replies; 14+ messages in thread
From: Stefano Brivio @ 2024-10-29 10:32 UTC (permalink / raw)
To: David Gibson; +Cc: Laurent Vivier, passt-dev
On Tue, 29 Oct 2024 20:26:25 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Tue, Oct 29, 2024 at 10:09:54AM +0100, Stefano Brivio wrote:
> > On Tue, 29 Oct 2024 15:07:56 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Tue, Oct 29, 2024 at 02:02:25PM +1100, David Gibson wrote:
> > > > On Mon, Oct 28, 2024 at 07:42:54PM +0100, Stefano Brivio wrote:
> > > > > On Mon, 28 Oct 2024 20:40:44 +1100
> > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > >
> > > > > > Currently these expects both the TCP header and payload in a single IOV,
> > > > > > and goes to some trouble to locate the checksum field within it. In the
> > > > > > current caller we've already know where the TCP header is, so we might as
> > > > > > well just pass it in. This will need to work a bit differently for
> > > > > > vhost-user, but that code already needs to locate the TCP header for other
> > > > > > reasons, so again we can just pass it in.
> > > > >
> > > > > We couldn't do this, and also what you're now doing in 5/7, because
> > > > > with vhost-user the TCP header is not aligned, so we can't pass it
> > > > > around as a pointer, see:
> > > > >
> > > > > <ZeUpxEY-sn64NLE5@zatzit>
> > > > > https://archives.passt.top/passt-dev/ZeUpxEY-sn64NLE5@zatzit/
> > > > >
> > > > > and following. That one is about IP headers, but the same applies to
> > > > > TCP and UDP headers.
> > > >
> > > > Hrm. I'm aware it theoretically need not be aligned, but I thought it
> > > > was in practice.. and that we were already relying on that.
> > > >
> > > > In fact, I'm pretty sure the second part is true, although more subtly
> > > > than here. v8 of the vhost-user patches calls tcp_fill_headers[46]()
> > > > with the bp parameter set to the offset of the TCP header. If
> > > > creating a tcphdr * there is a problem, then creating a tcp_payload_t
> > > > * can't be any better.
> >
> > Ah, okay, I missed that. Still, I think we should ask gcc for an
> > opinion (with the vhost-user series on top of this series), because
> > those build-time pointer alignment checks are pretty reliable.
>
> I'm not exactly sure what you're suggesting with this. I don't think
> the compiler will catch it in this case, because we're constructing
> the (possibly) misaligned pointer as a (void *), then implicitly
> casting it by passing it to a (tcp_payload_t *) argument. (void *) is
> explicitly allowed to be cast to any pointer type, so I think the
> compiler will take this as asserting we know what we're doing. More
> fool it.
Oh, hm, right. In the original case we were discussing in that thread
it was coming from an offset in a static struct, but if it's not the
case anymore, then we should check ourselves I guess (possibly with the
function + macro you mention below?).
> > > > > Of course the current solution is not elegant and it would be nice to
> > > > > find another way to deal with it, but we couldn't come up with anything
> > > > > better back then.
> > > > >
> > > > > The rest of the series looks good to me, but I'm afraid that without
> > > > > this one and 5/7 the other changes will be a bit more complicated to
> > > > > implement (if at all possible).
> > > >
> > > > Definitely. I have so ideas for approaches more robust to
> > > > misalignment, but they're substantially more complicated. I was
> > > > hoping we could avoid it at least for now.
> > >
> > > I had a closer look at that earlier message now. I believe at the
> > > time I was aiming for fully robust handling of misaligned user
> > > buffers. AIUI, we've given up on that for the time being: instead
> > > we'll just *test* for suitable alignment and we can do the hard work
> > > of handling it if it ever arises in practice.
> >
> > Right, and we can use the compiler to test for suitable alignment.
>
> I do see allowing the compiler to check this in more cases as an
> advantage of using explicitly typed pointers where we can.
>
> Btw, I didn't find a use for it just yet, but I also have a draft
> patch which adds a function+macro that extracts a typed pointer from a
> given offset into a IO vector, verifying that it's contiguous and
> properly aligned.
--
Stefano
Thread overview: 14+ messages
2024-10-28 9:40 [PATCH 0/7] Rework some IOV handling in TCP code David Gibson
2024-10-28 9:40 ` [PATCH 1/7] tcp: Pass TCP header and payload separately to tcp_update_check_tcp[46]() David Gibson
2024-10-28 18:42 ` Stefano Brivio
2024-10-29 3:02 ` David Gibson
2024-10-29 4:07 ` David Gibson
2024-10-29 9:09 ` Stefano Brivio
2024-10-29 9:26 ` David Gibson
2024-10-29 10:32 ` Stefano Brivio
2024-10-28 9:40 ` [PATCH 2/7] tcp: Move tcp_l2_buf_fill_headers() to tcp_buf.c David Gibson
2024-10-28 9:40 ` [PATCH 3/7] tcp: Rework tcp_l2_buf_fill_headers() into tcp_buf_make_frame() David Gibson
2024-10-28 9:40 ` [PATCH 4/7] tcp: Don't use return value from tcp_fill_headers[46] to adjust iov_len David Gibson
2024-10-28 9:40 ` [PATCH 5/7] tcp: Pass TCP header and payload separately to tcp_fill_headers[46]() David Gibson
2024-10-28 9:40 ` [PATCH 6/7] tcp: Merge tcp_update_check_tcp[46]() David Gibson
2024-10-28 9:40 ` [PATCH 7/7] tcp: Fold tcp_update_csum() into tcp_fill_header() David Gibson
Code repositories for project(s) associated with this public inbox
https://passt.top/passt