* [PATCH v2 01/14] Add csum_icmp6() helper for calculating ICMPv6 checksums
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 02/14] Add csum_icmp4() helper for calculating ICMP checksums David Gibson
` (14 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
At least two places in passt calculate ICMPv6 checksums, ndp() and
tap_ip_send(). Add a helper to handle this calculation in both places.
For future flexibility, the new helper takes parameters for the fields in
the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to
be explicitly constructed. It also allows the ICMPv6 header and payload to
be in separate buffers, although we don't use this yet.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 25 +++++++++++++++++++++++++
checksum.h | 5 +++++
ndp.c | 5 +----
tap.c | 6 ++----
4 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/checksum.c b/checksum.c
index 56ad01e..78c6960 100644
--- a/checksum.c
+++ b/checksum.c
@@ -52,6 +52,8 @@
#include <stddef.h>
#include <stdint.h>
+#include <linux/icmpv6.h>
+
/**
* sum_16b() - Calculate sum of 16-bit words
* @buf: Input buffer
@@ -105,6 +107,29 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
}
+/**
+ * csum_icmp6() - Calculate and set checksum for an ICMPv6 packet
+ * @icmp6hr: ICMPv6 header, initialised apart from checksum
+ * @saddr: IPv6 source address
+ * @daddr: IPv6 destination address
+ * @payload: ICMP packet payload
+ * @len: Length of @payload (not including ICMPv6 header)
+ */
+void csum_icmp6(struct icmp6hdr *icmp6hr,
+ const struct in6_addr *saddr, const struct in6_addr *daddr,
+ const void *payload, size_t len)
+{
+ /* Partial checksum for the pseudo-IPv6 header */
+ uint32_t psum = sum_16b(saddr, sizeof(*saddr)) +
+ sum_16b(daddr, sizeof(*daddr)) +
+ htons(len + sizeof(*icmp6hr)) + htons(IPPROTO_ICMPV6);
+
+ icmp6hr->icmp6_cksum = 0;
+ /* Add in partial checksum for the ICMPv6 header alone */
+ psum += sum_16b(icmp6hr, sizeof(*icmp6hr));
+ icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum);
+}
+
/**
* csum_tcp4() - Calculate TCP checksum for IPv4 and set in place
* @iph: Packet buffer, IP header
diff --git a/checksum.h b/checksum.h
index 5418406..d7daabf 100644
--- a/checksum.h
+++ b/checksum.h
@@ -6,9 +6,14 @@
#ifndef CHECKSUM_H
#define CHECKSUM_H
+struct icmp6hdr;
+
uint32_t sum_16b(const void *buf, size_t len);
uint16_t csum_fold(uint32_t sum);
uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
+void csum_icmp6(struct icmp6hdr *icmp6hr,
+ const struct in6_addr *saddr, const struct in6_addr *daddr,
+ const void *payload, size_t len);
void csum_tcp4(struct iphdr *iph);
uint16_t csum(const void *buf, size_t len, uint32_t init);
diff --git a/ndp.c b/ndp.c
index dec36a9..03f1d06 100644
--- a/ndp.c
+++ b/ndp.c
@@ -189,10 +189,7 @@ dns_done:
ip6hr->saddr = c->ip6.addr_ll;
ip6hr->payload_len = htons(sizeof(*ihr) + len);
- ip6hr->hop_limit = IPPROTO_ICMPV6;
- ihr->icmp6_cksum = 0;
- ihr->icmp6_cksum = csum_unaligned(ip6hr, sizeof(*ip6hr) +
- sizeof(*ihr) + len, 0);
+ csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len);
ip6hr->version = 6;
ip6hr->nexthdr = IPPROTO_ICMPV6;
diff --git a/tap.c b/tap.c
index 8b6d9bc..aafc92b 100644
--- a/tap.c
+++ b/tap.c
@@ -191,10 +191,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (proto == IPPROTO_ICMPV6) {
struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
- ih->icmp6_cksum = 0;
- ih->icmp6_cksum = csum_unaligned(ip6h,
- len + sizeof(*ip6h),
- 0);
+ csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr,
+ ih + 1, len - sizeof(*ih));
}
ip6h->version = 6;
ip6h->nexthdr = proto;
--
@@ -191,10 +191,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (proto == IPPROTO_ICMPV6) {
struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
- ih->icmp6_cksum = 0;
- ih->icmp6_cksum = csum_unaligned(ip6h,
- len + sizeof(*ip6h),
- 0);
+ csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr,
+ ih + 1, len - sizeof(*ih));
}
ip6h->version = 6;
ip6h->nexthdr = proto;
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 02/14] Add csum_icmp4() helper for calculating ICMP checksums
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
2022-10-19 0:43 ` [PATCH v2 01/14] Add csum_icmp6() helper for calculating ICMPv6 checksums David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 03/14] Add csum_udp6() helper for calculating UDP over IPv6 checksums David Gibson
` (13 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
Although tap_ip_send() is currently the only place calculating ICMP
checksums, create a helper function for symmetry with ICMPv6. For
future flexibility it allows the ICMPv6 header and payload to be in
separate buffers.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 16 ++++++++++++++++
checksum.h | 2 ++
tap.c | 4 +---
3 files changed, 19 insertions(+), 3 deletions(-)
diff --git a/checksum.c b/checksum.c
index 78c6960..f35c948 100644
--- a/checksum.c
+++ b/checksum.c
@@ -52,6 +52,7 @@
#include <stddef.h>
#include <stdint.h>
+#include <linux/icmp.h>
#include <linux/icmpv6.h>
/**
@@ -107,6 +108,21 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
}
+/**
+ * csum_icmp4() - Calculate and set checksum for an ICMP packet
+ * @icmp4hr: ICMP header, initialised apart from checksum
+ * @payload: ICMP packet payload
+ * @len: Length of @payload (not including ICMP header)
+ */
+void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
+{
+ /* Partial checksum for ICMP header alone */
+ uint32_t psum = sum_16b(icmp4hr, sizeof(*icmp4hr));
+
+ icmp4hr->checksum = 0;
+ icmp4hr->checksum = csum_unaligned(payload, len, psum);
+}
+
/**
* csum_icmp6() - Calculate and set checksum for an ICMPv6 packet
* @icmp6hr: ICMPv6 header, initialised apart from checksum
diff --git a/checksum.h b/checksum.h
index d7daabf..bf0620f 100644
--- a/checksum.h
+++ b/checksum.h
@@ -6,11 +6,13 @@
#ifndef CHECKSUM_H
#define CHECKSUM_H
+struct icmphdr;
struct icmp6hdr;
uint32_t sum_16b(const void *buf, size_t len);
uint16_t csum_fold(uint32_t sum);
uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
+void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len);
void csum_icmp6(struct icmp6hdr *icmp6hr,
const struct in6_addr *saddr, const struct in6_addr *daddr,
const void *payload, size_t len);
diff --git a/tap.c b/tap.c
index aafc92b..f082901 100644
--- a/tap.c
+++ b/tap.c
@@ -148,9 +148,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
uh->check = 0;
} else if (iph->protocol == IPPROTO_ICMP) {
struct icmphdr *ih = (struct icmphdr *)(iph + 1);
-
- ih->checksum = 0;
- ih->checksum = csum_unaligned(ih, len, 0);
+ csum_icmp4(ih, ih + 1, len - sizeof(*ih));
}
if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0)
--
@@ -148,9 +148,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
uh->check = 0;
} else if (iph->protocol == IPPROTO_ICMP) {
struct icmphdr *ih = (struct icmphdr *)(iph + 1);
-
- ih->checksum = 0;
- ih->checksum = csum_unaligned(ih, len, 0);
+ csum_icmp4(ih, ih + 1, len - sizeof(*ih));
}
if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0)
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 03/14] Add csum_udp6() helper for calculating UDP over IPv6 checksums
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
2022-10-19 0:43 ` [PATCH v2 01/14] Add csum_icmp6() helper for calculating ICMPv6 checksums David Gibson
2022-10-19 0:43 ` [PATCH v2 02/14] Add csum_icmp4() helper for calculating ICMP checksums David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 04/14] Add csum_udp4() helper for calculating UDP over IPv4 checksums David Gibson
` (12 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
Add a helper for calculating UDP checksums when used over IPv6
For future flexibility, the new helper takes parameters for the fields in
the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to
be explicitly constructed. It also allows the UDP header and payload to
be in separate buffers, although we don't use this yet.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 22 ++++++++++++++++++++++
checksum.h | 4 ++++
tap.c | 5 ++---
3 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/checksum.c b/checksum.c
index f35c948..175381d 100644
--- a/checksum.c
+++ b/checksum.c
@@ -52,6 +52,7 @@
#include <stddef.h>
#include <stdint.h>
+#include <linux/udp.h>
#include <linux/icmp.h>
#include <linux/icmpv6.h>
@@ -123,6 +124,27 @@ void csum_icmp4(struct icmphdr *icmp4hr, const void *payload, size_t len)
icmp4hr->checksum = csum_unaligned(payload, len, psum);
}
+/**
+ * csum_udp6() - Calculate and set checksum for a UDP over IPv6 packet
+ * @udp6hr: UDP header, initialised apart from checksum
+ * @payload: UDP packet payload
+ * @len: Length of @payload (not including UDP header)
+ */
+void csum_udp6(struct udphdr *udp6hr,
+ const struct in6_addr *saddr, const struct in6_addr *daddr,
+ const void *payload, size_t len)
+{
+ /* Partial checksum for the pseudo-IPv6 header */
+ uint32_t psum = sum_16b(saddr, sizeof(*saddr)) +
+ sum_16b(daddr, sizeof(*daddr)) +
+ htons(len + sizeof(*udp6hr)) + htons(IPPROTO_UDP);
+
+ udp6hr->check = 0;
+ /* Add in partial checksum for the UDP header alone */
+ psum += sum_16b(udp6hr, sizeof(*udp6hr));
+ udp6hr->check = csum_unaligned(payload, len, psum);
+}
+
/**
* csum_icmp6() - Calculate and set checksum for an ICMPv6 packet
* @icmp6hr: ICMPv6 header, initialised apart from checksum
diff --git a/checksum.h b/checksum.h
index bf0620f..2bb2ff9 100644
--- a/checksum.h
+++ b/checksum.h
@@ -6,6 +6,7 @@
#ifndef CHECKSUM_H
#define CHECKSUM_H
+struct udphdr;
struct icmphdr;
struct icmp6hdr;
@@ -13,6 +14,9 @@ uint32_t sum_16b(const void *buf, size_t len);
uint16_t csum_fold(uint32_t sum);
uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len);
+void csum_udp6(struct udphdr *udp6hr,
+ const struct in6_addr *saddr, const struct in6_addr *daddr,
+ const void *payload, size_t len);
void csum_icmp6(struct icmp6hdr *icmp6hr,
const struct in6_addr *saddr, const struct in6_addr *daddr,
const void *payload, size_t len);
diff --git a/tap.c b/tap.c
index f082901..9c197cb 100644
--- a/tap.c
+++ b/tap.c
@@ -183,9 +183,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (proto == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(ip6h + 1);
- uh->check = 0;
- uh->check = csum_unaligned(ip6h, len + sizeof(*ip6h),
- 0);
+ csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
+ uh + 1, len - sizeof(*uh));
} else if (proto == IPPROTO_ICMPV6) {
struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
--
@@ -183,9 +183,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (proto == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(ip6h + 1);
- uh->check = 0;
- uh->check = csum_unaligned(ip6h, len + sizeof(*ip6h),
- 0);
+ csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
+ uh + 1, len - sizeof(*uh));
} else if (proto == IPPROTO_ICMPV6) {
struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 04/14] Add csum_udp4() helper for calculating UDP over IPv4 checksums
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (2 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 03/14] Add csum_udp6() helper for calculating UDP over IPv6 checksums David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 05/14] Add csum_ip4_header() helper to calculate IPv4 header checksums David Gibson
` (11 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
At least two places in passt fill in UDP over IPv4 checksums, although
since UDP checksums are optional with IPv4 that just amounts to storing
a 0 (in tap_ip_send()) or leaving a 0 from an earlier initialization (in
dhcp()). For consistency, add a helper for this "calculation".
Just for the heck of it, add the option (compile time disabled for now) to
calculate real UDP checksums.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 33 +++++++++++++++++++++++++++++++++
checksum.h | 2 ++
dhcp.c | 2 +-
tap.c | 2 +-
4 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/checksum.c b/checksum.c
index 175381d..cf6fc31 100644
--- a/checksum.c
+++ b/checksum.c
@@ -56,6 +56,12 @@
#include <linux/icmp.h>
#include <linux/icmpv6.h>
+/* Checksums are optional for UDP over IPv4, so we usually just set
+ * them to 0. Change this to 1 to calculate real UDP over IPv4
+ * checksums
+ */
+#define UDP4_REAL_CHECKSUMS 0
+
/**
* sum_16b() - Calculate sum of 16-bit words
* @buf: Input buffer
@@ -109,6 +115,33 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
}
+/**
+ * csum_udp4() - Calculate and set checksum for a UDP over IPv4 packet
+ * @udp4hr: UDP header, initialised apart from checksum
+ * @saddr: IPv4 source address
+ * @daddr: IPv4 destination address
+ * @payload: ICMPv4 packet payload
+ * @len: Length of @payload (not including UDP)
+ */
+void csum_udp4(struct udphdr *udp4hr, in_addr_t saddr, in_addr_t daddr,
+ const void *payload, size_t len)
+{
+ /* UDP checksums are optional, so don't bother */
+ udp4hr->check = 0;
+
+ if (UDP4_REAL_CHECKSUMS) {
+ /* UNTESTED: if we did want real UDPv4 checksums, this
+ * is roughly what we'd need */
+ uint32_t psum = csum_fold(htonl(saddr))
+ + csum_fold(htonl(daddr))
+ + htons(len + sizeof(*udp4hr))
+ + htons(IPPROTO_UDP);
+ /* Add in partial checksum for the UDP header alone */
+ psum += sum_16b(udp4hr, sizeof(*udp4hr));
+ udp4hr->check = csum_unaligned(payload, len, psum);
+ }
+}
+
/**
* csum_icmp4() - Calculate and set checksum for an ICMP packet
* @icmp4hr: ICMP header, initialised apart from checksum
diff --git a/checksum.h b/checksum.h
index 2bb2ff9..2a5e915 100644
--- a/checksum.h
+++ b/checksum.h
@@ -13,6 +13,8 @@ struct icmp6hdr;
uint32_t sum_16b(const void *buf, size_t len);
uint16_t csum_fold(uint32_t sum);
uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
+void csum_udp4(struct udphdr *udp4hr, in_addr_t saddr, in_addr_t daddr,
+ const void *payload, size_t len);
void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len);
void csum_udp6(struct udphdr *udp6hr,
const struct in6_addr *saddr, const struct in6_addr *daddr,
diff --git a/dhcp.c b/dhcp.c
index 7f0cc0b..8dcf645 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -364,9 +364,9 @@ int dhcp(const struct ctx *c, const struct pool *p)
opt_set_dns_search(c, sizeof(m->o));
uh->len = htons(len = offsetof(struct msg, o) + fill(m) + sizeof(*uh));
- uh->check = 0;
uh->source = htons(67);
uh->dest = htons(68);
+ csum_udp4(uh, c->ip4.gw, c->ip4.addr, uh + 1, len - sizeof(*uh));
iph->tot_len = htons(len += sizeof(*iph));
iph->daddr = c->ip4.addr;
diff --git a/tap.c b/tap.c
index 9c197cb..58fc1de 100644
--- a/tap.c
+++ b/tap.c
@@ -145,7 +145,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (iph->protocol == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(iph + 1);
- uh->check = 0;
+ csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh));
} else if (iph->protocol == IPPROTO_ICMP) {
struct icmphdr *ih = (struct icmphdr *)(iph + 1);
csum_icmp4(ih, ih + 1, len - sizeof(*ih));
--
@@ -145,7 +145,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
} else if (iph->protocol == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(iph + 1);
- uh->check = 0;
+ csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh));
} else if (iph->protocol == IPPROTO_ICMP) {
struct icmphdr *ih = (struct icmphdr *)(iph + 1);
csum_icmp4(ih, ih + 1, len - sizeof(*ih));
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 05/14] Add csum_ip4_header() helper to calculate IPv4 header checksums
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (3 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 04/14] Add csum_udp4() helper for calculating UDP over IPv4 checksums David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 06/14] Add helpers for normal inbound packet destination addresses David Gibson
` (10 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
We calculate IPv4 header checksums in at least two places, in dhcp() and
in tap_ip_send. Add a helper to handle this calculation in both places.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 10 ++++++++++
checksum.h | 1 +
dhcp.c | 3 +--
tap.c | 3 +--
4 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/checksum.c b/checksum.c
index cf6fc31..7b83196 100644
--- a/checksum.c
+++ b/checksum.c
@@ -115,6 +115,16 @@ uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init)
return (uint16_t)~csum_fold(sum_16b(buf, len) + init);
}
+/**
+ * csum_ip4_header() - Calculate and set IPv4 header checksum
+ * @ip4h: IPv4 header
+ */
+void csum_ip4_header(struct iphdr *ip4h)
+{
+ ip4h->check = 0;
+ ip4h->check = csum_unaligned(ip4h, (size_t)ip4h->ihl * 4, 0);
+}
+
/**
* csum_udp4() - Calculate and set checksum for a UDP over IPv4 packet
* @udp4hr: UDP header, initialised apart from checksum
diff --git a/checksum.h b/checksum.h
index 2a5e915..91e9954 100644
--- a/checksum.h
+++ b/checksum.h
@@ -13,6 +13,7 @@ struct icmp6hdr;
uint32_t sum_16b(const void *buf, size_t len);
uint16_t csum_fold(uint32_t sum);
uint16_t csum_unaligned(const void *buf, size_t len, uint32_t init);
+void csum_ip4_header(struct iphdr *ip4h);
void csum_udp4(struct udphdr *udp4hr, in_addr_t saddr, in_addr_t daddr,
const void *payload, size_t len);
void csum_icmp4(struct icmphdr *ih, const void *payload, size_t len);
diff --git a/dhcp.c b/dhcp.c
index 8dcf645..875e18b 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -371,8 +371,7 @@ int dhcp(const struct ctx *c, const struct pool *p)
iph->tot_len = htons(len += sizeof(*iph));
iph->daddr = c->ip4.addr;
iph->saddr = c->ip4.gw;
- iph->check = 0;
- iph->check = csum_unaligned(iph, (intptr_t)(iph->ihl * 4), 0);
+ csum_ip4_header(iph);
len += sizeof(*eh);
memcpy(eh->h_dest, eh->h_source, ETH_ALEN);
diff --git a/tap.c b/tap.c
index 58fc1de..de02c56 100644
--- a/tap.c
+++ b/tap.c
@@ -135,8 +135,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
iph->daddr = c->ip4.addr_seen;
memcpy(&iph->saddr, &src->s6_addr[12], 4);
- iph->check = 0;
- iph->check = csum_unaligned(iph, (size_t)iph->ihl * 4, 0);
+ csum_ip4_header(iph);
memcpy(data, in, len);
--
@@ -135,8 +135,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
iph->daddr = c->ip4.addr_seen;
memcpy(&iph->saddr, &src->s6_addr[12], 4);
- iph->check = 0;
- iph->check = csum_unaligned(iph, (size_t)iph->ihl * 4, 0);
+ csum_ip4_header(iph);
memcpy(data, in, len);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 06/14] Add helpers for normal inbound packet destination addresses
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (4 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 05/14] Add csum_ip4_header() helper to calculate IPv4 header checksums David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 07/14] Remove support for TCP packets from tap_ip_send() David Gibson
` (9 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
tap_ip_send() doesn't take a destination address, because it's specifically
for inbound packets, and the IP addresses of the guest/namespace are
already known to us. Rather than open-coding this destination address
logic, make helper functions for it which will enable some later cleanups.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
tap.c | 33 ++++++++++++++++++++++++++++-----
tap.h | 3 +++
2 files changed, 31 insertions(+), 5 deletions(-)
diff --git a/tap.c b/tap.c
index de02c56..89be383 100644
--- a/tap.c
+++ b/tap.c
@@ -96,6 +96,32 @@ int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre)
return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len);
}
+/**
+ * tap_ip4_daddr() - Normal IPv4 destination address for inbound packets
+ * @c: Execution context
+ *
+ * Returns: IPv4 address, network order
+ */
+in_addr_t tap_ip4_daddr(const struct ctx *c)
+{
+ return c->ip4.addr_seen;
+}
+
+/**
+ * tap_ip6_daddr() - Normal IPv4 destination address for inbound packets
+ * @c: Execution context
+ * @src: Source address
+ *
+ * Returns: pointer to IPv6 address
+ */
+const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
+ const struct in6_addr *src)
+{
+ if (IN6_IS_ADDR_LINKLOCAL(src))
+ return &c->ip6.addr_ll_seen;
+ return &c->ip6.addr_seen;
+}
+
/**
* tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums
* @c: Execution context
@@ -132,7 +158,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
iph->frag_off = 0;
iph->ttl = 255;
iph->protocol = proto;
- iph->daddr = c->ip4.addr_seen;
+ iph->daddr = tap_ip4_daddr(c);
memcpy(&iph->saddr, &src->s6_addr[12], 4);
csum_ip4_header(iph);
@@ -163,10 +189,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
ip6h->priority = 0;
ip6h->saddr = *src;
- if (IN6_IS_ADDR_LINKLOCAL(src))
- ip6h->daddr = c->ip6.addr_ll_seen;
- else
- ip6h->daddr = c->ip6.addr_seen;
+ ip6h->daddr = *tap_ip6_daddr(c, src);
memcpy(data, in, len);
diff --git a/tap.h b/tap.h
index df3aec0..a6764b4 100644
--- a/tap.h
+++ b/tap.h
@@ -6,6 +6,9 @@
#ifndef TAP_H
#define TAP_H
+in_addr_t tap_ip4_daddr(const struct ctx *c);
+const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
+ const struct in6_addr *src);
void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
const char *in, size_t len, uint32_t flow);
int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);
--
@@ -6,6 +6,9 @@
#ifndef TAP_H
#define TAP_H
+in_addr_t tap_ip4_daddr(const struct ctx *c);
+const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
+ const struct in6_addr *src);
void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
const char *in, size_t len, uint32_t flow);
int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 07/14] Remove support for TCP packets from tap_ip_send()
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (5 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 06/14] Add helpers for normal inbound packet destination addresses David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 08/14] tap: Remove unhelpeful vnet_pre optimization from tap_send() David Gibson
` (8 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
tap_ip_send() is never used for TCP packets, we're unlikely to use it for
that in future, and the handling of TCP packets makes other cleanups
unnecessarily awkward. Remove it.
This is the only user of csum_tcp4(), so we can remove that as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
checksum.c | 34 ----------------------------------
checksum.h | 1 -
tap.c | 11 ++---------
3 files changed, 2 insertions(+), 44 deletions(-)
diff --git a/checksum.c b/checksum.c
index 7b83196..09d2c7c 100644
--- a/checksum.c
+++ b/checksum.c
@@ -211,40 +211,6 @@ void csum_icmp6(struct icmp6hdr *icmp6hr,
icmp6hr->icmp6_cksum = csum_unaligned(payload, len, psum);
}
-/**
- * csum_tcp4() - Calculate TCP checksum for IPv4 and set in place
- * @iph: Packet buffer, IP header
- */
-void csum_tcp4(struct iphdr *iph)
-{
- uint16_t tlen = ntohs(iph->tot_len) - iph->ihl * 4, *p;
- struct tcphdr *th;
- uint32_t sum = 0;
-
- th = (struct tcphdr *)((char *)iph + (intptr_t)(iph->ihl * 4));
- p = (uint16_t *)th;
-
- sum += (iph->saddr >> 16) & 0xffff;
- sum += iph->saddr & 0xffff;
- sum += (iph->daddr >> 16) & 0xffff;
- sum += iph->daddr & 0xffff;
-
- sum += htons(IPPROTO_TCP);
- sum += htons(tlen);
-
- th->check = 0;
- while (tlen > 1) {
- sum += *p++;
- tlen -= 2;
- }
-
- if (tlen > 0) {
- sum += *p & htons(0xff00);
- }
-
- th->check = (uint16_t)~csum_fold(sum);
-}
-
#ifdef __AVX2__
#include <immintrin.h>
diff --git a/checksum.h b/checksum.h
index 91e9954..b87b0d6 100644
--- a/checksum.h
+++ b/checksum.h
@@ -23,7 +23,6 @@ void csum_udp6(struct udphdr *udp6hr,
void csum_icmp6(struct icmp6hdr *icmp6hr,
const struct in6_addr *saddr, const struct in6_addr *daddr,
const void *payload, size_t len);
-void csum_tcp4(struct iphdr *iph);
uint16_t csum(const void *buf, size_t len, uint32_t init);
#endif /* CHECKSUM_H */
diff --git a/tap.c b/tap.c
index 89be383..844ee43 100644
--- a/tap.c
+++ b/tap.c
@@ -165,9 +165,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
memcpy(data, in, len);
- if (iph->protocol == IPPROTO_TCP) {
- csum_tcp4(iph);
- } else if (iph->protocol == IPPROTO_UDP) {
+ if (iph->protocol == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(iph + 1);
csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh));
@@ -196,13 +194,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
ip6h->hop_limit = proto;
ip6h->version = 0;
ip6h->nexthdr = 0;
- if (proto == IPPROTO_TCP) {
- struct tcphdr *th = (struct tcphdr *)(ip6h + 1);
- th->check = 0;
- th->check = csum_unaligned(ip6h, len + sizeof(*ip6h),
- 0);
- } else if (proto == IPPROTO_UDP) {
+ if (proto == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(ip6h + 1);
csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
--
@@ -165,9 +165,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
memcpy(data, in, len);
- if (iph->protocol == IPPROTO_TCP) {
- csum_tcp4(iph);
- } else if (iph->protocol == IPPROTO_UDP) {
+ if (iph->protocol == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(iph + 1);
csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh));
@@ -196,13 +194,8 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
ip6h->hop_limit = proto;
ip6h->version = 0;
ip6h->nexthdr = 0;
- if (proto == IPPROTO_TCP) {
- struct tcphdr *th = (struct tcphdr *)(ip6h + 1);
- th->check = 0;
- th->check = csum_unaligned(ip6h, len + sizeof(*ip6h),
- 0);
- } else if (proto == IPPROTO_UDP) {
+ if (proto == IPPROTO_UDP) {
struct udphdr *uh = (struct udphdr *)(ip6h + 1);
csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 08/14] tap: Remove unhelpeful vnet_pre optimization from tap_send()
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (6 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 07/14] Remove support for TCP packets from tap_ip_send() David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 09/14] Split tap_ip_send() into IPv4 and IPv6 specific functions David Gibson
` (7 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
Callers of tap_send() can optionally use a small optimization by adding
extra space for the 4 byte length header used on the qemu socket interface.
tap_ip_send() is currently the only user of this, but this is used only
for "slow path" ICMP and DHCP packets, so there's not a lot of value to
the optimization.
Worse, having the two paths here complicates the interface and makes future
cleanups difficult, so just remove it. I have some plans to bring back the
optimization in a more general way in future, but for now it's just in the
way.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
arp.c | 2 +-
dhcp.c | 2 +-
ndp.c | 2 +-
tap.c | 29 +++++++++--------------------
tap.h | 2 +-
5 files changed, 13 insertions(+), 24 deletions(-)
diff --git a/arp.c b/arp.c
index 0ad97af..141d43f 100644
--- a/arp.c
+++ b/arp.c
@@ -81,7 +81,7 @@ int arp(const struct ctx *c, const struct pool *p)
memcpy(eh->h_dest, eh->h_source, sizeof(eh->h_dest));
memcpy(eh->h_source, c->mac, sizeof(eh->h_source));
- if (tap_send(c, eh, len, 0) < 0)
+ if (tap_send(c, eh, len) < 0)
perror("ARP: send");
return 1;
diff --git a/dhcp.c b/dhcp.c
index 875e18b..2b3af82 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -377,7 +377,7 @@ int dhcp(const struct ctx *c, const struct pool *p)
memcpy(eh->h_dest, eh->h_source, ETH_ALEN);
memcpy(eh->h_source, c->mac, ETH_ALEN);
- if (tap_send(c, eh, len, 0) < 0)
+ if (tap_send(c, eh, len) < 0)
perror("DHCP: send");
return 1;
diff --git a/ndp.c b/ndp.c
index 03f1d06..79be0cf 100644
--- a/ndp.c
+++ b/ndp.c
@@ -200,7 +200,7 @@ dns_done:
memcpy(ehr->h_source, c->mac, ETH_ALEN);
ehr->h_proto = htons(ETH_P_IPV6);
- if (tap_send(c, ehr, len, 0) < 0)
+ if (tap_send(c, ehr, len) < 0)
perror("NDP: send");
return 1;
diff --git a/tap.c b/tap.c
index 844ee43..07592dd 100644
--- a/tap.c
+++ b/tap.c
@@ -66,34 +66,24 @@ static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
* @c: Execution context
* @data: Packet buffer
* @len: Total L2 packet length
- * @vnet_pre: Buffer has four-byte headroom
*
* Return: return code from send() or write()
*/
-int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre)
+int tap_send(const struct ctx *c, const void *data, size_t len)
{
- if (vnet_pre)
- pcap((char *)data + 4, len);
- else
- pcap(data, len);
+ pcap(data, len);
if (c->mode == MODE_PASST) {
int flags = MSG_NOSIGNAL | MSG_DONTWAIT;
+ uint32_t vnet_len = htonl(len);
- if (vnet_pre) {
- *((uint32_t *)data) = htonl(len);
- len += 4;
- } else {
- uint32_t vnet_len = htonl(len);
-
- if (send(c->fd_tap, &vnet_len, 4, flags) < 0)
- return -1;
- }
+ if (send(c->fd_tap, &vnet_len, 4, flags) < 0)
+ return -1;
return send(c->fd_tap, data, len, flags);
}
- return write(c->fd_tap, (char *)data + (vnet_pre ? 4 : 0), len);
+ return write(c->fd_tap, (char *)data, len);
}
/**
@@ -135,10 +125,9 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
const char *in, size_t len, uint32_t flow)
{
char buf[USHRT_MAX];
- char *pkt = buf + 4;
struct ethhdr *eh;
- eh = (struct ethhdr *)pkt;
+ eh = (struct ethhdr *)buf;
/* TODO: ARP table lookup */
memcpy(eh->h_dest, c->mac_guest, ETH_ALEN);
@@ -174,7 +163,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
csum_icmp4(ih, ih + 1, len - sizeof(*ih));
}
- if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh), 1) < 0)
+ if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh)) < 0)
debug("tap: failed to send %lu bytes (IPv4)", len);
} else {
struct ipv6hdr *ip6h = (struct ipv6hdr *)(eh + 1);
@@ -215,7 +204,7 @@ void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
ip6h->flow_lbl[2] = (flow >> 0) & 0xff;
}
- if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh), 1) < 1)
+ if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh)) < 1)
debug("tap: failed to send %lu bytes (IPv6)", len);
}
}
diff --git a/tap.h b/tap.h
index a6764b4..a8da8bb 100644
--- a/tap.h
+++ b/tap.h
@@ -11,7 +11,7 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
const char *in, size_t len, uint32_t flow);
-int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);
+int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
void tap_sock_init(struct ctx *c);
--
@@ -11,7 +11,7 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
const char *in, size_t len, uint32_t flow);
-int tap_send(const struct ctx *c, const void *data, size_t len, int vnet_pre);
+int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
void tap_sock_init(struct ctx *c);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 09/14] Split tap_ip_send() into IPv4 and IPv6 specific functions
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (7 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 08/14] tap: Remove unhelpeful vnet_pre optimization from tap_send() David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 10/14] tap: Split tap_ip6_send() into UDP and ICMP variants David Gibson
` (6 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
The IPv4 and IPv6 paths in tap_ip_send() have very little in common, and
it turns out that every caller (statically) knows if it is using IPv4 or
IPv6. So split into separate tap_ip4_send() and tap_ip6_send() functions.
Use a new tap_l2_hdr() function for the very small common part.
While we're there, make some minor cleanups:
- We were double writing some fields in the IPv6 header, so that it
temporary matched the pseudo-header for checksum calculation. With
recent checksum reworks, this isn't neccessary any more.
- We don't use any IPv4 header options, so use some sizeof() constructs
instead of some open coded values for header length.
- The comment used to say that the flow label was for TCP over IPv6, but
in fact the only thing we used it for was DHCPv6 over UDP traffic
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
dhcpv6.c | 6 +-
icmp.c | 10 +---
tap.c | 177 +++++++++++++++++++++++++++++--------------------------
tap.h | 6 +-
4 files changed, 103 insertions(+), 96 deletions(-)
diff --git a/dhcpv6.c b/dhcpv6.c
index e7640ce..7829968 100644
--- a/dhcpv6.c
+++ b/dhcpv6.c
@@ -531,8 +531,8 @@ int dhcpv6(struct ctx *c, const struct pool *p,
resp_not_on_link.hdr.xid = mh->xid;
- tap_ip_send(c, src, IPPROTO_UDP,
- (char *)&resp_not_on_link, n, mh->xid);
+ tap_ip6_send(c, src, IPPROTO_UDP,
+ (char *)&resp_not_on_link, n, mh->xid);
return 1;
}
@@ -580,7 +580,7 @@ int dhcpv6(struct ctx *c, const struct pool *p,
resp.hdr.xid = mh->xid;
- tap_ip_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid);
+ tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid);
c->ip6.addr_seen = c->ip6.addr;
return 1;
diff --git a/icmp.c b/icmp.c
index 21ea2d7..61c2d90 100644
--- a/icmp.c
+++ b/icmp.c
@@ -69,10 +69,6 @@ static uint8_t icmp_act[IP_VERSIONS][DIV_ROUND_UP(ICMP_NUM_IDS, 8)];
void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
uint32_t events, const struct timespec *now)
{
- struct in6_addr a6 = { .s6_addr = { 0, 0, 0, 0,
- 0, 0, 0, 0,
- 0, 0, 0xff, 0xff,
- 0, 0, 0, 0 } };
union icmp_epoll_ref *iref = &ref.r.p.icmp;
struct sockaddr_storage sr;
socklen_t sl = sizeof(sr);
@@ -109,7 +105,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
icmp_id_map[V6][id].seq = seq;
}
- tap_ip_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0);
+ tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0);
} else {
struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr;
struct icmphdr *ih = (struct icmphdr *)buf;
@@ -127,9 +123,7 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
icmp_id_map[V4][id].seq = seq;
}
- memcpy(&a6.s6_addr[12], &sr4->sin_addr, sizeof(sr4->sin_addr));
-
- tap_ip_send(c, &a6, IPPROTO_ICMP, buf, n, 0);
+ tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n);
}
}
diff --git a/tap.c b/tap.c
index 07592dd..0e8c99b 100644
--- a/tap.c
+++ b/tap.c
@@ -113,100 +113,111 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
}
/**
- * tap_ip_send() - Send IP packet, with L2 headers, calculating L3/L4 checksums
+ * tap_push_l2h() - Build an L2 header for an inbound packet
* @c: Execution context
- * @src: IPv6 source address, IPv4-mapped for IPv4 sources
- * @proto: L4 protocol number
- * @in: Payload
- * @len: L4 payload length
- * @flow: Flow label for TCP over IPv6
+ * @buf: Buffer address at which to generate header
+ * @proto: Ethernet protocol number for L3
+ *
+ * Return: pointer at which to write the packet's payload
*/
-void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
- const char *in, size_t len, uint32_t flow)
+static void *tap_push_l2h(const struct ctx *c, void *buf, uint16_t proto)
{
- char buf[USHRT_MAX];
- struct ethhdr *eh;
-
- eh = (struct ethhdr *)buf;
+ struct ethhdr *eh = (struct ethhdr *)buf;
/* TODO: ARP table lookup */
memcpy(eh->h_dest, c->mac_guest, ETH_ALEN);
memcpy(eh->h_source, c->mac, ETH_ALEN);
+ eh->h_proto = ntohs(proto);
+ return eh + 1;
+}
- if (IN6_IS_ADDR_V4MAPPED(src)) {
- struct iphdr *iph = (struct iphdr *)(eh + 1);
- char *data = (char *)(iph + 1);
-
- eh->h_proto = ntohs(ETH_P_IP);
-
- iph->version = 4;
- iph->ihl = 5;
- iph->tos = 0;
- iph->tot_len = htons(len + 20);
- iph->id = 0;
- iph->frag_off = 0;
- iph->ttl = 255;
- iph->protocol = proto;
- iph->daddr = tap_ip4_daddr(c);
- memcpy(&iph->saddr, &src->s6_addr[12], 4);
-
- csum_ip4_header(iph);
-
- memcpy(data, in, len);
-
- if (iph->protocol == IPPROTO_UDP) {
- struct udphdr *uh = (struct udphdr *)(iph + 1);
-
- csum_udp4(uh, iph->saddr, iph->daddr, uh + 1, len - sizeof(*uh));
- } else if (iph->protocol == IPPROTO_ICMP) {
- struct icmphdr *ih = (struct icmphdr *)(iph + 1);
- csum_icmp4(ih, ih + 1, len - sizeof(*ih));
- }
-
- if (tap_send(c, buf, len + sizeof(*iph) + sizeof(*eh)) < 0)
- debug("tap: failed to send %lu bytes (IPv4)", len);
- } else {
- struct ipv6hdr *ip6h = (struct ipv6hdr *)(eh + 1);
- char *data = (char *)(ip6h + 1);
-
- eh->h_proto = ntohs(ETH_P_IPV6);
-
- memset(ip6h->flow_lbl, 0, 3);
- ip6h->payload_len = htons(len);
- ip6h->priority = 0;
-
- ip6h->saddr = *src;
- ip6h->daddr = *tap_ip6_daddr(c, src);
-
- memcpy(data, in, len);
-
- ip6h->hop_limit = proto;
- ip6h->version = 0;
- ip6h->nexthdr = 0;
-
- if (proto == IPPROTO_UDP) {
- struct udphdr *uh = (struct udphdr *)(ip6h + 1);
-
- csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
- uh + 1, len - sizeof(*uh));
- } else if (proto == IPPROTO_ICMPV6) {
- struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
+/**
+ * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums
+ * @c: Execution context
+ * @src: IPv4 source address
+ * @proto: L4 protocol number
+ * @in: Payload
+ * @len: L4 payload length
+ */
+void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
+ const char *in, size_t len)
+{
+ char buf[USHRT_MAX];
+ struct iphdr *ip4h = (struct iphdr *)tap_push_l2h(c, buf, ETH_P_IP);
+ char *data = (char *)(ip4h + 1);
+
+ ip4h->version = 4;
+ ip4h->ihl = sizeof(struct iphdr) / 4;
+ ip4h->tos = 0;
+ ip4h->tot_len = htons(len + sizeof(*ip4h));
+ ip4h->id = 0;
+ ip4h->frag_off = 0;
+ ip4h->ttl = 255;
+ ip4h->protocol = proto;
+ ip4h->saddr = src;
+ ip4h->daddr = tap_ip4_daddr(c);
+ csum_ip4_header(ip4h);
+
+ memcpy(data, in, len);
+
+ if (ip4h->protocol == IPPROTO_UDP) {
+ struct udphdr *uh = (struct udphdr *)(ip4h + 1);
+
+ csum_udp4(uh, ip4h->saddr, ip4h->daddr,
+ uh + 1, len - sizeof(*uh));
+ } else if (ip4h->protocol == IPPROTO_ICMP) {
+ struct icmphdr *ih = (struct icmphdr *)(ip4h + 1);
+ csum_icmp4(ih, ih + 1, len - sizeof(*ih));
+ }
- csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr,
- ih + 1, len - sizeof(*ih));
- }
- ip6h->version = 6;
- ip6h->nexthdr = proto;
- ip6h->hop_limit = 255;
- if (flow) {
- ip6h->flow_lbl[0] = (flow >> 16) & 0xf;
- ip6h->flow_lbl[1] = (flow >> 8) & 0xff;
- ip6h->flow_lbl[2] = (flow >> 0) & 0xff;
- }
+ if (tap_send(c, buf, len + (data - buf)) < 0)
+ debug("tap: failed to send %lu bytes (IPv4)", len);
+}
- if (tap_send(c, buf, len + sizeof(*ip6h) + sizeof(*eh)) < 1)
- debug("tap: failed to send %lu bytes (IPv6)", len);
+/**
+ * tap_ip6_send() - Send IPv6 packet, with L2 headers, calculating L3/L4 checksums
+ * @c: Execution context
+ * @src: IPv6 source address
+ * @proto: L4 protocol number
+ * @in: Payload
+ * @len: L4 payload length
+ * @flow: Flow label
+ */
+void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
+ uint8_t proto, const char *in, size_t len, uint32_t flow)
+{
+ char buf[USHRT_MAX];
+ struct ipv6hdr *ip6h =
+ (struct ipv6hdr *)tap_push_l2h(c, buf, ETH_P_IPV6);
+ char *data = (char *)(ip6h + 1);
+
+ ip6h->payload_len = htons(len);
+ ip6h->priority = 0;
+ ip6h->version = 6;
+ ip6h->nexthdr = proto;
+ ip6h->hop_limit = 255;
+ ip6h->saddr = *src;
+ ip6h->daddr = *tap_ip6_daddr(c, src);
+ ip6h->flow_lbl[0] = (flow >> 16) & 0xf;
+ ip6h->flow_lbl[1] = (flow >> 8) & 0xff;
+ ip6h->flow_lbl[2] = (flow >> 0) & 0xff;
+
+ memcpy(data, in, len);
+
+ if (proto == IPPROTO_UDP) {
+ struct udphdr *uh = (struct udphdr *)(ip6h + 1);
+
+ csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
+ uh + 1, len - sizeof(*uh));
+ } else if (proto == IPPROTO_ICMPV6) {
+ struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
+
+ csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr,
+ ih + 1, len - sizeof(*ih));
}
+
+ if (tap_send(c, buf, len + (data - buf)) < 1)
+ debug("tap: failed to send %lu bytes (IPv6)", len);
}
PACKET_POOL_DECL(pool_l4, UIO_MAXIOV, pkt_buf);
diff --git a/tap.h b/tap.h
index a8da8bb..011ba8e 100644
--- a/tap.h
+++ b/tap.h
@@ -9,8 +9,10 @@
in_addr_t tap_ip4_daddr(const struct ctx *c);
const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
-void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
- const char *in, size_t len, uint32_t flow);
+void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
+ const char *in, size_t len);
+void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
+ uint8_t proto, const char *in, size_t len, uint32_t flow);
int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
--
@@ -9,8 +9,10 @@
in_addr_t tap_ip4_daddr(const struct ctx *c);
const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
-void tap_ip_send(const struct ctx *c, const struct in6_addr *src, uint8_t proto,
- const char *in, size_t len, uint32_t flow);
+void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
+ const char *in, size_t len);
+void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
+ uint8_t proto, const char *in, size_t len, uint32_t flow);
int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 10/14] tap: Split tap_ip6_send() into UDP and ICMP variants
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (8 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 09/14] Split tap_ip_send() into IPv4 and IPv6 specific functions David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 11/14] ndp: Remove unneeded eh_source parameter David Gibson
` (5 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
tap_ip6_send() has special case logic to compute the checksums for UDP
and ICMP packets, which is a mild layering violation. By using a suitable
helper we can split it into tap_udp6_send() and tap_icmp6_send() functions
without greatly increasing the code size, this removing that layering
violation.
We make some small changes to the interface while there. In both cases
we make the destination IPv6 address a parameter, which will be useful
later. For the UDP variant we make it take just the UDP payload, and it
will generate the UDP header. For the ICMP variant we pass in the ICMP
header as before. The inconsistency is because that's what seems to be
the more natural way to invoke the function in the callers in each case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
dhcpv6.c | 21 +++------------
icmp.c | 3 ++-
tap.c | 82 ++++++++++++++++++++++++++++++++++++++++++--------------
tap.h | 9 +++++--
4 files changed, 75 insertions(+), 40 deletions(-)
diff --git a/dhcpv6.c b/dhcpv6.c
index 7829968..e763aed 100644
--- a/dhcpv6.c
+++ b/dhcpv6.c
@@ -208,15 +208,8 @@ struct msg_hdr {
uint32_t xid:24;
} __attribute__((__packed__));
-#if __BYTE_ORDER == __BIG_ENDIAN
-#define UH_RESP {{{ 547, 546, 0, 0, }}}
-#else
-#define UH_RESP {{{ __bswap_constant_16(547), __bswap_constant_16(546), 0, 0 }}}
-#endif
-
/**
* struct resp_t - Normal advertise and reply message
- * @uh: UDP header
* @hdr: DHCP message header
* @server_id: Server Identifier option
* @ia_na: Non-temporary Address option
@@ -226,7 +219,6 @@ struct msg_hdr {
* @dns_search: Domain Search List, here just for storage size
*/
static struct resp_t {
- struct udphdr uh;
struct msg_hdr hdr;
struct opt_server_id server_id;
@@ -236,7 +228,6 @@ static struct resp_t {
struct opt_dns_servers dns_servers;
struct opt_dns_search dns_search;
} __attribute__((__packed__)) resp = {
- UH_RESP,
{ 0 },
SERVER_ID,
@@ -270,13 +261,11 @@ static const struct opt_status_code sc_not_on_link = {
/**
* struct resp_not_on_link_t - NotOnLink error (mandated by RFC 8415, 18.3.2.)
- * @uh: UDP header
* @hdr: DHCP message header
* @server_id: Server Identifier option
* @var: Payload: IA_NA from client, status code, client ID
*/
static struct resp_not_on_link_t {
- struct udphdr uh;
struct msg_hdr hdr;
struct opt_server_id server_id;
@@ -284,7 +273,6 @@ static struct resp_not_on_link_t {
uint8_t var[sizeof(struct opt_ia_na) + sizeof(struct opt_status_code) +
sizeof(struct opt_client_id)];
} __attribute__((__packed__)) resp_not_on_link = {
- UH_RESP,
{ TYPE_REPLY, 0 },
SERVER_ID,
{ 0, },
@@ -527,12 +515,11 @@ int dhcpv6(struct ctx *c, const struct pool *p,
n += sizeof(struct opt_hdr) + ntohs(client_id->l);
n = offsetof(struct resp_not_on_link_t, var) + n;
- resp_not_on_link.uh.len = htons(n);
resp_not_on_link.hdr.xid = mh->xid;
- tap_ip6_send(c, src, IPPROTO_UDP,
- (char *)&resp_not_on_link, n, mh->xid);
+ tap_udp6_send(c, src, 547, tap_ip6_daddr(c, src), 546,
+ mh->xid, &resp_not_on_link, n);
return 1;
}
@@ -576,11 +563,11 @@ int dhcpv6(struct ctx *c, const struct pool *p,
n = offsetof(struct resp_t, client_id) +
sizeof(struct opt_hdr) + ntohs(client_id->l);
n = dhcpv6_dns_fill(c, (char *)&resp, n);
- resp.uh.len = htons(n);
resp.hdr.xid = mh->xid;
- tap_ip6_send(c, src, IPPROTO_UDP, (char *)&resp, n, mh->xid);
+ tap_udp6_send(c, src, 547, tap_ip6_daddr(c, src), 546,
+ mh->xid, &resp, n);
c->ip6.addr_seen = c->ip6.addr;
return 1;
diff --git a/icmp.c b/icmp.c
index 61c2d90..6493ea9 100644
--- a/icmp.c
+++ b/icmp.c
@@ -105,7 +105,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
icmp_id_map[V6][id].seq = seq;
}
- tap_ip6_send(c, &sr6->sin6_addr, IPPROTO_ICMPV6, buf, n, 0);
+ tap_icmp6_send(c, &sr6->sin6_addr,
+ tap_ip6_daddr(c, &sr6->sin6_addr), buf, n);
} else {
struct sockaddr_in *sr4 = (struct sockaddr_in *)&sr;
struct icmphdr *ih = (struct icmphdr *)buf;
diff --git a/tap.c b/tap.c
index 0e8c99b..135d799 100644
--- a/tap.c
+++ b/tap.c
@@ -175,21 +175,22 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
}
/**
- * tap_ip6_send() - Send IPv6 packet, with L2 headers, calculating L3/L4 checksums
+ * tap_push_ip6h() - Build IPv6 header for inbound packet
* @c: Execution context
* @src: IPv6 source address
- * @proto: L4 protocol number
- * @in: Payload
+ * @dst: IPv6 destination address
* @len: L4 payload length
- * @flow: Flow label
+ * @proto: L4 protocol number
+ * @flow: IPv6 flow identifier
+ *
+ * Return: pointer at which to write the packet's payload
*/
-void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
- uint8_t proto, const char *in, size_t len, uint32_t flow)
+static void *tap_push_ip6h(char *buf,
+ const struct in6_addr *src,
+ const struct in6_addr *dst,
+ size_t len, uint8_t proto, uint32_t flow)
{
- char buf[USHRT_MAX];
- struct ipv6hdr *ip6h =
- (struct ipv6hdr *)tap_push_l2h(c, buf, ETH_P_IPV6);
- char *data = (char *)(ip6h + 1);
+ struct ipv6hdr *ip6h = (struct ipv6hdr *)buf;
ip6h->payload_len = htons(len);
ip6h->priority = 0;
@@ -197,24 +198,65 @@ void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
ip6h->nexthdr = proto;
ip6h->hop_limit = 255;
ip6h->saddr = *src;
- ip6h->daddr = *tap_ip6_daddr(c, src);
+ ip6h->daddr = *dst;
ip6h->flow_lbl[0] = (flow >> 16) & 0xf;
ip6h->flow_lbl[1] = (flow >> 8) & 0xff;
ip6h->flow_lbl[2] = (flow >> 0) & 0xff;
+ return ip6h + 1;
+}
+/**
+ * tap_udp6_send() - Send UDP over IPv6 packet
+ * @c: Execution context
+ * @src: IPv6 source address
+ * @sport: UDP source port
+ * @dst: IPv6 destination address
+ * @dport: UDP destination port
+ * @flow: Flow label
+ * @in: UDP payload contents (not including UDP header)
+ * @len: UDP payload length (not including UDP header)
+ */
+void tap_udp6_send(const struct ctx *c,
+ const struct in6_addr *src, in_port_t sport,
+ const struct in6_addr *dst, in_port_t dport,
+ uint32_t flow, const void *in, size_t len)
+{
+ size_t udplen = len + sizeof(struct udphdr);
+ char buf[USHRT_MAX];
+ void *ip6h = tap_push_l2h(c, buf, ETH_P_IPV6);
+ void *uhp = tap_push_ip6h(ip6h, src, dst, udplen, IPPROTO_UDP, flow);
+ struct udphdr *uh = (struct udphdr *)uhp;
+ char *data = (char *)(uh + 1);
+
+ uh->source = htons(sport);
+ uh->dest = htons(dport);
+ uh->len = htons(udplen);
+ csum_udp6(uh, src, dst, in, len);
memcpy(data, in, len);
- if (proto == IPPROTO_UDP) {
- struct udphdr *uh = (struct udphdr *)(ip6h + 1);
+ if (tap_send(c, buf, len + (data - buf)) < 1)
+ debug("tap: failed to send %lu bytes (IPv6)", len);
+}
- csum_udp6(uh, &ip6h->saddr, &ip6h->daddr,
- uh + 1, len - sizeof(*uh));
- } else if (proto == IPPROTO_ICMPV6) {
- struct icmp6hdr *ih = (struct icmp6hdr *)(ip6h + 1);
+/**
+ * tap_icmp6_send() - Send ICMPv6 packet
+ * @c: Execution context
+ * @src: IPv6 source address
+ * @dst: IPv6 destination address
+ * @in: ICMP packet, including ICMP header
+ * @len: ICMP packet length, including ICMP header
+ */
+void tap_icmp6_send(const struct ctx *c,
+ const struct in6_addr *src, const struct in6_addr *dst,
+ void *in, size_t len)
+{
+ char buf[USHRT_MAX];
+ void *ip6h = tap_push_l2h(c, buf, ETH_P_IPV6);
+ char *data = tap_push_ip6h(ip6h, src, dst, len, IPPROTO_ICMPV6, 0);
+ struct icmp6hdr *icmp6h = (struct icmp6hdr *)data;
- csum_icmp6(ih, &ip6h->saddr, &ip6h->daddr,
- ih + 1, len - sizeof(*ih));
- }
+ memcpy(data, in, len);
+ csum_icmp6(icmp6h, src, dst, icmp6h + 1, len - sizeof(*icmp6h));
if (tap_send(c, buf, len + (data - buf)) < 1)
debug("tap: failed to send %lu bytes (IPv6)", len);
diff --git a/tap.h b/tap.h
index 011ba8e..d43c7a0 100644
--- a/tap.h
+++ b/tap.h
@@ -11,8 +11,13 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
const char *in, size_t len);
-void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
- uint8_t proto, const char *in, size_t len, uint32_t flow);
+void tap_udp6_send(const struct ctx *c,
+ const struct in6_addr *src, in_port_t sport,
+ const struct in6_addr *dst, in_port_t dport,
+ uint32_t flow, const void *in, size_t len);
+void tap_icmp6_send(const struct ctx *c,
+ const struct in6_addr *src, const struct in6_addr *dst,
+ void *in, size_t len);
int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
--
@@ -11,8 +11,13 @@ const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
const char *in, size_t len);
-void tap_ip6_send(const struct ctx *c, const struct in6_addr *src,
- uint8_t proto, const char *in, size_t len, uint32_t flow);
+void tap_udp6_send(const struct ctx *c,
+ const struct in6_addr *src, in_port_t sport,
+ const struct in6_addr *dst, in_port_t dport,
+ uint32_t flow, const void *in, size_t len);
+void tap_icmp6_send(const struct ctx *c,
+ const struct in6_addr *src, const struct in6_addr *dst,
+ void *in, size_t len);
int tap_send(const struct ctx *c, const void *data, size_t len);
void tap_handler(struct ctx *c, int fd, uint32_t events,
const struct timespec *now);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 11/14] ndp: Remove unneeded eh_source parameter
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (9 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 10/14] tap: Split tap_ip6_send() into UDP and ICMP variants David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 12/14] ndp: Use tap_icmp6_send() helper David Gibson
` (4 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
ndp() takes a parameter giving the ethernet source address of the packet
it is to respond to, which it uses to determine the destination address to
send the reply packet to.
This is not necessary, because the address will always be the guest's
MAC address. Even if the guest has just changed MAC address, then either
tap_handler_passt() or tap_handler_pasta() - which are the only call paths
leading to ndp() will have updated c->mac_guest with the new value.
So, remove the parameter, and just use c->mac_guest, making it more
consistent with other paths where we construct packets to send inwards.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
ndp.c | 6 ++----
ndp.h | 3 +--
tap.c | 2 +-
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/ndp.c b/ndp.c
index 79be0cf..f96b4b7 100644
--- a/ndp.c
+++ b/ndp.c
@@ -41,13 +41,11 @@
* ndp() - Check for NDP solicitations, reply as needed
* @c: Execution context
* @ih: ICMPv6 header
- * @eh_source: Source Ethernet address
* @saddr Source IPv6 address
*
* Return: 0 if not handled here, 1 if handled, -1 on failure
*/
-int ndp(struct ctx *c, const struct icmp6hdr *ih,
- const unsigned char *eh_source, const struct in6_addr *saddr)
+int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
{
char buf[BUFSIZ] = { 0 };
struct ipv6hdr *ip6hr;
@@ -196,7 +194,7 @@ dns_done:
ip6hr->hop_limit = 255;
len += sizeof(*ehr) + sizeof(*ip6hr) + sizeof(*ihr);
- memcpy(ehr->h_dest, eh_source, ETH_ALEN);
+ memcpy(ehr->h_dest, c->mac_guest, ETH_ALEN);
memcpy(ehr->h_source, c->mac, ETH_ALEN);
ehr->h_proto = htons(ETH_P_IPV6);
diff --git a/ndp.h b/ndp.h
index d857425..b012747 100644
--- a/ndp.h
+++ b/ndp.h
@@ -6,7 +6,6 @@
#ifndef NDP_H
#define NDP_H
-int ndp(struct ctx *c, const struct icmp6hdr *ih,
- const unsigned char *eh_source, const struct in6_addr *saddr);
+int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr);
#endif /* NDP_H */
diff --git a/tap.c b/tap.c
index 135d799..0031d82 100644
--- a/tap.c
+++ b/tap.c
@@ -576,7 +576,7 @@ resume:
if (l4_len < sizeof(struct icmp6hdr))
continue;
- if (ndp(c, (struct icmp6hdr *)l4h, eh->h_source, saddr))
+ if (ndp(c, (struct icmp6hdr *)l4h, saddr))
continue;
tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1);
--
@@ -576,7 +576,7 @@ resume:
if (l4_len < sizeof(struct icmp6hdr))
continue;
- if (ndp(c, (struct icmp6hdr *)l4h, eh->h_source, saddr))
+ if (ndp(c, (struct icmp6hdr *)l4h, saddr))
continue;
tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1);
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 12/14] ndp: Use tap_icmp6_send() helper
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (10 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 11/14] ndp: Remove unneeded eh_source parameter David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants David Gibson
` (3 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
We send ICMPv6 packets to the guest from both icmp.c and from ndp.c. The
case in ndp() manually constructs L2 and IPv6 headers, unlike the version
in icmp.c which uses the tap_icmp6_send() helper from tap.c Now that we've
broaded the parameters of tap_icmp6_send() we can use it in ndp() as well
saving some duplicated logic.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
ndp.c | 21 ++++-----------------
1 file changed, 4 insertions(+), 17 deletions(-)
diff --git a/ndp.c b/ndp.c
index f96b4b7..80e1f19 100644
--- a/ndp.c
+++ b/ndp.c
@@ -47,6 +47,7 @@
*/
int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
{
+ const struct in6_addr *rsaddr; /* src addr for reply */
char buf[BUFSIZ] = { 0 };
struct ipv6hdr *ip6hr;
struct icmp6hdr *ihr;
@@ -180,26 +181,12 @@ dns_done:
else
c->ip6.addr_seen = *saddr;
- ip6hr->daddr = *saddr;
if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.gw))
- ip6hr->saddr = c->ip6.gw;
+ rsaddr = &c->ip6.gw;
else
- ip6hr->saddr = c->ip6.addr_ll;
+ rsaddr = &c->ip6.addr_ll;
- ip6hr->payload_len = htons(sizeof(*ihr) + len);
- csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len);
-
- ip6hr->version = 6;
- ip6hr->nexthdr = IPPROTO_ICMPV6;
- ip6hr->hop_limit = 255;
-
- len += sizeof(*ehr) + sizeof(*ip6hr) + sizeof(*ihr);
- memcpy(ehr->h_dest, c->mac_guest, ETH_ALEN);
- memcpy(ehr->h_source, c->mac, ETH_ALEN);
- ehr->h_proto = htons(ETH_P_IPV6);
-
- if (tap_send(c, ehr, len) < 0)
- perror("NDP: send");
+ tap_icmp6_send(c, rsaddr, saddr, ihr, len + sizeof(*ihr));
return 1;
}
--
@@ -47,6 +47,7 @@
*/
int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
{
+ const struct in6_addr *rsaddr; /* src addr for reply */
char buf[BUFSIZ] = { 0 };
struct ipv6hdr *ip6hr;
struct icmp6hdr *ihr;
@@ -180,26 +181,12 @@ dns_done:
else
c->ip6.addr_seen = *saddr;
- ip6hr->daddr = *saddr;
if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.gw))
- ip6hr->saddr = c->ip6.gw;
+ rsaddr = &c->ip6.gw;
else
- ip6hr->saddr = c->ip6.addr_ll;
+ rsaddr = &c->ip6.addr_ll;
- ip6hr->payload_len = htons(sizeof(*ihr) + len);
- csum_icmp6(ihr, &ip6hr->saddr, &ip6hr->daddr, ihr + 1, len);
-
- ip6hr->version = 6;
- ip6hr->nexthdr = IPPROTO_ICMPV6;
- ip6hr->hop_limit = 255;
-
- len += sizeof(*ehr) + sizeof(*ip6hr) + sizeof(*ihr);
- memcpy(ehr->h_dest, c->mac_guest, ETH_ALEN);
- memcpy(ehr->h_source, c->mac, ETH_ALEN);
- ehr->h_proto = htons(ETH_P_IPV6);
-
- if (tap_send(c, ehr, len) < 0)
- perror("NDP: send");
+ tap_icmp6_send(c, rsaddr, saddr, ihr, len + sizeof(*ihr));
return 1;
}
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (11 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 12/14] ndp: Use tap_icmp6_send() helper David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 0:43 ` [PATCH v2 14/14] dhcp: Use tap_udp4_send() helper in dhcp() David Gibson
` (2 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
tap_ip4_send() has special case logic to compute the checksums for UDP
and ICMP packets, which is a mild layering violation. By using a suitable
helper we can split it into tap_udp4_send() and tap_icmp4_send() functions
without greatly increasing the code size, this removing that layering
violation.
We make some small changes to the interface while there. In both cases
we make the destination IPv4 address a parameter, which will be useful
later. For the UDP variant we make it take just the UDP payload, and it
will generate the UDP header. For the ICMP variant we pass in the ICMP
header as before. The inconsistency is because that's what seems to be
the more natural way to invoke the function in the callers in each case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
icmp.c | 3 ++-
tap.c | 77 ++++++++++++++++++++++++++++++++++++++++++++--------------
tap.h | 7 ++++--
3 files changed, 66 insertions(+), 21 deletions(-)
diff --git a/icmp.c b/icmp.c
index 6493ea9..233acf9 100644
--- a/icmp.c
+++ b/icmp.c
@@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
icmp_id_map[V4][id].seq = seq;
}
- tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n);
+ tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c),
+ buf, n);
}
}
diff --git a/tap.c b/tap.c
index 0031d82..d250a0b 100644
--- a/tap.c
+++ b/tap.c
@@ -132,19 +132,19 @@ static void *tap_push_l2h(const struct ctx *c, void *buf, uint16_t proto)
}
/**
- * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums
+ * tap_push_ip4h() - Build IPv4 header for inbound packet, with checksum
* @c: Execution context
- * @src: IPv4 source address
- * @proto: L4 protocol number
- * @in: Payload
+ * @src: IPv4 source address, network order
+ * @dst: IPv4 destination address, network order
* @len: L4 payload length
+ * @proto: L4 protocol number
+ *
+ * Return: pointer at which to write the packet's payload
*/
-void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
- const char *in, size_t len)
+static void *tap_push_ip4h(char *buf, in_addr_t src, in_addr_t dst,
+ size_t len, uint8_t proto)
{
- char buf[USHRT_MAX];
- struct iphdr *ip4h = (struct iphdr *)tap_push_l2h(c, buf, ETH_P_IP);
- char *data = (char *)(ip4h + 1);
+ struct iphdr *ip4h = (struct iphdr *)buf;
ip4h->version = 4;
ip4h->ihl = sizeof(struct iphdr) / 4;
@@ -155,20 +155,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
ip4h->ttl = 255;
ip4h->protocol = proto;
ip4h->saddr = src;
- ip4h->daddr = tap_ip4_daddr(c);
+ ip4h->daddr = dst;
csum_ip4_header(ip4h);
+ return ip4h + 1;
+}
+/**
+ * tap_udp4_send() - Send UDP over IPv4 packet
+ * @c: Execution context
+ * @src: IPv4 source address
+ * @sport: UDP source port
+ * @dst: IPv4 destination address
+ * @dport: UDP destination port
+ * @in: UDP payload contents (not including UDP header)
+ * @len: UDP payload length (not including UDP header)
+ */
+/* cppcheck-suppress unusedFunction */
+void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
+ in_addr_t dst, in_port_t dport,
+ const void *in, size_t len)
+{
+ size_t udplen = len + sizeof(struct udphdr);
+ char buf[USHRT_MAX];
+ void *ip4h = tap_push_l2h(c, buf, ETH_P_IP);
+ void *uhp = tap_push_ip4h(ip4h, src, dst, udplen, IPPROTO_UDP);
+ struct udphdr *uh = (struct udphdr *)uhp;
+ char *data = (char *)(uh + 1);
+
+ uh->source = htons(sport);
+ uh->dest = htons(dport);
+ uh->len = htons(udplen);
+ csum_udp4(uh, src, dst, in, len);
memcpy(data, in, len);
- if (ip4h->protocol == IPPROTO_UDP) {
- struct udphdr *uh = (struct udphdr *)(ip4h + 1);
+ if (tap_send(c, buf, len + (data - buf)) < 0)
+ debug("tap: failed to send %lu bytes (IPv4)", len);
+}
- csum_udp4(uh, ip4h->saddr, ip4h->daddr,
- uh + 1, len - sizeof(*uh));
- } else if (ip4h->protocol == IPPROTO_ICMP) {
- struct icmphdr *ih = (struct icmphdr *)(ip4h + 1);
- csum_icmp4(ih, ih + 1, len - sizeof(*ih));
- }
+/**
+ * tap_icmp4_send() - Send ICMPv4 packet
+ * @c: Execution context
+ * @src: IPv4 source address
+ * @dst: IPv4 destination address
+ * @in: ICMP packet, including ICMP header
+ * @len: ICMP packet length, including ICMP header
+ */
+void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst,
+ void *in, size_t len)
+{
+ char buf[USHRT_MAX];
+ void *ip4h = tap_push_l2h(c, buf, ETH_P_IP);
+ char *data = tap_push_ip4h(ip4h, src, dst, len, IPPROTO_ICMP);
+ struct icmphdr *icmp4h = (struct icmphdr *)data;
+
+ memcpy(data, in, len);
+ csum_icmp4(icmp4h, icmp4h + 1, len - sizeof(*icmp4h));
if (tap_send(c, buf, len + (data - buf)) < 0)
debug("tap: failed to send %lu bytes (IPv4)", len);
diff --git a/tap.h b/tap.h
index d43c7a0..743bc58 100644
--- a/tap.h
+++ b/tap.h
@@ -7,10 +7,13 @@
#define TAP_H
in_addr_t tap_ip4_daddr(const struct ctx *c);
+void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
+ in_addr_t dst, in_port_t dport,
+ const void *in, size_t len);
+void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst,
+ void *in, size_t len);
const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
-void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
- const char *in, size_t len);
void tap_udp6_send(const struct ctx *c,
const struct in6_addr *src, in_port_t sport,
const struct in6_addr *dst, in_port_t dport,
--
@@ -7,10 +7,13 @@
#define TAP_H
in_addr_t tap_ip4_daddr(const struct ctx *c);
+void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
+ in_addr_t dst, in_port_t dport,
+ const void *in, size_t len);
+void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst,
+ void *in, size_t len);
const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
const struct in6_addr *src);
-void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
- const char *in, size_t len);
void tap_udp6_send(const struct ctx *c,
const struct in6_addr *src, in_port_t sport,
const struct in6_addr *dst, in_port_t dport,
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 14/14] dhcp: Use tap_udp4_send() helper in dhcp()
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (12 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants David Gibson
@ 2022-10-19 0:43 ` David Gibson
2022-10-19 9:07 ` [PATCH v2 00/14] Clean up checksum and header generation for inbound packets Stefano Brivio
2022-10-22 8:21 ` Stefano Brivio
15 siblings, 0 replies; 17+ messages in thread
From: David Gibson @ 2022-10-19 0:43 UTC (permalink / raw)
To: Stefano Brivio; +Cc: passt-dev, David Gibson
The IPv4 specific dhcp() manually constructs L2 and IP headers to send its
DHCP reply packet, unlike its IPv6 equivalent in dhcpv6.c which uses the
tap_udp6_send() helper. Now that we've broaded the parameters to
tap_udp4_send() we can use it in dhcp() to avoid some duplicated logic.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
dhcp.c | 18 ++----------------
tap.c | 1 -
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/dhcp.c b/dhcp.c
index 2b3af82..d22698a 100644
--- a/dhcp.c
+++ b/dhcp.c
@@ -363,22 +363,8 @@ int dhcp(const struct ctx *c, const struct pool *p)
if (!c->no_dhcp_dns_search)
opt_set_dns_search(c, sizeof(m->o));
- uh->len = htons(len = offsetof(struct msg, o) + fill(m) + sizeof(*uh));
- uh->source = htons(67);
- uh->dest = htons(68);
- csum_udp4(uh, c->ip4.gw, c->ip4.addr, uh + 1, len - sizeof(*uh));
-
- iph->tot_len = htons(len += sizeof(*iph));
- iph->daddr = c->ip4.addr;
- iph->saddr = c->ip4.gw;
- csum_ip4_header(iph);
-
- len += sizeof(*eh);
- memcpy(eh->h_dest, eh->h_source, ETH_ALEN);
- memcpy(eh->h_source, c->mac, ETH_ALEN);
-
- if (tap_send(c, eh, len) < 0)
- perror("DHCP: send");
+ len = offsetof(struct msg, o) + fill(m);
+ tap_udp4_send(c, c->ip4.gw, 67, c->ip4.addr, 68, m, len);
return 1;
}
diff --git a/tap.c b/tap.c
index d250a0b..3f78c99 100644
--- a/tap.c
+++ b/tap.c
@@ -170,7 +170,6 @@ static void *tap_push_ip4h(char *buf, in_addr_t src, in_addr_t dst,
* @in: UDP payload contents (not including UDP header)
* @len: UDP payload length (not including UDP header)
*/
-/* cppcheck-suppress unusedFunction */
void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
in_addr_t dst, in_port_t dport,
const void *in, size_t len)
--
@@ -170,7 +170,6 @@ static void *tap_push_ip4h(char *buf, in_addr_t src, in_addr_t dst,
* @in: UDP payload contents (not including UDP header)
* @len: UDP payload length (not including UDP header)
*/
-/* cppcheck-suppress unusedFunction */
void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
in_addr_t dst, in_port_t dport,
const void *in, size_t len)
--
2.37.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 00/14] Clean up checksum and header generation for inbound packets
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (13 preceding siblings ...)
2022-10-19 0:43 ` [PATCH v2 14/14] dhcp: Use tap_udp4_send() helper in dhcp() David Gibson
@ 2022-10-19 9:07 ` Stefano Brivio
2022-10-22 8:21 ` Stefano Brivio
15 siblings, 0 replies; 17+ messages in thread
From: Stefano Brivio @ 2022-10-19 9:07 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Wed, 19 Oct 2022 11:43:43 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> The main packet "fast paths" for UDP and TCP mostly just forward
> packets rather than generating them from scratch. However the control
> paths for ICMP and DHCP sometimes generate packets more or less from
> scratch. Because these are relatively rare, it's not performance
> critical.
>
> The paths for sending these packets have some duplication of the
> header generation. There's also some layering violation in
> tap_ip_send() which both generates IP headers and updates the L4 (UDP
> or UCMP) checksum.
>
> Finally that checksum generation is a little awkward: it temporarily
> generates the IP pseudo header (or something close enough to serve) in
> the place of the actual header, generates the checksum, then replaces
> it with the real IP header. This approach seems to be causing
> miscompiles with some LTO optimization, because the stores to the
> pseudo header are being moved or elided across the code calculating
> the checksum.
>
> This series addresses all of these. We consolidate and clarify the
> packet sending helpers, and use them in some places there was
> previously duplicated code. In the process we use new checksum
> generation helpers which take a different approach which should avoid
> the LTO problems (this aspect I haven't tested yet though).
>
> Changes since v1:
> * Numerous minor style changes
> * Rename header generation helpers to make their behaviour clearer
> * Added several missing function doc comments
> * Corrected some erroneous statements and terms in comments
Thanks, it looks good to me! I'm travelling, I'll apply in a bit.
--
Stefano
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 00/14] Clean up checksum and header generation for inbound packets
2022-10-19 0:43 [PATCH v2 00/14] Clean up checksum and header generation for inbound packets David Gibson
` (14 preceding siblings ...)
2022-10-19 9:07 ` [PATCH v2 00/14] Clean up checksum and header generation for inbound packets Stefano Brivio
@ 2022-10-22 8:21 ` Stefano Brivio
15 siblings, 0 replies; 17+ messages in thread
From: Stefano Brivio @ 2022-10-22 8:21 UTC (permalink / raw)
To: David Gibson; +Cc: passt-dev
On Wed, 19 Oct 2022 11:43:43 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> The main packet "fast paths" for UDP and TCP mostly just forward
> packets rather than generating them from scratch. However the control
> paths for ICMP and DHCP sometimes generate packets more or less from
> scratch. Because these are relatively rare, it's not performance
> critical.
>
> The paths for sending these packets have some duplication of the
> header generation. There's also some layering violation in
> tap_ip_send() which both generates IP headers and updates the L4 (UDP
> or UCMP) checksum.
>
> Finally that checksum generation is a little awkward: it temporarily
> generates the IP pseudo header (or something close enough to serve) in
> the place of the actual header, generates the checksum, then replaces
> it with the real IP header. This approach seems to be causing
> miscompiles with some LTO optimization, because the stores to the
> pseudo header are being moved or elided across the code calculating
> the checksum.
>
> This series addresses all of these. We consolidate and clarify the
> packet sending helpers, and use them in some places there was
> previously duplicated code. In the process we use new checksum
> generation helpers which take a different approach which should avoid
> the LTO problems (this aspect I haven't tested yet though).
>
> Changes since v1:
> * Numerous minor style changes
> * Rename header generation helpers to make their behaviour clearer
> * Added several missing function doc comments
> * Corrected some erroneous statements and terms in comments
Applied now, thanks, and sorry for the delay.
--
Stefano
^ permalink raw reply [flat|nested] 17+ messages in thread