From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants
Date: Tue, 18 Oct 2022 23:07:58 +1100 [thread overview]
Message-ID: <Y06XHtUTMqxBwq96@yekko> (raw)
In-Reply-To: <20221018050634.428cf8d6@elisabeth>
[-- Attachment #1: Type: text/plain, Size: 8680 bytes --]
On Tue, Oct 18, 2022 at 05:06:34AM +0200, Stefano Brivio wrote:
> On Mon, 17 Oct 2022 19:58:06 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > tap_ip4_send() has special case logic to compute the checksums for UDP
> > and ICMP packets, which is a mild layering violation. By using a suitable
> > helper we can split it into tap_udp4_send() and tap_icmp4_send() functions
> > without greatly increasing the code size, this removing that layering
> > violation.
> >
> > We make some small changes to the interface while there. In both cases
> > we make the destination IPv4 address a parameter, which will be useful
> > later. For the UDP variant we make it take just the UDP payload, and it
> > will generate the UDP header. For the ICMP variant we pass in the ICMP
> > header as before. The inconsistency is because that's what seems to be
> > the more natural way to invoke the function in the callers in each case.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > icmp.c | 3 ++-
> > tap.c | 75 +++++++++++++++++++++++++++++++++++++++++-----------------
> > tap.h | 7 ++++--
> > 3 files changed, 60 insertions(+), 25 deletions(-)
> >
> > diff --git a/icmp.c b/icmp.c
> > index 6493ea9..233acf9 100644
> > --- a/icmp.c
> > +++ b/icmp.c
> > @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref,
> > icmp_id_map[V4][id].seq = seq;
> > }
> >
> > - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n);
> > + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c),
> > + buf, n);
> > }
> > }
> >
> > diff --git a/tap.c b/tap.c
> > index 274f4ba..5792880 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto)
> > return eh + 1;
> > }
> >
> > -/**
> > - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums
> > - * @c: Execution context
> > - * @src: IPv4 source address
> > - * @proto: L4 protocol number
> > - * @in: Payload
> > - * @len: L4 payload length
> > - */
> > -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
> > - const char *in, size_t len)
>
> I understand why you return ip(4)h + 1 here because I've just reviewed
> 9/14, I wouldn't know otherwise:
>
> /**
> * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum
> * @c: Execution context
> * @src: IPv4 source address, network order
> * @dst: IPv4 destination address, network order
> * @len: L4 payload length
> * @proto: L4 protocol number
> *
> * Return: pointer to write payload to
> */
Oops, yes, forgot to add a function comment. Done.
> > +static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst,
> > + size_t len, uint8_t proto)
> > {
> > - char buf[USHRT_MAX];
> > - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP);
> > - char *data = (char *)(ip4h + 1);
> > + struct iphdr *ip4h = (struct iphdr *)buf;
> >
> > ip4h->version = 4;
> > ip4h->ihl = sizeof(struct iphdr) / 4;
> > @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
> > ip4h->ttl = 255;
> > ip4h->protocol = proto;
> > ip4h->saddr = src;
> > - ip4h->daddr = tap_ip4_daddr(c);
> > + ip4h->daddr = dst;
> > csum_ip4_header(ip4h);
> > + return ip4h + 1;
> > +}
> > +
> > +/**
> > + * tap_udp4_send() - Send UDP over IPv4 packet
> > + * @c: Execution context
> > + * @src: IPv4 source address
> > + * @sport: UDP source port
> > + * @dst: IPv4 destination address
> > + * @dport: UDP destination port
> > + * @in: UDP payload contents (not including UDP header)
> > + * @len: UDP payload length (not including UDP header)
> > + */
> > +/* cppcheck-suppress unusedFunction */
> > +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
> > + in_addr_t dst, in_port_t dport,
> > + const void *in, size_t len)
> > +{
> > + size_t udplen = len + sizeof(struct udphdr);
> > + char buf[USHRT_MAX];
> > + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP);
> > + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP);
>
> Two observations:
>
> - this saves one line and one cast, but it's really a bit unnatural that
> tap_ip4_hdr() doesn't point to the header it just made, or to nothing.
>
> I would rather have to +1 the return value or the original pointer
> instead or having this trick
>
> > + struct udphdr *uh = (struct udphdr *)uhp;
> > + char *data = (char *)(uh + 1);
>
> - it's longer, but in my opinion clearer, if we split a bit more clearly
> the components of the packet, that is, something like (untested):
I don't really want to change this. Yes, it's a bit counterintuitive
at first blush, but there's a reason for this approach.
This style of a function which generates a header then points *after*
it works even if the header it generates is of variable length.
Advancing to the payload in the caller doesn't (at least not without
breaking the abstraction I'm trying to generate with these helpers).
That's not just theoretical, because at some point I'd like to extend
the l2_hdr function to also allocate space for the qemu socket length
header.
I'm certainly open to name changes to make this behaviour more
obvious, but I think returning the payload pointer not the header
pointer makes for a better abstraction here.
> char buf[USHRT_MAX];
> struct udphdr *uh;
> struct iphdr *iph;
> char *data;
>
> iph = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP) + 1;
> tap_ip_hdr(iph, src, dst, len + sizeof(uh), IPPROTO_UDP);
>
> uh = (struct udphdr *)iph + 1;
> uh->source = htons(sport);
> uh->dest = htons(dport);
> uh->len = htons(len + sizeof(uh));
> csum_udp4(uh, src, dst, in, len);
>
> data = uh + 1;
> memcpy(data, in, len);
>
> if (tap_send(c, buf, len + (data - buf)) < 0)
> debug("tap: failed to send %lu bytes (IPv4)", len);
> >
> > + uh->source = htons(sport);
> > + uh->dest = htons(dport);
> > + uh->len = htons(udplen);
> > + csum_udp4(uh, src, dst, in, len);
> > memcpy(data, in, len);
> >
> > - if (ip4h->protocol == IPPROTO_UDP) {
> > - struct udphdr *uh = (struct udphdr *)(ip4h + 1);
> > + if (tap_send(c, buf, len + (data - buf)) < 0)
> > + debug("tap: failed to send %lu bytes (IPv4)", len);
> > +}
> >
> > - csum_udp4(uh, ip4h->saddr, ip4h->daddr,
> > - uh + 1, len - sizeof(*uh));
> > - } else if (ip4h->protocol == IPPROTO_ICMP) {
> > - struct icmphdr *ih = (struct icmphdr *)(ip4h + 1);
> > - csum_icmp4(ih, ih + 1, len - sizeof(*ih));
> > - }
> > +/**
> > + * tap_icmp4_send() - Send ICMPv4 packet
> > + * @c: Execution context
> > + * @src: IPv4 source address
> > + * @dst: IPv4 destination address
> > + * @in: ICMP packet, including ICMP header
> > + * @len: ICMP packet length, including ICMP header
> > + */
> > +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst,
> > + void *in, size_t len)
> > +{
> > + char buf[USHRT_MAX];
> > + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP);
> > + char *data = tap_ip4_hdr(ip4h, src, dst, len, IPPROTO_ICMP);
> > + struct icmphdr *icmp4h = (struct icmphdr *)data;
>
> ...same here, even though perhaps not so apparent.
>
> > +
> > + memcpy(data, in, len);
> > + csum_icmp4(icmp4h, icmp4h + 1, len - sizeof(*icmp4h));
> >
> > if (tap_send(c, buf, len + (data - buf)) < 0)
> > debug("tap: failed to send %lu bytes (IPv4)", len);
> > diff --git a/tap.h b/tap.h
> > index d43c7a0..743bc58 100644
> > --- a/tap.h
> > +++ b/tap.h
> > @@ -7,10 +7,13 @@
> > #define TAP_H
> >
> > in_addr_t tap_ip4_daddr(const struct ctx *c);
> > +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport,
> > + in_addr_t dst, in_port_t dport,
> > + const void *in, size_t len);
> > +void tap_icmp4_send(const struct ctx *c, in_addr_t src, in_addr_t dst,
> > + void *in, size_t len);
> > const struct in6_addr *tap_ip6_daddr(const struct ctx *c,
> > const struct in6_addr *src);
> > -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto,
> > - const char *in, size_t len);
> > void tap_udp6_send(const struct ctx *c,
> > const struct in6_addr *src, in_port_t sport,
> > const struct in6_addr *dst, in_port_t dport,
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2022-10-18 12:08 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-17 8:57 [PATCH 00/14] Clean up checksum and header generation for inbound packets David Gibson
2022-10-17 8:57 ` [PATCH 01/14] Add csum_icmp6() helper for calculating ICMPv6 checksums David Gibson
2022-10-18 3:01 ` Stefano Brivio
2022-10-18 12:05 ` David Gibson
2022-10-17 8:57 ` [PATCH 02/14] Add csum_icmp4() helper for calculating ICMPv4 checksums David Gibson
2022-10-18 3:01 ` Stefano Brivio
2022-10-18 12:06 ` David Gibson
2022-10-18 12:28 ` Stefano Brivio
2022-10-17 8:57 ` [PATCH 03/14] Add csum_udp6() helper for calculating UDP over IPv6 checksums David Gibson
2022-10-18 3:02 ` Stefano Brivio
2022-10-18 12:06 ` David Gibson
2022-10-17 8:57 ` [PATCH 04/14] Add csum_udp4() helper for calculating UDP over IPv4 checksums David Gibson
2022-10-18 3:03 ` Stefano Brivio
2022-10-18 12:06 ` David Gibson
2022-10-17 8:57 ` [PATCH 05/14] Add csum_ip4_header() helper to calculate IPv4 header checksums David Gibson
2022-10-18 3:03 ` Stefano Brivio
2022-10-18 12:07 ` David Gibson
2022-10-17 8:57 ` [PATCH 06/14] Add helpers for normal inbound packet destination addresses David Gibson
2022-10-18 3:04 ` Stefano Brivio
2022-10-18 12:07 ` David Gibson
2022-10-17 8:58 ` [PATCH 07/14] Remove support for TCP packets from tap_ip_send() David Gibson
2022-10-17 8:58 ` [PATCH 08/14] tap: Remove unhelpeful vnet_pre optimization from tap_send() David Gibson
2022-10-18 3:05 ` Stefano Brivio
2022-10-18 12:07 ` David Gibson
2022-10-17 8:58 ` [PATCH 09/14] Split tap_ip_send() into IPv4 and IPv6 specific functions David Gibson
2022-10-18 3:06 ` Stefano Brivio
2022-10-18 12:07 ` David Gibson
2022-10-17 8:58 ` [PATCH 10/14] tap: Split tap_ip6_send() into UDP and ICMP variants David Gibson
2022-10-17 8:58 ` [PATCH 11/14] ndp: Remove unneeded eh_source parameter David Gibson
2022-10-17 8:58 ` [PATCH 12/14] ndp: Use tap_icmp6_send() helper David Gibson
2022-10-17 8:58 ` [PATCH 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants David Gibson
2022-10-18 3:06 ` Stefano Brivio
2022-10-18 12:07 ` David Gibson [this message]
2022-10-18 12:27 ` Stefano Brivio
2022-10-18 23:54 ` David Gibson
2022-10-17 8:58 ` [PATCH 14/14] dhcp: Use tap_udp4_send() helper in dhcp() David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y06XHtUTMqxBwq96@yekko \
--to=david@gibson.dropbear.id.au \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).