From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by passt.top (Postfix) with ESMTP id 88D855A0265 for ; Tue, 18 Oct 2022 14:27:12 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666096031; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yYUow7dxMDw9kkyNtL6i0ETZfUFdkhGRXZSq+dcmEow=; b=QeeAjpc0sEeDlDMYxHsvzrGSnin9DVbdhutSI3Eh1Th6Qi5+9YnnHh11AhKk+eJmhay3VJ feyGRUPQtMhbwrJQ3rFquDNrkRonArebO0heinIgckWBwJcGRMhYI4NRhCu5fKeKZpPiqi bL4ow6mUiEfgJ+ae/tdCpUQrjhbQ2O8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-663-qzV2hqt1MCuwLAoP8dJYYg-1; Tue, 18 Oct 2022 08:27:09 -0400 X-MC-Unique: qzV2hqt1MCuwLAoP8dJYYg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9BB1C85A583; Tue, 18 Oct 2022 12:27:09 +0000 (UTC) Received: from maya.cloud.tilaa.com (ovpn-208-31.brq.redhat.com [10.40.208.31]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 2702140C2065; Tue, 18 Oct 2022 12:27:09 +0000 (UTC) Date: Tue, 18 Oct 2022 14:27:04 +0200 From: Stefano Brivio To: David Gibson Subject: Re: [PATCH 13/14] tap: Split tap_ip4_send() into UDP and ICMP variants Message-ID: <20221018142704.11ee38ac@elisabeth> In-Reply-To: References: <20221017085807.473470-1-david@gibson.dropbear.id.au> <20221017085807.473470-14-david@gibson.dropbear.id.au> <20221018050634.428cf8d6@elisabeth> Organization: Red Hat MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Message-ID-Hash: TCEXYYSWRX4Z6VPBV6WK2U6O2XY7UU6A X-Message-ID-Hash: TCEXYYSWRX4Z6VPBV6WK2U6O2XY7UU6A X-MailFrom: sbrivio@redhat.com X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.3 Precedence: list List-Id: Development discussion and patches for passt Archived-At: <> Archived-At: List-Archive: <> List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: On Tue, 18 Oct 2022 23:07:58 +1100 David Gibson wrote: > On Tue, Oct 18, 2022 at 05:06:34AM +0200, Stefano Brivio wrote: > > On Mon, 17 Oct 2022 19:58:06 +1100 > > David Gibson wrote: > > > > > tap_ip4_send() has special case logic to compute the checksums for UDP > > > and ICMP packets, which is a mild layering violation. By using a suitable > > > helper we can split it into tap_udp4_send() and tap_icmp4_send() functions > > > without greatly increasing the code size, this removing that layering > > > violation. > > > > > > We make some small changes to the interface while there. In both cases > > > we make the destination IPv4 address a parameter, which will be useful > > > later. For the UDP variant we make it take just the UDP payload, and it > > > will generate the UDP header. For the ICMP variant we pass in the ICMP > > > header as before. The inconsistency is because that's what seems to be > > > the more natural way to invoke the function in the callers in each case. > > > > > > Signed-off-by: David Gibson > > > --- > > > icmp.c | 3 ++- > > > tap.c | 75 +++++++++++++++++++++++++++++++++++++++++----------------- > > > tap.h | 7 ++++-- > > > 3 files changed, 60 insertions(+), 25 deletions(-) > > > > > > diff --git a/icmp.c b/icmp.c > > > index 6493ea9..233acf9 100644 > > > --- a/icmp.c > > > +++ b/icmp.c > > > @@ -124,7 +124,8 @@ void icmp_sock_handler(const struct ctx *c, union epoll_ref ref, > > > icmp_id_map[V4][id].seq = seq; > > > } > > > > > > - tap_ip4_send(c, sr4->sin_addr.s_addr, IPPROTO_ICMP, buf, n); > > > + tap_icmp4_send(c, sr4->sin_addr.s_addr, tap_ip4_daddr(c), > > > + buf, n); > > > } > > > } > > > > > > diff --git a/tap.c b/tap.c > > > index 274f4ba..5792880 100644 > > > --- a/tap.c > > > +++ b/tap.c > > > @@ -127,20 +127,10 @@ static void *tap_l2_hdr(const struct ctx *c, void *buf, uint16_t proto) > > > return eh + 1; > > > } > > > > > > -/** > > > - * tap_ip4_send() - Send IPv4 packet, with L2 headers, calculating L3/L4 checksums > > > - * @c: Execution context > > > - * @src: IPv4 source address > > > - * @proto: L4 protocol number > > > - * @in: Payload > > > - * @len: L4 payload length > > > - */ > > > -void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, > > > - const char *in, size_t len) > > > > I understand why you return ip(4)h + 1 here because I've just reviewed > > 9/14, I wouldn't know otherwise: > > > > /** > > * tap_ip4_hdr() - Build IPv4 header for inbound packet, with checksum > > * @c: Execution context > > * @src: IPv4 source address, network order > > * @dst: IPv4 destination address, network order > > * @len: L4 payload length > > * @proto: L4 protocol number > > * > > * Return: pointer to write payload to > > */ > > Oops, yes, forgot to add a function comment. Done. > > > > +static void *tap_ip4_hdr(char *buf, in_addr_t src, in_addr_t dst, > > > + size_t len, uint8_t proto) > > > { > > > - char buf[USHRT_MAX]; > > > - struct iphdr *ip4h = (struct iphdr *)tap_l2_hdr(c, buf, ETH_P_IP); > > > - char *data = (char *)(ip4h + 1); > > > + struct iphdr *ip4h = (struct iphdr *)buf; > > > > > > ip4h->version = 4; > > > ip4h->ihl = sizeof(struct iphdr) / 4; > > > @@ -151,20 +141,61 @@ void tap_ip4_send(const struct ctx *c, in_addr_t src, uint8_t proto, > > > ip4h->ttl = 255; > > > ip4h->protocol = proto; > > > ip4h->saddr = src; > > > - ip4h->daddr = tap_ip4_daddr(c); > > > + ip4h->daddr = dst; > > > csum_ip4_header(ip4h); > > > + return ip4h + 1; > > > +} > > > + > > > +/** > > > + * tap_udp4_send() - Send UDP over IPv4 packet > > > + * @c: Execution context > > > + * @src: IPv4 source address > > > + * @sport: UDP source port > > > + * @dst: IPv4 destination address > > > + * @dport: UDP destination port > > > + * @in: UDP payload contents (not including UDP header) > > > + * @len: UDP payload length (not including UDP header) > > > + */ > > > +/* cppcheck-suppress unusedFunction */ > > > +void tap_udp4_send(const struct ctx *c, in_addr_t src, in_port_t sport, > > > + in_addr_t dst, in_port_t dport, > > > + const void *in, size_t len) > > > +{ > > > + size_t udplen = len + sizeof(struct udphdr); > > > + char buf[USHRT_MAX]; > > > + void *ip4h = tap_l2_hdr(c, buf, ETH_P_IP); > > > + void *uhp = tap_ip4_hdr(ip4h, src, dst, udplen, IPPROTO_UDP); > > > > Two observations: > > > > - this saves one line and one cast, but it's really a bit unnatural that > > tap_ip4_hdr() doesn't point to the header it just made, or to nothing. > > > > I would rather have to +1 the return value or the original pointer > > instead or having this trick > > > > > + struct udphdr *uh = (struct udphdr *)uhp; > > > + char *data = (char *)(uh + 1); > > > > - it's longer, but in my opinion clearer, if we split a bit more clearly > > the components of the packet, that is, something like (untested): > > I don't really want to change this. Yes, it's a bit counterintuitive > at first blush, but there's a reason for this approach. > > This style of a function which generates a header then points *after* > it works even if the header it generates is of variable length. > Advancing to the payload in the caller doesn't (at least not without > breaking the abstraction I'm trying to generate with these helpers). > > That's not just theoretical, because at some point I'd like to extend > the l2_hdr function to also allocate space for the qemu socket length > header. > > I'm certainly open to name changes to make this behaviour more > obvious, but I think returning the payload pointer not the header > pointer makes for a better abstraction here. Hmm, yes, I think the variable length case is a very valid point, and also in terms of abstraction I see the advantage. There are just two things I can think of: - passing the end pointer as argument (not as practical as your solution, though) - naming it tap_ip4_push_hdr(), tap_ip4_hdr_after(), tap_ip4_hdr_goto_next(), tap_ip4_leave_header_behind()... I can't think of anything better at this point. I'll keep thinking, but at the moment I'd be fine even with the current name. -- Stefano