From: David Gibson
To: passt-dev@passt.top
Subject: Re: [PATCH 1/2] udp: Don't drop zero-length outbound UDP packets
Date: Tue, 13 Sep 2022 16:39:26 +1000
In-Reply-To: <20220909180659.7b6fe407@elisabeth>

On Fri, Sep 09, 2022 at 06:06:59PM +0200, Stefano Brivio wrote:
> On Fri, 9 Sep 2022 20:39:44 +1000
> David Gibson wrote:
>
> > On Fri, Sep 09, 2022 at 11:26:58AM +0200, Stefano Brivio wrote:
> > > On Fri, 9 Sep 2022 14:27:13 +1000
> > > David Gibson wrote:
> > >
> > > > udp_tap_handler() currently skips outbound packets if they have a payload
> > > > length of zero.  This is not correct, since in a datagram protocol zero
> > > > length packets still have meaning.
> > >
> > > Right, nice catch. As far as I can tell it's an issue I added with
> > > commit bb708111833e ("treewide: Packet abstraction with mandatory
> > > boundary checks").
> > >
> > > > Adjust this to correctly forward the zero-length packets by using a msghdr
> > > > with msg_iovlen == 0.
> > > >
> > > > Bugzilla: https://bugs.passt.top/show_bug.cgi?id=19
> > > >
> > > > Signed-off-by: David Gibson
> > > > ---
> > > >  udp.c | 10 +++++-----
> > > >  1 file changed, 5 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/udp.c b/udp.c
> > > > index c4ebecc..caa852a 100644
> > > > --- a/udp.c
> > > > +++ b/udp.c
> > > > @@ -1075,19 +1075,19 @@ int udp_tap_handler(struct ctx *c, int af, const void *addr,
> > > >  		uh_send = packet_get(p, i, 0, sizeof(*uh), &len);
> > > >  		if (!uh_send)
> > > >  			return p->count;
> > > > +
> > > > +		mm[i].msg_hdr.msg_name = sa;
> > > > +		mm[i].msg_hdr.msg_namelen = sl;
> > > > +		count++;
> > > > +
> > > >  		if (!len)
> > > >  			continue;
> > > >
> > > >  		m[i].iov_base = (char *)(uh_send + 1);
> > > >  		m[i].iov_len = len;
> > >
> > > I haven't tested this yet, but:
> > >
> > > - shouldn't iov_len be set to 0 (moving also this line before)? Note
> > >   that I'm not initialising m
> > >
> > > - shouldn't iov_base point to NULL to avoid noise from valgrind?
> >
> > No, because with this change m[i] is entirely unreferenced by mm[].
> >
> > > Also:
> > >
> > > >
> > > > -		mm[i].msg_hdr.msg_name = sa;
> > > > -		mm[i].msg_hdr.msg_namelen = sl;
> > > > -
> > > >  		mm[i].msg_hdr.msg_iov = m + i;
> > > >  		mm[i].msg_hdr.msg_iovlen = 1;
> > >
> > > ...I guess we should still go through those even if the size is zero,
> > > because we're appending a message. If we don't, I would expect some
> > > subsequent messages in the batch to be dropped (as many as zero sized
> > > packets we have).
> >
> > Here I'm relying on the fact that mm[] (unlike m[]) *is* initialized,
> > so if we don't alter it here, msg_iov is NULL and msg_iovlen is 0.
>
> > I was looking at removing that initialization, but I haven't gotten
> > that working yet.
>
> Oops, I see now.
>
> So, I suppose that if you want to drop that initialisation, you might
> need to zero msg_hdr.controllen as well.

Duh.  I completely failed to consider the other fields.  I actually
suspect msg_hdr.flags is the most vital one (without flags I don't
know if it will examine control or controllen).  But in any case I'm
initializing them all now and it's working.

> And msg_hdr.control too: other than keeping valgrind happy, not leaking
> random stuff to the kernel might make this marginally more secure.
>
> That should be better than the huge memset() at the beginning, because
> we're already writing to msg_iovlen anyway.
>
> If you already tried that, though, I don't have any other quick idea.
>
> By the way, I had a mechanism in place, just for TCP though, to avoid
> reassigning those pointers and also length descriptors.
>
> I got rid of it in commit 38fbfdbcb95d ("tcp: Get rid of iov with
> cached MSS, drop sendmmsg(), add deferred flush") because it didn't
> really help with throughput. I don't see any significant "userspace"
> overhead on guest-to-host TCP paths with perf(1).
>
> ...maybe for UDP that's different, I haven't focused that much on UDP
> performance.
>
> > > That is, I suppose we could just drop the continue statement on if
> > > (!len) above -- but, again, I haven't tested it.
> >
> > My first version actually did that, so it also works, but I think
> > setting msg_iovlen to 0 is a bit neater.
>
> Right. Maybe it was just me being thick, or perhaps that could use a
> comment:
>
> 		/* Zero-length packet: don't use any buffer, msg_iovlen is 0 */
> 		if (!len)
> 			continue;
>

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
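
For readers following the thread, here is a minimal standalone sketch of the mechanism under discussion: a message header with msg_iov left NULL and msg_iovlen left at 0 still produces a (zero-length) UDP datagram. This is not passt code. The helper name zero_len_roundtrip() is hypothetical, it uses plain sendmsg() rather than the batched sendmmsg() from the patch, and it assumes a Linux-like host with a working IPv4 loopback interface.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from passt): send one empty UDP datagram over
 * loopback using a msghdr with msg_iovlen == 0, then return the length the
 * receiver sees. Returns 0 if the empty datagram arrived, -1 on any error.
 */
ssize_t zero_len_roundtrip(void)
{
	struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
	struct sockaddr_in addr;
	socklen_t sl = sizeof(addr);
	struct msghdr mh;
	ssize_t n = -1;
	char buf[16];
	int rx, tx;

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to an ephemeral loopback port and read it back */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) ||
	    getsockname(rx, (struct sockaddr *)&addr, &sl))
		goto out;

	/* Don't block forever if the datagram never shows up */
	if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)))
		goto out;

	/* Zero every field (iov, control, flags), then set only the address:
	 * msg_iov stays NULL and msg_iovlen stays 0, so no buffer is used.
	 */
	memset(&mh, 0, sizeof(mh));
	mh.msg_name = &addr;
	mh.msg_namelen = sl;

	if (sendmsg(tx, &mh, 0) < 0)
		goto out;

	/* A zero-length datagram is delivered as a read of length 0 */
	n = recv(rx, buf, sizeof(buf), 0);
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return n;
}
```

Calling zero_len_roundtrip() from a small main() is expected to return 0 on such a host. Zeroing the whole header first matches the point made in the thread: the control and flags fields need defined values too, not just msg_iov and msg_iovlen.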