From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202606 header.b=eLz51vdk; dkim-atps=neutral Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 83D9F5A0262 for ; Wed, 17 Jun 2026 05:08:56 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202606; t=1781665732; bh=o5iitIGrsvV56ovcNF7x8skT4beK0CYISqzm9nF0U1U=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=eLz51vdk4k5YcqgLBI7gOoS0DlQE7Hih8PIzOFu5wdVHDrisZYSr3IVJ2SbKLomQ3 yaGBptMbwVia5x8hKIMqOLVaUjPn3AD9cLY9Kt05RBzWTpA8p7e53Ig7Unufpmf977 JloTbgOcybg7fEXYG384TPKjrknoA+A5kXQQ+I7SgbFhhjbS0BMxu+5d6BOsDzsTV6 cwi3S+hTbaCAb6YigHSHgyLn5bsnEvivt0vZDyZqKktgGw/veb2mvRlRjRx3QUwUTo PLEMyA7mf1IQF/TD51gblV2qytbZQWXv4w8Uz5VM1AdmP2RJmpozeBh3/L/T02Lcfe tw3ejNeeFT6LA== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4gg82X6yJgz58fF; Wed, 17 Jun 2026 13:08:52 +1000 (AEST) Date: Wed, 17 Jun 2026 13:08:45 +1000 From: David Gibson To: Stefano Brivio Subject: Re: [PATCH 4/4] flow, treewide: Promote priority of selected flow-linked messages Message-ID: References: <20260609023226.86058-1-david@gibson.dropbear.id.au> <20260609023226.86058-5-david@gibson.dropbear.id.au> <20260617010924.1b40c402@elisabeth> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="Ecpl5C8AqjwfwxKo" Content-Disposition: inline In-Reply-To: <20260617010924.1b40c402@elisabeth> Message-ID-Hash: O26Q7QDR2LBCUSLKYMI6SYN3QK4ZT72H X-Message-ID-Hash: O26Q7QDR2LBCUSLKYMI6SYN3QK4ZT72H X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top, Anshu Kumari X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --Ecpl5C8AqjwfwxKo Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 17, 2026 at 01:09:25AM +0200, Stefano Brivio wrote: > On Tue, 9 Jun 2026 12:32:26 +1000 > David Gibson wrote: >=20 > > Most of out flow specific log messages are debug level for fear of floo= ding > > the logs, even when they report real error conditions that might be off >=20 > of >=20 > > significance. > >=20 > > Now that we have the mechanisms for log message rate limiting, we can do > > better. Promote many flow related messages to warning or error level, = with > > rate limiting. While we're there add ratelimiting to a handful of exis= ting > > warning or error level messages. >=20 > (Cc'ing Anshu) >=20 > > They general heuristic is to promote messages that report a failure whi= ch >=20 > The >=20 > > is not something that should be triggered by the guest doing something > > weird. This mostly means failures from socket operations we expect to = be > > legitimate. >=20 > It all makes sense to me, just two nits below: >=20 > > Adding the ratelimiting means plumbing the 'now' timestamp through much > > more of the code, hence the large churn. > >=20 > > Signed-off-by: David Gibson [snip] > > @@ -593,7 +605,8 @@ void tcp_splice_sock_handler(struct ctx *c, union e= poll_ref ref, > > =20 > > rc =3D getsockopt(ref.fd, SOL_SOCKET, SO_ERROR, &err, &sl); > > if (rc) > > - flow_perror(conn, "Error retrieving SO_ERROR"); > > + flow_perror_ratelimit(conn, now, > > + "Error retrieving SO_ERROR"); >=20 > Shouldn't this be flow_warn_ratelimit(), more or less in line with what > udp_sock_recverr() does, That wouldn't be in line with udp_sock_recverr(). udp_sock_recverr() uses a warn() when it can't propagate a specific error via ICMP. If it fails to read the errors at all, that's an err_perror(). (Should be flow_err_ratelimit(), I'll add a patch fixing that). > because we're not aborting any operation or > any connection, we're simply failing to read out errors which might not > exist for whatever reason? The severity here isn't so much because of the consequences of the error, but because failing to even read the error state is a very unexpected condition. The kernel should generally let us do that. So, this is something going wrong strictly on the host, rather than anything triggered by the guest. In that sense it's _more_ serious than the flow being killed because the guest tried to contact a peer that didn't exist or whatever. [snip] > > @@ -92,8 +95,9 @@ static int udp_flow_sock(const struct ctx *c, > > epoll_del(flow_epollfd(&uflow->f), s); > > close(s); > > =20 > > - flow_dbg(uflow, "Couldn't connect flow socket: %s", > > - strerror_(-rc)); > > + flow_warn_ratelimit(uflow, now, > > + "Couldn't connect flow socket: %s", > > + strerror_(-rc)); >=20 > This should eventually cause an error message in udp_sock_fwd(), but > regardless of that, for consistency, as we're actually throwing packets > away in this case, shouldn't this be flow_perror_ratelimit()? I'd tend towards no, because this can at least partly be triggered by the guest. We were getting this very error with an EADDRNOTAVAIL in https://github.com/podman-container-tools/podman/issues/23739 because the guest was trying to do a multicast that the host could find a source address for. Not one of the listed reasons for EADDRNOTAVAIL in connect(2), but there you go. I suspect you'll also get an error here if the guest attempts to connect to an address that's not routable from the host. >=20 > > return rc; > > } > > uflow->s[sidei] =3D s; > > @@ -154,7 +158,7 @@ static flow_sidx_t udp_flow_new(const struct ctx *c= , union flow *flow, > > =20 > > flow_foreach_sidei(sidei) { > > if (pif_is_socket(uflow->f.pif[sidei])) > > - if (udp_flow_sock(c, uflow, sidei) < 0) > > + if (udp_flow_sock(c, uflow, sidei, now) < 0) > > goto cancel; > > } > > =20 > > @@ -176,7 +180,7 @@ static flow_sidx_t udp_flow_new(const struct ctx *c= , union flow *flow, > > goto cancel; > > } > > if (port !=3D tgt->oport) { > > - flow_err(uflow, "Unexpected local port"); > > + flow_err_ratelimit(uflow, now, "Unexpected local port"); > > goto cancel; > > } > > } > > @@ -248,7 +252,8 @@ flow_sidx_t udp_flow_from_sock(const struct ctx *c,= uint8_t pif, > > * been initiated from a socket bound to 0.0.0.0 or ::, we don't > > * know our address, so we have to leave it unpopulated. > > */ > > - flow_err(flow, "Invalid endpoint on UDP recvfrom()"); > > + flow_err_ratelimit(flow, now, > > + "Invalid endpoint on UDP recvfrom()"); > > flow_alloc_cancel(flow); > > return FLOW_SIDX_NONE; > > } >=20 > --=20 > Stefano >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --Ecpl5C8AqjwfwxKo Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmoyD7AACgkQzQJF27ox 2GeSpg/+PditmNgTMO4yQQ+6iiZGi3UxRw0AIt2NvL0CG5hFngdyY2RuvsgwiYsR uxl8YAEFvRMBTqWEKk+28fuWBeqWvT+73GlObpVxS5LVFHM8xLT+R2lh6BuNTxjr ojqlrMtQruo0V1CAJbAdhqgUDAaaha+cBofBApRnzyweCqLbo30uHX+vbZcSvdq9 BsF+b+HZusmYY1mhbJsLlS3mmWIvAtvaBj7DXzeQ+eL5H2OhJHp8uAdBcOhMnJ9E S/qOcYht6UyiNgatUf3n6hgOSH2EYPlNXAeDemMW0Mc2Sy9K3VZy3N7Vy+QUKPlP Z0Oq87KY3bUTtb1XOqFgUEMup9auqgmLYPYW8hoXS68QJiLLgWaOnGJvnloMbv5q J62+1ApQkroX/ZKFbDvlRhEWV7tg2G4Xhpx87tSCR8bwNwNA/TAyqpaQVtuXOCwL //vBLUkg9V6Z7jiJwgvzWDoV2nmvL1pASrT/tY8SF/qzEkBG3HdjrsAb7FcB2Mv4 M57EOYfTw8+F+kYgbAdMtoUDNR3P82f7z8OkHaylo9g+N3pqNpLA1lnMLXWdumVE C2kAs7ohTNw38ju61WtwKzmd8tS9W/l1PkXTi3Me4vgLQ7QP6pRsZs/5ISOF3ypq x+VlU4gFTOarkAipmGvdXk1s+QJV9LdZXGLe3+PKMb3S7ek9T48= =8TS7 -----END PGP SIGNATURE----- --Ecpl5C8AqjwfwxKo--