From: David Gibson <david@gibson.dropbear.id.au>
To: Stefano Brivio <sbrivio@redhat.com>, passt-dev@passt.top
Cc: Jon Maloy <jmaloy@redhat.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH 4/7] udp: Deal with errors as we go in udp_sock_fwd()
Date: Tue, 15 Apr 2025 17:16:21 +1000 [thread overview]
Message-ID: <20250415071624.2618589-5-david@gibson.dropbear.id.au> (raw)
In-Reply-To: <20250415071624.2618589-1-david@gibson.dropbear.id.au>
When we get an epoll event on a listening socket, we first deal with any
errors (udp_sock_errs()), then with any received packets (udp_sock_fwd()).
However, it's theoretically possible that new errors could get flagged on
the socket after we call udp_sock_errs(), in which case we could get errors
returned in in udp_sock_fwd() -> udp_peek_addr() -> recvmsg().
In fact, we do deal with this correctly, although the path is somewhat
non-obvious. The recvmsg() error will cause us to bail out of
udp_sock_fwd(), but the EPOLLERR event will now be flagged, so we'll come
back here next epoll loop and call udp_sock_errs().
Except.. we call udp_sock_fwd() from udp_flush_flow() as well as from
epoll events. This is to deal with any packets that arrived between bind()
and connect(), and so might not be associated with the socket's intended
flow. This expects udp_sock_fwd() to flush _all_ queued datagrams, so that
anything received later must be for the correct flow.
At the moment, udp_sock_errs() might fail to flush all datagrams if errors
occur. In particular this can happen in practice for locally reported
errors which occur immediately after connect() (e.g. connecting to a local
port with nothing listening).
We can deal with the problem case, and also make the flow a little more
natural for the common case by having udp_sock_fwd() call udp_sock_errs()
to handle errors as the occur, rather than trying to deal with all errors
in advance.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
udp.c | 45 ++++++++++++++++++++++++++-------------------
1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/udp.c b/udp.c
index c51ac955..0bec499d 100644
--- a/udp.c
+++ b/udp.c
@@ -601,7 +601,7 @@ static int udp_sock_errs(const struct ctx *c, int s, flow_sidx_t sidx)
* @src: Socket address (output)
* @dst: (Local) destination address (output)
*
- * Return: 0 on success, -1 otherwise
+ * Return: 0 if no more packets, 1 on success, -ve error code on error
*/
static int udp_peek_addr(int s, union sockaddr_inany *src,
union inany_addr *dst)
@@ -619,9 +619,9 @@ static int udp_peek_addr(int s, union sockaddr_inany *src,
rc = recvmsg(s, &msg, MSG_PEEK | MSG_DONTWAIT);
if (rc < 0) {
- trace("Error peeking at socket address: %s", strerror_(errno));
- /* Bail out and let the EPOLLERR handler deal with it */
- return rc;
+ if (errno == EAGAIN || errno == EWOULDBLOCK)
+ return 0;
+ return -errno;
}
hdr = CMSG_FIRSTHDR(&msg);
@@ -644,7 +644,7 @@ static int udp_peek_addr(int s, union sockaddr_inany *src,
sockaddr_ntop(src, sastr, sizeof(sastr)),
inany_ntop(dst, dstr, sizeof(dstr)));
- return 0;
+ return 1;
}
/**
@@ -740,11 +740,27 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
{
union sockaddr_inany src;
union inany_addr dst;
+ int rc;
- while (udp_peek_addr(s, &src, &dst) == 0) {
- flow_sidx_t tosidx = udp_flow_from_sock(c, frompif,
- &dst, port, &src, now);
- uint8_t topif = pif_at_sidx(tosidx);
+ while ((rc = udp_peek_addr(s, &src, &dst)) != 0) {
+ flow_sidx_t tosidx;
+ uint8_t topif;
+
+ if (rc < 0) {
+ trace("Error peeking at socket address: %s",
+ strerror_(-rc));
+ /* Clear errors & carry on */
+ if (udp_sock_errs(c, s, FLOW_SIDX_NONE) < 0) {
+ err(
+"UDP: Unrecoverable error on listening socket: (%s port %hu)",
+ pif_name(frompif), port);
+ /* FIXME: what now? close/re-open socket? */
+ }
+ continue;
+ }
+
+ tosidx = udp_flow_from_sock(c, frompif, &dst, port, &src, now);
+ topif = pif_at_sidx(tosidx);
if (pif_is_socket(topif)) {
udp_sock_to_sock(c, s, 1, tosidx);
@@ -776,16 +792,7 @@ void udp_listen_sock_handler(const struct ctx *c,
union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
- if (events & EPOLLERR) {
- if (udp_sock_errs(c, ref.fd, FLOW_SIDX_NONE) < 0) {
- err("UDP: Unrecoverable error on listening socket:"
- " (%s port %hu)", pif_name(ref.udp.pif), ref.udp.port);
- /* FIXME: what now? close/re-open socket? */
- return;
- }
- }
-
- if (events & EPOLLIN)
+ if (events & (EPOLLERR | EPOLLIN))
udp_sock_fwd(c, ref.fd, ref.udp.pif, ref.udp.port, now);
}
--
@@ -601,7 +601,7 @@ static int udp_sock_errs(const struct ctx *c, int s, flow_sidx_t sidx)
* @src: Socket address (output)
* @dst: (Local) destination address (output)
*
- * Return: 0 on success, -1 otherwise
+ * Return: 0 if no more packets, 1 on success, -ve error code on error
*/
static int udp_peek_addr(int s, union sockaddr_inany *src,
union inany_addr *dst)
@@ -619,9 +619,9 @@ static int udp_peek_addr(int s, union sockaddr_inany *src,
rc = recvmsg(s, &msg, MSG_PEEK | MSG_DONTWAIT);
if (rc < 0) {
- trace("Error peeking at socket address: %s", strerror_(errno));
- /* Bail out and let the EPOLLERR handler deal with it */
- return rc;
+ if (errno == EAGAIN || errno == EWOULDBLOCK)
+ return 0;
+ return -errno;
}
hdr = CMSG_FIRSTHDR(&msg);
@@ -644,7 +644,7 @@ static int udp_peek_addr(int s, union sockaddr_inany *src,
sockaddr_ntop(src, sastr, sizeof(sastr)),
inany_ntop(dst, dstr, sizeof(dstr)));
- return 0;
+ return 1;
}
/**
@@ -740,11 +740,27 @@ void udp_sock_fwd(const struct ctx *c, int s, uint8_t frompif,
{
union sockaddr_inany src;
union inany_addr dst;
+ int rc;
- while (udp_peek_addr(s, &src, &dst) == 0) {
- flow_sidx_t tosidx = udp_flow_from_sock(c, frompif,
- &dst, port, &src, now);
- uint8_t topif = pif_at_sidx(tosidx);
+ while ((rc = udp_peek_addr(s, &src, &dst)) != 0) {
+ flow_sidx_t tosidx;
+ uint8_t topif;
+
+ if (rc < 0) {
+ trace("Error peeking at socket address: %s",
+ strerror_(-rc));
+ /* Clear errors & carry on */
+ if (udp_sock_errs(c, s, FLOW_SIDX_NONE) < 0) {
+ err(
+"UDP: Unrecoverable error on listening socket: (%s port %hu)",
+ pif_name(frompif), port);
+ /* FIXME: what now? close/re-open socket? */
+ }
+ continue;
+ }
+
+ tosidx = udp_flow_from_sock(c, frompif, &dst, port, &src, now);
+ topif = pif_at_sidx(tosidx);
if (pif_is_socket(topif)) {
udp_sock_to_sock(c, s, 1, tosidx);
@@ -776,16 +792,7 @@ void udp_listen_sock_handler(const struct ctx *c,
union epoll_ref ref, uint32_t events,
const struct timespec *now)
{
- if (events & EPOLLERR) {
- if (udp_sock_errs(c, ref.fd, FLOW_SIDX_NONE) < 0) {
- err("UDP: Unrecoverable error on listening socket:"
- " (%s port %hu)", pif_name(ref.udp.pif), ref.udp.port);
- /* FIXME: what now? close/re-open socket? */
- return;
- }
- }
-
- if (events & EPOLLIN)
+ if (events & (EPOLLERR | EPOLLIN))
udp_sock_fwd(c, ref.fd, ref.udp.pif, ref.udp.port, now);
}
--
2.49.0
next prev parent reply other threads:[~2025-04-15 7:16 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-15 7:16 [PATCH 0/7] Assorted fixes for UDP socket and error handling problems David Gibson
2025-04-15 7:16 ` [PATCH 1/7] udp: Fix breakage of UDP error handling by PKTINFO support David Gibson
2025-04-15 7:16 ` [PATCH 2/7] udp: Be quieter about errors on UDP receive David Gibson
2025-04-15 7:16 ` [PATCH 3/7] udp: Pass socket & flow information direction to error handling functions David Gibson
2025-04-15 7:16 ` David Gibson [this message]
2025-04-15 7:16 ` [PATCH 5/7] udp: Add udp_pktinfo() helper David Gibson
2025-04-15 18:54 ` Stefano Brivio
2025-04-16 0:37 ` David Gibson
2025-04-15 7:16 ` [PATCH 6/7] udp: Minor re-organisation of udp_sock_recverr() David Gibson
2025-04-15 7:16 ` [PATCH 7/7] udp: Propagate errors on listening and brand new sockets David Gibson
2025-04-15 18:54 ` Stefano Brivio
2025-04-16 0:38 ` David Gibson
2025-04-15 19:10 ` [PATCH 0/7] Assorted fixes for UDP socket and error handling problems Stefano Brivio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250415071624.2618589-5-david@gibson.dropbear.id.au \
--to=david@gibson.dropbear.id.au \
--cc=jmaloy@redhat.com \
--cc=passt-dev@passt.top \
--cc=sbrivio@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://passt.top/passt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).