public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
* [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning
@ 2023-07-04  4:36 David Gibson
  2023-07-04 11:21 ` Stefano Brivio
  0 siblings, 1 reply; 4+ messages in thread
From: David Gibson @ 2023-07-04  4:36 UTC (permalink / raw)
  To: Stefano Brivio, passt-dev; +Cc: David Gibson

We don't handle defragmentation of IP packets coming from the tap side,
and we're unlikely to any time soon (with our large MTU, it's not useful
for practical use cases).  Currently, however, we simply ignore the
fragmentation flags and treat fragments as though they were whole IP
packets.  This isn't ideal and can lead to rather cryptic behaviour if we
do receive IP fragments.

Change the code to explicitly drop fragmented packets, and print a rate
limited warning if we do encounter them.

Bugzilla: https://bugs.passt.top/show_bug.cgi?id=62

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 tap.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tap.c b/tap.c
index e3235299..2e6939fa 100644
--- a/tap.c
+++ b/tap.c
@@ -62,6 +62,7 @@ static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
 static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
 
 #define TAP_SEQS		128 /* Different L4 tuples in one batch */
+#define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
 
 /**
  * tap_send() - Send frame, with qemu socket header if needed
@@ -543,6 +544,32 @@ static void tap_packet_debug(const struct iphdr *iph,
 	}
 }
 
+/**
+ * tap4_is_fragment() - Determine if a packet is an IP fragment
+ * @iph:	IPv4 header (length already validated)
+ * @now:	Current timestamp
+ *
+ * Return: true if iph is an IP fragment, false otherwise
+ */
+static bool tap4_is_fragment(const struct iphdr *iph,
+			     const struct timespec *now)
+{
+	if (iph->frag_off & ~IP_DF) {
+		/* Ratelimit messages */
+		static time_t last_message;
+		static unsigned num_dropped;
+
+		num_dropped++;
+		if (now->tv_sec - last_message > FRAGMENT_MSG_RATE) {
+			warn("Can't process IPv4 fragments (%lu dropped)", num_dropped);
+			last_message = now->tv_sec;
+			num_dropped = 0;
+		}
+		return true;
+	}
+	return false;
+}
+
 /**
  * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor
  * @c:		Execution context
@@ -591,6 +618,10 @@ resume:
 		    hlen > l3_len)
 			continue;
 
+		/* We don't handle IP fragments, drop them */
+		if (tap4_is_fragment(iph, now))
+			continue;
+
 		l4_len = l3_len - hlen;
 
 		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr) {
-- 
@@ -62,6 +62,7 @@ static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
 static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
 
 #define TAP_SEQS		128 /* Different L4 tuples in one batch */
+#define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
 
 /**
  * tap_send() - Send frame, with qemu socket header if needed
@@ -543,6 +544,32 @@ static void tap_packet_debug(const struct iphdr *iph,
 	}
 }
 
+/**
+ * tap4_is_fragment() - Determine if a packet is an IP fragment
+ * @iph:	IPv4 header (length already validated)
+ * @now:	Current timestamp
+ *
+ * Return: true if iph is an IP fragment, false otherwise
+ */
+static bool tap4_is_fragment(const struct iphdr *iph,
+			     const struct timespec *now)
+{
+	if (iph->frag_off & ~IP_DF) {
+		/* Ratelimit messages */
+		static time_t last_message;
+		static unsigned num_dropped;
+
+		num_dropped++;
+		if (now->tv_sec - last_message > FRAGMENT_MSG_RATE) {
+			warn("Can't process IPv4 fragments (%lu dropped)", num_dropped);
+			last_message = now->tv_sec;
+			num_dropped = 0;
+		}
+		return true;
+	}
+	return false;
+}
+
 /**
  * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor
  * @c:		Execution context
@@ -591,6 +618,10 @@ resume:
 		    hlen > l3_len)
 			continue;
 
+		/* We don't handle IP fragments, drop them */
+		if (tap4_is_fragment(iph, now))
+			continue;
+
 		l4_len = l3_len - hlen;
 
 		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr) {
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning
  2023-07-04  4:36 [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning David Gibson
@ 2023-07-04 11:21 ` Stefano Brivio
  2023-07-05  1:04   ` David Gibson
  0 siblings, 1 reply; 4+ messages in thread
From: Stefano Brivio @ 2023-07-04 11:21 UTC (permalink / raw)
  To: David Gibson; +Cc: passt-dev

On Tue,  4 Jul 2023 14:36:23 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> We don't handle defragmentation of IP packets coming from the tap side,
> and we're unlikely to any time soon (with our large MTU, it's not useful
> for practical use cases).  Currently, however, we simply ignore the
> fragmentation flags and treat fragments as though they were whole IP
> packets.  This isn't ideal and can lead to rather cryptic behaviour if we
> do receive IP fragments.
> 
> Change the code to explicitly drop fragmented packets, and print a rate
> limited warning if we do encounter them.
> 
> Bugzilla: https://bugs.passt.top/show_bug.cgi?id=62

By the way, I silently replaced those with "Link:" in the past, just in
case we want to automate something around it one day, to avoid
differences between references to different bug trackers.

Once upon a time, I wrote some scripting to automatically link HTML
reports with (Linux kernel) commits to bug trackers, and it was quite
painful to discover all possible spellings of "Bugzilla" plus a few
others, hence my thought. But let me know if something speaks against
this.

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  tap.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/tap.c b/tap.c
> index e3235299..2e6939fa 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -62,6 +62,7 @@ static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
>  static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
>  
>  #define TAP_SEQS		128 /* Different L4 tuples in one batch */
> +#define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
>  
>  /**
>   * tap_send() - Send frame, with qemu socket header if needed
> @@ -543,6 +544,32 @@ static void tap_packet_debug(const struct iphdr *iph,
>  	}
>  }
>  
> +/**
> + * tap4_is_fragment() - Determine if a packet is an IP fragment

This is actually independent from the "tap" "side", it could also be
e.g. ipv4_is_fragment(), in util.c. Not a strong preference though, I
guess we can also merge it as it is.

> + * @iph:	IPv4 header (length already validated)
> + * @now:	Current timestamp
> + *
> + * Return: true if iph is an IP fragment, false otherwise
> + */
> +static bool tap4_is_fragment(const struct iphdr *iph,
> +			     const struct timespec *now)
> +{
> +	if (iph->frag_off & ~IP_DF) {
> +		/* Ratelimit messages */
> +		static time_t last_message;
> +		static unsigned num_dropped;
> +
> +		num_dropped++;
> +		if (now->tv_sec - last_message > FRAGMENT_MSG_RATE) {
> +			warn("Can't process IPv4 fragments (%lu dropped)", num_dropped);
> +			last_message = now->tv_sec;
> +			num_dropped = 0;
> +		}
> +		return true;
> +	}
> +	return false;
> +}
> +
>  /**
>   * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor
>   * @c:		Execution context
> @@ -591,6 +618,10 @@ resume:
>  		    hlen > l3_len)
>  			continue;
>  
> +		/* We don't handle IP fragments, drop them */
> +		if (tap4_is_fragment(iph, now))
> +			continue;
> +
>  		l4_len = l3_len - hlen;
>  
>  		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr) {

-- 
Stefano


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning
  2023-07-04 11:21 ` Stefano Brivio
@ 2023-07-05  1:04   ` David Gibson
  2023-07-05  4:20     ` David Gibson
  0 siblings, 1 reply; 4+ messages in thread
From: David Gibson @ 2023-07-05  1:04 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 3946 bytes --]

On Tue, Jul 04, 2023 at 01:21:04PM +0200, Stefano Brivio wrote:
> On Tue,  4 Jul 2023 14:36:23 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > We don't handle defragmentation of IP packets coming from the tap side,
> > and we're unlikely to any time soon (with our large MTU, it's not useful
> > for practical use cases).  Currently, however, we simply ignore the
> > fragmentation flags and treat fragments as though they were whole IP
> > packets.  This isn't ideal and can lead to rather cryptic behaviour if we
> > do receive IP fragments.
> > 
> > Change the code to explicitly drop fragmented packets, and print a rate
> > limited warning if we do encounter them.
> > 
> > Bugzilla: https://bugs.passt.top/show_bug.cgi?id=62
> 
> By the way, I silently replaced those with "Link:" in the past, just in
> case we want to automate something around it one day, to avoid
> differences between references to different bug trackers.

Oh, ok.  I'll keep that in mind.

> Once upon a time, I wrote some scripting to automatically link HTML
> reports with (Linux kernel) commits to bug trackers, and it was quite
> painful to discover all possible spellings of "Bugzilla" plus a few
> others, hence my thought. But let me know if something speaks against
> this.

No, that makes sense.

> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  tap.c | 31 +++++++++++++++++++++++++++++++
> >  1 file changed, 31 insertions(+)
> > 
> > diff --git a/tap.c b/tap.c
> > index e3235299..2e6939fa 100644
> > --- a/tap.c
> > +++ b/tap.c
> > @@ -62,6 +62,7 @@ static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
> >  static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
> >  
> >  #define TAP_SEQS		128 /* Different L4 tuples in one batch */
> > +#define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
> >  
> >  /**
> >   * tap_send() - Send frame, with qemu socket header if needed
> > @@ -543,6 +544,32 @@ static void tap_packet_debug(const struct iphdr *iph,
> >  	}
> >  }
> >  
> > +/**
> > + * tap4_is_fragment() - Determine if a packet is an IP fragment
> 
> This is actually independent from the "tap" "side", it could also be
> e.g. ipv4_is_fragment(), in util.c. Not a strong preference though, I
> guess we can also merge it as it is.

Well.. the detection of fragments is independent, but the warning
message is specific to tap.  I'm inclined to leave it as is, at least
until we have a need for this logic somewhere else, at which point we
can refactor.

> 
> > + * @iph:	IPv4 header (length already validated)
> > + * @now:	Current timestamp
> > + *
> > + * Return: true if iph is an IP fragment, false otherwise
> > + */
> > +static bool tap4_is_fragment(const struct iphdr *iph,
> > +			     const struct timespec *now)
> > +{
> > +	if (iph->frag_off & ~IP_DF) {
> > +		/* Ratelimit messages */
> > +		static time_t last_message;
> > +		static unsigned num_dropped;
> > +
> > +		num_dropped++;
> > +		if (now->tv_sec - last_message > FRAGMENT_MSG_RATE) {
> > +			warn("Can't process IPv4 fragments (%lu dropped)", num_dropped);
> > +			last_message = now->tv_sec;
> > +			num_dropped = 0;
> > +		}
> > +		return true;
> > +	}
> > +	return false;
> > +}
> > +
> >  /**
> >   * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor
> >   * @c:		Execution context
> > @@ -591,6 +618,10 @@ resume:
> >  		    hlen > l3_len)
> >  			continue;
> >  
> > +		/* We don't handle IP fragments, drop them */
> > +		if (tap4_is_fragment(iph, now))
> > +			continue;
> > +
> >  		l4_len = l3_len - hlen;
> >  
> >  		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr) {
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning
  2023-07-05  1:04   ` David Gibson
@ 2023-07-05  4:20     ` David Gibson
  0 siblings, 0 replies; 4+ messages in thread
From: David Gibson @ 2023-07-05  4:20 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-dev

[-- Attachment #1: Type: text/plain, Size: 4334 bytes --]

Do not apply this version of the patch: I just discovered it's
entirely breaking ping.  Investigating...


On Wed, Jul 05, 2023 at 11:04:27AM +1000, David Gibson wrote:
> On Tue, Jul 04, 2023 at 01:21:04PM +0200, Stefano Brivio wrote:
> > On Tue,  4 Jul 2023 14:36:23 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > We don't handle defragmentation of IP packets coming from the tap side,
> > > and we're unlikely to any time soon (with our large MTU, it's not useful
> > > for practical use cases).  Currently, however, we simply ignore the
> > > fragmentation flags and treat fragments as though they were whole IP
> > > packets.  This isn't ideal and can lead to rather cryptic behaviour if we
> > > do receive IP fragments.
> > > 
> > > Change the code to explicitly drop fragmented packets, and print a rate
> > > limited warning if we do encounter them.
> > > 
> > > Bugzilla: https://bugs.passt.top/show_bug.cgi?id=62
> > 
> > By the way, I silently replaced those with "Link:" in the past, just in
> > case we want to automate something around it one day, to avoid
> > differences between references to different bug trackers.
> 
> Oh, ok.  I'll keep that in mind.
> 
> > Once upon a time, I wrote some scripting to automatically link HTML
> > reports with (Linux kernel) commits to bug trackers, and it was quite
> > painful to discover all possible spellings of "Bugzilla" plus a few
> > others, hence my thought. But let me know if something speaks against
> > this.
> 
> No, that makes sense.
> 
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > >  tap.c | 31 +++++++++++++++++++++++++++++++
> > >  1 file changed, 31 insertions(+)
> > > 
> > > diff --git a/tap.c b/tap.c
> > > index e3235299..2e6939fa 100644
> > > --- a/tap.c
> > > +++ b/tap.c
> > > @@ -62,6 +62,7 @@ static PACKET_POOL_NOINIT(pool_tap4, TAP_MSGS, pkt_buf);
> > >  static PACKET_POOL_NOINIT(pool_tap6, TAP_MSGS, pkt_buf);
> > >  
> > >  #define TAP_SEQS		128 /* Different L4 tuples in one batch */
> > > +#define FRAGMENT_MSG_RATE	10  /* # seconds between fragment warnings */
> > >  
> > >  /**
> > >   * tap_send() - Send frame, with qemu socket header if needed
> > > @@ -543,6 +544,32 @@ static void tap_packet_debug(const struct iphdr *iph,
> > >  	}
> > >  }
> > >  
> > > +/**
> > > + * tap4_is_fragment() - Determine if a packet is an IP fragment
> > 
> > This is actually independent from the "tap" "side", it could also be
> > e.g. ipv4_is_fragment(), in util.c. Not a strong preference though, I
> > guess we can also merge it as it is.
> 
> Well.. the detection of fragments is independent, but the warning
> message is specific to tap.  I'm inclined to leave it as is, at least
> until we have a need for this logic somewhere else, at which point we
> can refactor.
> 
> > 
> > > + * @iph:	IPv4 header (length already validated)
> > > + * @now:	Current timestamp
> > > + *
> > > + * Return: true if iph is an IP fragment, false otherwise
> > > + */
> > > +static bool tap4_is_fragment(const struct iphdr *iph,
> > > +			     const struct timespec *now)
> > > +{
> > > +	if (iph->frag_off & ~IP_DF) {
> > > +		/* Ratelimit messages */
> > > +		static time_t last_message;
> > > +		static unsigned num_dropped;
> > > +
> > > +		num_dropped++;
> > > +		if (now->tv_sec - last_message > FRAGMENT_MSG_RATE) {
> > > +			warn("Can't process IPv4 fragments (%lu dropped)", num_dropped);
> > > +			last_message = now->tv_sec;
> > > +			num_dropped = 0;
> > > +		}
> > > +		return true;
> > > +	}
> > > +	return false;
> > > +}
> > > +
> > >  /**
> > >   * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor
> > >   * @c:		Execution context
> > > @@ -591,6 +618,10 @@ resume:
> > >  		    hlen > l3_len)
> > >  			continue;
> > >  
> > > +		/* We don't handle IP fragments, drop them */
> > > +		if (tap4_is_fragment(iph, now))
> > > +			continue;
> > > +
> > >  		l4_len = l3_len - hlen;
> > >  
> > >  		if (iph->saddr && c->ip4.addr_seen.s_addr != iph->saddr) {
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-05  4:35 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-04  4:36 [PATCH] tap: Explicitly drop IPv4 fragments, and give a warning David Gibson
2023-07-04 11:21 ` Stefano Brivio
2023-07-05  1:04   ` David Gibson
2023-07-05  4:20     ` David Gibson

Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).