public inbox for passt-user@passt.top
 help / color / mirror / Atom feed
* Failed to open tun socket in namespace
@ 2023-03-26 12:42 Gianluca Stivan
  2023-03-27  9:47 ` Stefano Brivio
  0 siblings, 1 reply; 5+ messages in thread
From: Gianluca Stivan @ 2023-03-26 12:42 UTC (permalink / raw)
  To: passt-user

Hi,
Thank you so much for this project!
I'm running a rootless container using pasta. Everything works great but after a while (anything from minutes to hours) I see the error "Failed to open tun socket in namespace" in my systemd logs, and my container loses access to the internet.
Do you happen to have any idea how I would go about debugging this?
Thanks a lot in advance :)

PS: I'm running pasta from podman with arguments --map-gw,--dns,<my-dns>,-i,wlan0,--outbound-if4,wireguard0,--outbound-if6,wireguard0,--config-net

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Failed to open tun socket in namespace
  2023-03-26 12:42 Failed to open tun socket in namespace Gianluca Stivan
@ 2023-03-27  9:47 ` Stefano Brivio
  2023-03-27 12:22   ` Gianluca Stivan
  0 siblings, 1 reply; 5+ messages in thread
From: Stefano Brivio @ 2023-03-27  9:47 UTC (permalink / raw)
  To: Gianluca Stivan; +Cc: passt-user

Hi Gianluca,

On Sun, 26 Mar 2023 13:42:19 +0100
"Gianluca Stivan" <me@yawnt.com> wrote:

> Hi,
> Thank you so much for this project!

And thanks for testing it!

> I'm running a rootless container using pasta. Everything works great
> but after a while (anything from minutes to hours) I see the error
> "Failed to open tun socket in namespace" in my systemd logs, and my
> container loses access to the internet.

That happens when pasta detects an issue with the tap file descriptor
(error or closure reported by the kernel, or failed read), so it closes
it and tries to open it again.

There are two issues actually: 1. the problem with the tap device
causing this and 2. the fact that pasta fails to close and reopen the
file descriptor.

> Do you happen to have any
> idea how I would go about debugging this? Thanks a lot in advance :)
> 
> PS: I'm running pasta from podman with arguments
> --map-gw,--dns,<my-dns>,-i,wlan0,--outbound-if4,wireguard0,--outbound-if6,wireguard0,--config-net

Let's try to debug issue 1. first: can you pass -d (--debug) and -l
(--log-file) here, then check the lines at the end, when pasta
terminates? You might want to play with --log-size too if there's a lot
of activity.

-- 
Stefano


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Failed to open tun socket in namespace
  2023-03-27  9:47 ` Stefano Brivio
@ 2023-03-27 12:22   ` Gianluca Stivan
  2023-03-27 13:02     ` Stefano Brivio
  0 siblings, 1 reply; 5+ messages in thread
From: Gianluca Stivan @ 2023-03-27 12:22 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-user

Thanks for the reply!

Here is the ending of the log file:

...
<a lot a lot of STALLED and dropped...>
3944.3562:          TCP: index 19, timer expires in 2.000s
3944.3563:          TCP: index 19, timer expires in 2.000s
3944.3566:          TCP: index 19: STALLED
3944.3578:          TCP: index 7: STALLED dropped
3944.3579:          TCP: index 7, timer expires in 2.000s
3944.3584:          TCP: index 7: STALLED
3944.3586:          TCP: index 7: STALLED dropped
3944.3587:          TCP: index 7: ACK_FROM_TAP_DUE dropped
3944.3588:          TCP: index 7: ACK_FROM_TAP_DUE
3944.3588:          TCP: index 7, timer expires in 2.000s
3944.3589:          TCP: index 12: STALLED dropped
3944.3590:          TCP: index 12, timer expires in 2.000s
3944.3620:          tap write: No buffer space available
3944.3620:          TCP: index 17: STALLED dropped
3944.3620:          TCP: index 17: ACK_FROM_TAP_DUE dropped
3944.4420: ERROR:   Failed to open tun socket in namespace

No buffer space available seems interesting?

Thanks!

PS: let me know if you'd like the full file!

On Mon, Mar 27, 2023, at 10:47, Stefano Brivio wrote:
> Hi Gianluca,
>
> On Sun, 26 Mar 2023 13:42:19 +0100
> "Gianluca Stivan" <me@yawnt.com> wrote:
>
>> Hi,
>> Thank you so much for this project!
>
> And thanks for testing it!
>
>> I'm running a rootless container using pasta. Everything works great
>> but after a while (anything from minutes to hours) I see the error
>> "Failed to open tun socket in namespace" in my systemd logs, and my
>> container loses access to the internet.
>
> That happens when pasta detects an issue with the tap file descriptor
> (error or closure reported by the kernel, or failed read), so it closes
> it and tries to open it again.
>
> There are two issues actually: 1. the problem with the tap device
> causing this and 2. the fact that pasta fails to close and reopen the
> file descriptor.
>
>> Do you happen to have any
>> idea how I would go about debugging this? Thanks a lot in advance :)
>> 
>> PS: I'm running pasta from podman with arguments
>> --map-gw,--dns,<my-dns>,-i,wlan0,--outbound-if4,wireguard0,--outbound-if6,wireguard0,--config-net
>
> Let's try to debug issue 1. first: can you pass -d (--debug) and -l
> (--log-file) here, then check the lines at the end, when pasta
> terminates? You might want to play with --log-size too if there's a lot
> of activity.
>
> -- 
> Stefano

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Failed to open tun socket in namespace
  2023-03-27 12:22   ` Gianluca Stivan
@ 2023-03-27 13:02     ` Stefano Brivio
  2023-03-27 13:16       ` Gianluca Stivan
  0 siblings, 1 reply; 5+ messages in thread
From: Stefano Brivio @ 2023-03-27 13:02 UTC (permalink / raw)
  To: Gianluca Stivan; +Cc: passt-user

On Mon, 27 Mar 2023 13:22:12 +0100
"Gianluca Stivan" <me@yawnt.com> wrote:

> Thanks for the reply!
> 
> Here is the ending of the log file:
> 
> ...
> <a lot a lot of STALLED and dropped...>
> 3944.3562:          TCP: index 19, timer expires in 2.000s
> 3944.3563:          TCP: index 19, timer expires in 2.000s
> 3944.3566:          TCP: index 19: STALLED
> 3944.3578:          TCP: index 7: STALLED dropped
> 3944.3579:          TCP: index 7, timer expires in 2.000s
> 3944.3584:          TCP: index 7: STALLED
> 3944.3586:          TCP: index 7: STALLED dropped
> 3944.3587:          TCP: index 7: ACK_FROM_TAP_DUE dropped
> 3944.3588:          TCP: index 7: ACK_FROM_TAP_DUE
> 3944.3588:          TCP: index 7, timer expires in 2.000s
> 3944.3589:          TCP: index 12: STALLED dropped
> 3944.3590:          TCP: index 12, timer expires in 2.000s
> 3944.3620:          tap write: No buffer space available
> 3944.3620:          TCP: index 17: STALLED dropped
> 3944.3620:          TCP: index 17: ACK_FROM_TAP_DUE dropped
> 3944.4420: ERROR:   Failed to open tun socket in namespace
> 
> No buffer space available seems interesting?

Interesting, I had never seen ENOBUFS from a tap file descriptor. Let me
find out where it might come from (and how pasta should react to it),
it might take a bit. What kernel version are you using by the way?

I mean, maybe we just have to try again (as if we received EAGAIN):

diff --git a/tap.c b/tap.c
index 68ef480..21a786d 100644
--- a/tap.c
+++ b/tap.c
@@ -323,7 +323,8 @@ static size_t tap_send_frames_pasta(struct ctx *c,
                if (write(c->fd_tap, (char *)iov[i].iov_base,
                          iov[i].iov_len) < 0) {
                        debug("tap write: %s", strerror(errno));
-                       if (errno != EAGAIN && errno != EWOULDBLOCK)
+                       if (errno != EAGAIN && errno != EWOULDBLOCK &&
+                           errno != ENOBUFS)
                                tap_handler(c, c->fd_tap, EPOLLERR, NULL);
                        i--;
                }

but I'm not sure it's correct. You could probably try that as a
workaround meanwhile.

> PS: let me know if you'd like the full file!

No no, thank you, that was enough!

About issue number 2. (the fact that we can't actually close and reopen
the tap file descriptor after this failure), I'll check if I can easily
reproduce that in a Podman setting.

-- 
Stefano


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Failed to open tun socket in namespace
  2023-03-27 13:02     ` Stefano Brivio
@ 2023-03-27 13:16       ` Gianluca Stivan
  0 siblings, 0 replies; 5+ messages in thread
From: Gianluca Stivan @ 2023-03-27 13:16 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: passt-user

Thanks for the reply, I'll give that a go and report back!
Apologies, it seems my unusual use case is causing some headaches :)

Kernel version: 5.19.0-35-generic (just a boring Ubuntu 22.04).

Cheers,

On Mon, Mar 27, 2023, at 14:02, Stefano Brivio wrote:
> On Mon, 27 Mar 2023 13:22:12 +0100
> "Gianluca Stivan" <me@yawnt.com> wrote:
>
>> Thanks for the reply!
>> 
>> Here is the ending of the log file:
>> 
>> ...
>> <a lot a lot of STALLED and dropped...>
>> 3944.3562:          TCP: index 19, timer expires in 2.000s
>> 3944.3563:          TCP: index 19, timer expires in 2.000s
>> 3944.3566:          TCP: index 19: STALLED
>> 3944.3578:          TCP: index 7: STALLED dropped
>> 3944.3579:          TCP: index 7, timer expires in 2.000s
>> 3944.3584:          TCP: index 7: STALLED
>> 3944.3586:          TCP: index 7: STALLED dropped
>> 3944.3587:          TCP: index 7: ACK_FROM_TAP_DUE dropped
>> 3944.3588:          TCP: index 7: ACK_FROM_TAP_DUE
>> 3944.3588:          TCP: index 7, timer expires in 2.000s
>> 3944.3589:          TCP: index 12: STALLED dropped
>> 3944.3590:          TCP: index 12, timer expires in 2.000s
>> 3944.3620:          tap write: No buffer space available
>> 3944.3620:          TCP: index 17: STALLED dropped
>> 3944.3620:          TCP: index 17: ACK_FROM_TAP_DUE dropped
>> 3944.4420: ERROR:   Failed to open tun socket in namespace
>> 
>> No buffer space available seems interesting?
>
> Interesting, I had never seen ENOBUFS from a tap file descriptor. Let me
> find out where it might come from (and how pasta should react to it),
> it might take a bit. What kernel version are you using by the way?
>
> I mean, maybe we just have to try again (as if we received EAGAIN):
>
> diff --git a/tap.c b/tap.c
> index 68ef480..21a786d 100644
> --- a/tap.c
> +++ b/tap.c
> @@ -323,7 +323,8 @@ static size_t tap_send_frames_pasta(struct ctx *c,
>                 if (write(c->fd_tap, (char *)iov[i].iov_base,
>                           iov[i].iov_len) < 0) {
>                         debug("tap write: %s", strerror(errno));
> -                       if (errno != EAGAIN && errno != EWOULDBLOCK)
> +                       if (errno != EAGAIN && errno != EWOULDBLOCK &&
> +                           errno != ENOBUFS)
>                                 tap_handler(c, c->fd_tap, EPOLLERR, NULL);
>                         i--;
>                 }
>
> but I'm not sure it's correct. You could probably try that as a
> workaround meanwhile.
>
>> PS: let me know if you'd like the full file!
>
> No no, thank you, that was enough!
>
> About issue number 2. (the fact that we can't actually close and reopen
> the tap file descriptor after this failure), I'll check if I can easily
> reproduce that in a Podman setting.
>
> -- 
> Stefano

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-27 13:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-26 12:42 Failed to open tun socket in namespace Gianluca Stivan
2023-03-27  9:47 ` Stefano Brivio
2023-03-27 12:22   ` Gianluca Stivan
2023-03-27 13:02     ` Stefano Brivio
2023-03-27 13:16       ` Gianluca Stivan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).