Date: Tue, 23 Sep 2025 17:53:09 +1000
From: David Gibson
To: Stefano Brivio
CC: Yumei Huang, passt-dev@passt.top, lvivier@redhat.com
Subject: Re: [PATCH] tap: Drop frames if no client connected
References: <20250911085519.24395-1-yuhuang@redhat.com> <20250911115425.79eaaac5@elisabeth> <20250915081319.00e72e53@elisabeth> <20250918091714.77192b00@elisabeth> <20250922220330.436e2b6f@elisabeth>
In-Reply-To: <20250922220330.436e2b6f@elisabeth>
List-Id: Development discussion and patches for passt

On Mon, Sep 22, 2025 at 10:03:30PM +0200, Stefano Brivio wrote:
> On Mon, 22 Sep 2025 15:17:12 +0800
> Yumei Huang wrote:
> > On Fri, Sep 19, 2025 at 9:38 AM David Gibson wrote:
> > > On Thu, Sep 18, 2025 at 09:17:14AM +0200, Stefano Brivio wrote:
> > > > On Thu, 18 Sep 2025 14:28:37 +1000
> > > > David Gibson wrote:
[snip]
> > > > Does it work to cover situations where users might start passt a bit
> > > > before the guest connects, and try to connect to services right away?
> > > >
> > > > I suggested using ssh which should have a quite long timeout and retry
> > > > connecting for a while. You mentioned you would assist Yumei in testing
> > > > this if needed.
> > >
> > > Ah, yes, you're right and I'd forgotten that. Following up today.
> >
> > I tried both 'ssh' and 'socat'(writing a big file) before a guest
> > connects, they get a 'Connection reset' after 10s, even if the guest
> > connects in ~2s.
> > It's because, when start ssh or socat, passt would try to finish the
> > tcp handshake with the guest. It sends SYN to the guest immediately
> > and waits for SYN-ACK. However, the SYN frame is dropped/lost due to
> > no guest connected. So though the guest connects in seconds, the tcp
> > handshake would timeout, and returns rst via tcp_rst().
>
> Ah, right. We won't try to resend the SYN, that's simply not
> implemented.
>
> The timeout you see is SYN_TIMEOUT, timer set by tcp_timer_ctl() and
> handled by tcp_timer_handler().
>
> > Either with or without this patch, they got the same 'connection reset'.
> > Maybe it's something to fix?
>
> First off, this shows that the current patch is harmless, so I would go
> ahead and apply it (but see 2. below).
>
> Strictly speaking, I don't think we really *need* to fix anything, but
> for sure the behaviour isn't ideal. I see two alternatives:
>
> 1. we implement a periodic retry for the SYN segment. This would *seem*
>    to give the best behaviour in this case, but:
>
>    a. it's quite complicated (we need to calculate some delays for the
>       timers, etc.), and not really transparent (which is in general a
>       goal of passt)

I'm not really sure why you say it's not transparent, or at least what
other option you're comparing it to. The peer has initiated a
connection to us in the normal way (which may include resending SYNs).
Now we're initiating a connection to the guest in the normal way
(which may include resending SYNs).

>    b. if the guest never appears, we're just wasting client's time. See
>       db2c91ae86c7 ("tcp: Set ACK flag on *all* RST segments, even for
>       client in SYN-SENT state") for an example where it's important to
>       fail fast

Sure. I'd say RSTing here would be *less* transparent, but it might
still be worth it to make the peer fail fast.

>    c. if the guest appears but isn't listening to the port, see b.
>
> 2. reset right away as I was suggesting in
>    https://archives.passt.top/passt-dev/20250915081319.00e72e53@elisabeth/:
>
>    > We could mitigate that by making the TCP handler aware of this, and by
>    > resetting the connection if the guest isn't there. This would at least
>    > be consistent with the case where the guest isn't listening on the port
>    > (we accept(), fail to connect to it, eventually call tcp_rst()).
>
>    and let the client retry as appropriate (if implemented).
> Those retries can be quite fast, see this report (from IRC) for
> 722d347c1932 ("tcp: Don't reset outbound connection on SYN retries"):

I don't see how that commit is relevant to this situation. That's
talking about SYN retries. We can see those in the case of outbound
connections, but we'll never see them for the case of inbound
connections, because the host kernel has already completed the
handshake.

For inbound we essentially have two options:

a) Retry SYNs ourselves, emulating what the peer would do if it was
   talking directly to an absent guest.

b) Reject SYNs quickly, trusting that the peer will have some sort of
   application-level retry. That will depend on the client. I guess my
   fear here is that a client seeing a completed handshake + RST might
   assume that the guest server is permanently broken, rather than just
   temporarily missing as it might if there's no response at all.

I suggested Yumei's approach here to aim for (a) on the basis of
transparency - it's as close as I think we can get to a bridged guest
that's just missing. I'm not necessarily opposed to (b), but I think
it's less transparent, so we need an argument that it will lead to
better outcomes regardless.

> 3.3223: pasta: epoll event on /dev/net/tun device 18 (events: 0x00000001)
> 3.3223: pasta: epoll event on /dev/net/tun device 18 (events: 0x00000001)
> 3.3224: tap: protocol 6, 192.168.122.14:55532 -> 192.0.0.1:80 (1 packet)
> 3.3224: Flow 0 (NEW): FREE -> NEW
> 3.3224: Flow 0 (INI): NEW -> INI
> 3.3224: Flow 0 (INI): TAP [192.168.122.14]:55532 -> [192.0.0.1]:80 => ?
> 3.3224: Flow 0 (TGT): INI -> TGT
> 3.3224: Flow 0 (TGT): TAP [192.168.122.14]:55532 -> [192.0.0.1]:80 => HOST [0.0.0.0]:0 -> [192.0.0.1]:80
> 3.3224: Flow 0 (TCP connection): TGT -> TYPED
> 3.3224: Flow 0 (TCP connection): TAP [192.168.122.14]:55532 -> [192.0.0.1]:80 => HOST [0.0.0.0]:0 -> [192.0.0.1]:80
> 3.3224: Flow 0 (TCP connection): event at tcp_conn_from_tap:1489
> 3.3224: Flow 0 (TCP connection): TAP_SYN_RCVD: CLOSED -> SYN_SENT
> 3.3224: Flow 0 (TCP connection): failed to set TCP_MAXSEG on socket 21
> 3.3224: Flow 0 (TCP connection): Side 0 hash table insert: bucket: 294539
> 3.3225: Flow 0 (TCP connection): TYPED -> ACTIVE
> 3.3225: Flow 0 (TCP connection): TAP [192.168.122.14]:55532 -> [192.0.0.1]:80 => HOST [0.0.0.0]:0 -> [192.0.0.1]:80
> 4.0027: pasta: epoll event on namespace timer watch 17 (events: 0x00000001)
> 4.3612: pasta: epoll event on /dev/net/tun device 18 (events: 0x00000001)
> 4.3613: tap: protocol 6, 192.168.122.14:55532 -> 192.0.0.1:80 (1 packet)
> 4.3613: Flow 0 (TCP connection): packet length 40 from tap
> 4.3613: Flow 0 (TCP connection): TCP reset at tcp_tap_handler:1989
> 4.3613: Flow 0 (TCP connection): flag at tcp_prepare_flags:1163
> 4.3613: Flow 0 (TCP connection): event at tcp_rst_do:1206
> 4.3613: Flow 0 (TCP connection): CLOSED: SYN_SENT -> CLOSED
> 4.3614: Flow 0 (TCP connection): Side 0 hash table remove: bucket: 294539
> 4.3614: Flow 0 (FREE): ACTIVE -> FREE
> 4.3614: Flow 0 (FREE): TAP [192.168.122.14]:55532 -> [192.0.0.1]:80 => HOST [0.0.0.0]:0 -> [192.0.0.1]:80
>
> ...the retry happened within one second. This is a container, so Linux
> kernel, and the client was wget.

I'm not seeing a retry at all in this log, plus it's an outbound
connection, which is not the case we're dealing with here.

> So, in the end, I would suggest going with 2.: check if the guest /
> container is connected in the TCP handler (tcp_data_from_sock()) and
> reset the connection if it's not.
>
> I would suggest checking that together with this patch. They would
> still be two different patches, but I think it would be good to
> check / test what happens with both of them.
>
> --
> Stefano
>

-- 
David Gibson (he or they)       | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you, not the other way
                                | around.
http://www.ozlabs.org/~dgibson