public inbox for passt-dev@passt.top
 help / color / mirror / code / Atom feed
From: Stefano Brivio <sbrivio@redhat.com>
To: passt-dev@passt.top
Subject: Re: [PATCH 2/2] test/passt_in_ns: Consistent sleep commands before starting socat client
Date: Mon, 19 Sep 2022 10:24:34 +0200	[thread overview]
Message-ID: <20220919102434.187bd423@elisabeth> (raw)
In-Reply-To: <YygTh8Ik6Od7Q4L+@yekko>

[-- Attachment #1: Type: text/plain, Size: 8271 bytes --]

On Mon, 19 Sep 2022 17:00:23 +1000
David Gibson <david(a)gibson.dropbear.id.au> wrote:

> On Mon, Sep 19, 2022 at 08:41:50AM +0200, Stefano Brivio wrote:
> > On Mon, 19 Sep 2022 11:48:33 +1000
> > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> >   
> > > On Sat, Sep 17, 2022 at 01:28:42PM +0200, Stefano Brivio wrote:  
> > > > On Sat, 17 Sep 2022 20:19:21 +1000
> > > > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> > > >     
> > > > > On Sat, Sep 17, 2022 at 10:44:41AM +0200, Stefano Brivio wrote:    
> > > > > > On Sat, 17 Sep 2022 13:32:45 +1000
> > > > > > David Gibson <david(a)gibson.dropbear.id.au> wrote:
> > > > > >       
> > > > > > > On Sat, Sep 17, 2022 at 01:55:34AM +0200, Stefano Brivio wrote:      
> > > > > > > > There are some 'sleep 1' commands between starting the socat server
> > > > > > > > and its corresponding client to avoid races due to the server not
> > > > > > > > being ready as we start sending data.
> > > > > > > > 
> > > > > > > > However, those don't cover all the cases where we might need them,
> > > > > > > > and in some cases the sleep command actually ended up being before
> > > > > > > > the server even starts.
> > > > > > > > 
> > > > > > > > This fixes occasional failures in TCP and UDP simple transfer tests,
> > > > > > > > that became apparent with the new command dispatch mechanism.
> > > > > > > > 
> > > > > > > > Signed-off-by: Stefano Brivio <sbrivio(a)redhat.com>        
> > > > > > > 
> > > > > > > Heh, I have a very similar patch in my upcoming series.  I used sleep
> > > > > > > 0.1 though, to avoid taking so long, which seems to be sufficient in
> > > > > > > practice.      
> > > > > > 
> > > > > > Just mind POSIX only specifies integers as argument for sleep(1), a
> > > > > > pattern I commonly use is:
> > > > > > 
> > > > > > 	sleep 0.1 || sleep 1      
> > > > > 
> > > > > Ah, right.
> > > > >     
> > > > > > > I did look for a proper way to wait for the socat server to be ready,
> > > > > > > but I couldn't find anything workable.      
> > > > > > 
> > > > > > I guess we could enable logging and look for "starting accept loop",
> > > > > > from _xioopen_listen() in xio-listen.c.      
> > > > > 
> > > > > Yeah, I suppose.
> > > > >     
> > > > > > Anyway, right now, I'm just trying to get the tests to complete with
> > > > > > all your pending series, because it's been a while. This doesn't look
> > > > > > too bad, we can try to improve it later.      
> > > > > 
> > > > > Yeah, fair enough.  When I rebase I'll see if there's any refinements
> > > > > I can make relatively easily.    
> > > > 
> > > > Okay, thanks.
> > > >     
> > > > > > > I thought I could retry on the
> > > > > > > client side, but that doesn't work (at least for host to guest
> > > > > > > connections) because passt is always listening on the forwarded ports,
> > > > > > > so the client won't see a connection refused from the actual server.
> > > > > > > I don't think there's any sane way to propagate connection refused in
> > > > > > > that way, although we could and probably should forward connection
> > > > > > > resets outwards.      
> > > > > > 
> > > > > > They should be already, and (manual) retries actually worked for me,      
> > > > > 
> > > > > I'm not entirely sure what you mean by manual retries.  I was trying
> > > > > to use socat's retry option, but that only seems to fire on
> > > > > ECONNREFUSED.    
> > > > 
> > > > I was watching the text execution, then as I saw the socat server
> > > > process getting stuck, I just re-run the client with ssh -F ... with the
> > > > original command, and that worked for me every time.    
> > > 
> > > Ah, yeah, I'd expect that to work.  The difficulty is automatically
> > > detecting the failure.
> > >   
> > > > So, the behaviour I'm inferring is:
> > > > 
> > > > - server not listening, no retry option: connect() goes
> > > >   through, but anything else will get ECONNRESET (RST from passt), plus
> > > >   there should be EPOLLERR if socat checks that: client exits    
> > > 
> > > That doesn't seem to be the case.  We *eventually* get an ECONNRESET,
> > > but it seems like we can send some data before it happens - sometimes
> > > quite a lot.  I'm assuming this is because quite a lot can fit into
> > > socket buffers before the RST makes its way back.  
> > 
> > Well, yes, it might simply be that socat relies on ECONNRESET from
> > send(), and the first send()-like call includes the full buffer.  
> 
> Yeah, that appears to be the case.  When it did fail, it seemed to be
> because it was getting an ECONNRESET on one of the send()s or
> write()s.
> 
> > Or, even if it's already checking for EPOLLERR at that point, it could
> > be delivered too late.
> >   
> > > > - server not listening, retry option: client exits without retry because
> > > >   it doesn't get ECONNREFUSED    
> > > 
> > > Right.
> > >   
> > > > ...I guess a cleaner behaviour on passt's side would be to delay the
> > > > accept() (that's in tcp_conn_from_sock()) until we get SYN, ACK from
> > > > the guest. But:    
> > > 
> > > I thought about that, but I don't think its workable.  For starters, I
> > > don't think there's a way of explicitly refusing a connection from
> > > userspace.  Not accept()ing, then closing and re-opening the listening
> > > socket *might* do it.
> > > 
> > > But.. more fundamentally, due to the connection backlog, I think by
> > > the time we get the notification in userspace, the kernel has probably
> > > already completed the TCP handshake, so it's too late to signal a
> > > connection refused to the client.  
> > 
> > Oops, right, listen() by itself already warrants a handshake, so I
> > guess there's no point in trying if we can't ensure a given behaviour.
> > 
> > In real-world applications:
> > 
> > - the user forwards a set of ports because they expect a server to be
> >   reachable on those ports, fine
> > 
> > - or all ports are forwarded (this is the case of KubeVirt) so that the
> >   user doesn't have to configure them. In that case, we can always have
> >   handshake, client sending data, and receiving a RST before any of it
> >   is acknowledged -- this is guaranteed by passt as it won't dequeue
> >   data from buffers before the real server accepted the connection  
> 
> Uuuhh.. does that actually guarantee that?  Does the kernel send an
> ack only once userspace has read the data, or just once it is queued
> in-kernel?

As far as I remember, for the first segment the condition
__tcp_select_window(sk) >= tp->rcv_wnd in __tcp_ack_snd_check(),
net/ipv4/tcp_input.c, should be false, and if more data is sent and we
don't dequeue it, the window doesn't advance, so the first ACK _should_
be deferred to the first tcp_recvmsg().

However, this is a bit vague now in my memory. I guess we should
double check this (by delaying the first recvmsg() in passt and
comparing with captures), and also what happens if we defer the accept()
to after the connection is established guest-side.

> > ...the only alternatives I see sound a lot like a port scan, which is
> > definitely not desirable.  
> 
> Right.
> 
> > I guess we just have to... accept this behaviour.  
> 
> Yeah, I think so.
> 
> > The important fact, I
> > think, is that we don't acknowledge data.
> > 
> > Whether closing and re-opening the listening socket causes ICMP
> > messages to be sent: from kernel code I'd say it's not the case, but it
> > might be worth an actual test, as that would be a marginal improvement.  
> 
> Ok.  It's not really clear to me what ICMP messages would be sent to
> where in that scenario.

Hmm... for UDP, ICMP type 3 code 3 (port unreachable) and ICMPv6 type 1
code 3 (also port unreachable) are sent by the kernel to the source
address of the SYN segment if there's no corresponding socket.

For TCP, they could legitimately be sent ("may", in RFC 792), but now
that I think about it, I can't recall the kernel doing it... unless
netfilter is explicitly configured to do so. So this is probably
irrelevant (also perhaps worth a quick check anyway).

-- 
Stefano


      reply	other threads:[~2022-09-19  8:24 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-16 23:55 [PATCH 0/2] test: Fix two issues made apparent by new command dispatch Stefano Brivio
2022-09-16 23:55 ` [PATCH 1/2] test/perf: Check for /sbin/sysctl with which(1), not simply sysctl Stefano Brivio
2022-09-17  3:27   ` David Gibson
2022-09-17  8:44     ` Stefano Brivio
2022-09-17  9:51       ` David Gibson
2022-09-16 23:55 ` [PATCH 2/2] test/passt_in_ns: Consistent sleep commands before starting socat client Stefano Brivio
2022-09-17  3:32   ` David Gibson
2022-09-17  8:44     ` Stefano Brivio
2022-09-17 10:19       ` David Gibson
2022-09-17 11:28         ` Stefano Brivio
2022-09-19  1:48           ` David Gibson
2022-09-19  6:41             ` Stefano Brivio
2022-09-19  7:00               ` David Gibson
2022-09-19  8:24                 ` Stefano Brivio [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220919102434.187bd423@elisabeth \
    --to=sbrivio@redhat.com \
    --cc=passt-dev@passt.top \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://passt.top/passt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for IMAP folder(s).