From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 19 Jan 2024 11:45:05 +0100
From: Stefano Brivio
To: David Gibson
Subject: Re: [PATCH v2] tcp.c: leverage MSG_PEEK with offset kernel capability when available
Message-ID: <20240119105630.089c5d34@elisabeth>
References: <20240114180755.1008481-1-jmaloy@redhat.com> <20240118172326.73b6f4ba@elisabeth>
Organization: Red Hat
CC: Jon Maloy, passt-dev@passt.top, lvivier@redhat.com, dgibson@redhat.com
List-Id: Development discussion and patches for passt

On Fri, 19 Jan 2024 11:05:02 +1100
David Gibson wrote:

> On Thu, Jan 18, 2024 at 05:23:26PM +0100, Stefano Brivio wrote:
> > Not a full review, but a couple of comments, mostly about stuff I also
> > had in pkt_selfie.c (review of v1):
> >
> > On Thu, 18 Jan 2024 14:05:38 +1100
> > David Gibson wrote:
> >
> > > On Sun, Jan 14, 2024 at 01:07:55PM -0500, Jon Maloy wrote:
> > > >
> > > > [...]
> > > >
> > > > +
> > > > +	s[0] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
> > > > +	s[1] = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP);
> > > > +	if (s[0] < 0 || s[1] < 0) {
> > > > +		perror("Temporary probe socket creation failed\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (0 > bind(s[0], &a, sizeof(a))) {
> > >
> > > Since the socket address is unspecified, why do you need to bind at
> > > all?  It might be clearer to explicitly set a to localhost + a
> > > specific port - because you're in a temporary namespace, you can rely
> > > on every port being available.
> >
> > There are two advantages of bind() without port, and then getsockname():
> > first, ip_unprivileged_port_start might have whatever value in our new
> > namespace (we don't touch it), and I wouldn't take for granted we'll
> > have CAP_SYS_ADMIN in it for all the possible start-up combinations.
> >
> > Second, there's no need for a magic value.
>
> Good point.  Note that at present we're not bind()ing to an address
> either.
>
> > > > +		perror("Temporary probe socket bind() failed\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (0 > getsockname(s[0], &a, &((socklen_t) { sizeof(a) }))) {
> > > > +		perror("Temporary probe socket getsockname() failed\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (0 > listen(s[0], 0)) {
> > > > +		perror("Temporary probe socket listen() failed\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (0 <= connect(s[1], &a, sizeof(a)) || errno != EINPROGRESS) {
> > > > +		perror("Temporary probe socket connect() failed\n");
> > > > +		goto out;
> > > > +	}
> > >
> > > This is assuming that a will now contain the correct address to
> > > connect to.  Although it will have the right port, I think the address
> > > may still be unspecified for the listening socket.
> >
> > Hmm, why? From getsockname(2):
> >
> >        getsockname() returns the current address to which the socket
> >        sockfd is bound [...]
>
> But we've only bound ourselves to 0.0.0.0, which while perfectly
> cromulent for a listening socket, is no good for connect().

Hah, "cromulent" just embiggened my dictionary!

Why not, though? From RFC 6890, 2.2.2:

              +----------------------+----------------------------+
              | Attribute            | Value                      |
              +----------------------+----------------------------+
              | Address Block        | 0.0.0.0/8                  |
              | Name                 | "This host on this network"|

and:

$ strace -e connect ./pkt_selfie
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=891325, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
connect(5, {sa_family=AF_INET, sin_port=htons(51155), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
MSG_PEEK with offset not supported
+++ exited with 0 +++

with pkt_selfie.c from review of v1:

  https://archives.passt.top/passt-dev/20231206160808.3d312733@elisabeth/

-- 
Stefano
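
For reference, the bind() without port, getsockname(), listen(), connect()
sequence discussed above can be reduced to a minimal standalone sketch. This
is not the code from the patch or from pkt_selfie.c: the helper names
bind_ephemeral() and probe_connect_unspecified() are hypothetical, and it
assumes Linux (SOCK_NONBLOCK) and IPv4 only. It passes the address to
connect() exactly as getsockname() returned it, that is, 0.0.0.0 plus the
port the kernel picked.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind s to the unspecified address with port 0, so
 * that the kernel picks a free port, then read the result back into *a.
 * Returns 0 on success, -1 on error.
 */
static int bind_ephemeral(int s, struct sockaddr_in *a)
{
	socklen_t sl = sizeof(*a);

	memset(a, 0, sizeof(*a));
	a->sin_family = AF_INET;
	a->sin_addr.s_addr = htonl(INADDR_ANY);
	a->sin_port = 0;

	if (bind(s, (struct sockaddr *)a, sizeof(*a)) < 0)
		return -1;

	return getsockname(s, (struct sockaddr *)a, &sl);
}

/* Hypothetical helper: listen on one socket and start a non-blocking
 * connect() from another one, using the address as returned by
 * getsockname(). Returns 0 if connect() succeeded or is in progress,
 * -1 otherwise.
 */
static int probe_connect_unspecified(void)
{
	struct sockaddr_in a;
	int s[2], ret = -1;

	s[0] = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	s[1] = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP);
	if (s[0] < 0 || s[1] < 0)
		goto out;

	if (bind_ephemeral(s[0], &a) < 0 || listen(s[0], 0) < 0)
		goto out;

	if (!connect(s[1], (struct sockaddr *)&a, sizeof(a)) ||
	    errno == EINPROGRESS)
		ret = 0;
	else
		perror("connect");
out:
	if (s[0] >= 0)
		close(s[0]);
	if (s[1] >= 0)
		close(s[1]);
	return ret;
}
```

After bind_ephemeral(), a.sin_addr is still INADDR_ANY and only a.sin_port
has been filled in by the kernel, which is the case the two sides of the
thread disagree about. Whether connect() then succeeds may depend on the
state of the loopback interface in the namespace where this runs.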