Date: Wed, 13 Mar 2024 16:27:47 +0100
From: Stefano Brivio
To: Laurent Vivier
Subject: Re: [RFC] tcp: Replace TCP buffer structure by an iovec array
Message-ID: <20240313162747.120169de@elisabeth>
In-Reply-To: <8e56f30b-cdcc-4276-acbe-9fc87fa51a7e@redhat.com>
References: <20240311133356.1405001-1-lvivier@redhat.com> <20240313123725.7a37f311@elisabeth> <8e56f30b-cdcc-4276-acbe-9fc87fa51a7e@redhat.com>
Organization: Red Hat
CC: passt-dev@passt.top
List-Id: Development discussion and patches for passt

On Wed, 13 Mar 2024 15:42:27 +0100
Laurent Vivier wrote:

> On 3/13/24 12:37, Stefano Brivio wrote:
> > On Mon, 11 Mar 2024 14:33:56 +0100
> > Laurent Vivier wrote:
> >
> ...
> >> diff --git a/tcp.c b/tcp.c
> >> index d5eedf4d0138..5c8488108ef7 100644
> >> --- a/tcp.c
> >> +++ b/tcp.c
> >> @@ -318,39 +318,7 @@
> ...
> >> -#define MSS4 ROUND_DOWN(USHRT_MAX - sizeof(struct tcp4_l2_head), 4)
> >> -#define MSS6 ROUND_DOWN(USHRT_MAX - sizeof(struct tcp6_l2_head), 4)
> >>
> >> +#define MSS (USHRT_MAX - sizeof(struct tcphdr))
> >
> > We can't exceed USHRT_MAX at Layer-2 level, though, so the MSS for IPv4
> > should now be:
> >
> > #define MSS4 ROUND_DOWN(USHRT_MAX - sizeof(struct ethhdr) -
> >                         sizeof(struct iphdr) -
> >                         sizeof(struct tcphdr),
> >                         4)
> >
> > ...and similar for IPv6.
>
> Is there a specification that limits the MSS?

Well, on Linux, virtual network interfaces such as virtio-net still
have their maximum configurable MTU defined as ETH_MAX_MTU (same as the
maximum IPv4 MTU, 65535), and that's a symmetric value (in standards,
drivers, and elsewhere in the network stack).

Side note: the TCP MSS doesn't need to be the same value for both
directions, instead.

But other than that, no, there are no normative references. It's an
implementation "detail" if you want.

> Or is it only a compatibility problem with the network stack implementation?

Kind of, yes. Except that:

> At headers level the only limitation we have is the length field size in the IP header
> that is a 16bit (it's why I put "USHRT_MAX - sizeof(struct tcphdr)").

...in IPv4, that field also contains the length of the *IP header*, so,
ignoring for a moment Linux and virtio-net with ETH_MAX_MTU, you would
be limited to 65495 bytes (65535 minus 20 bytes of IP header, minus 20
bytes of TCP header).
As RFC 791, section 3.1, states:

  Total Length:  16 bits

    Total Length is the length of the datagram, measured in octets,
    including internet header and data.  This field allows the length of
    a datagram to be up to 65,535 octets. [...]

...but 65535 is the *MTU*, not MSS. And if it needs to fit ETH_MAX_MTU,
you're now back to 65520 bytes of MTU (it needs to match IPv4 32-bit
words, if I recall correctly), hence 65480 bytes of MSS.

IPv6 is different, in that the equivalent field doesn't include the
size of the IPv6 header. But the header is 40 bytes long, so the
outcome is the same.

Well, except that you could have jumbograms (RFC 2675) and exceed 65535
bytes of IPv6 MTU, but that won't work anyway with Ethernet-like
interfaces.

So, well, I haven't actually tried to send an Ethernet frame to a
virtio-net interface that's bigger than 65535 bytes. As far as
normative references are concerned, you could send 65549 (65535 bytes
maximum IPv4 MTU, plus 14 bytes of 802.3 header on top) bytes. I guess
it will be dropped by the kernel, but it's perhaps worth a try.

-- 
Stefano
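For reference, the IPv4 figures quoted in this message can be checked at compile time. This is only a sketch under stated assumptions: all the names below are hypothetical and not taken from tcp.c, the header sizes are the option-less ones (14, 20 and 20 bytes) written as literals instead of sizeof() on the kernel structures, and 65520 is assumed to follow from the rounded-down MSS plus the IP and TCP headers.

```c
#include <assert.h>	/* static_assert (C11) */

/* Hypothetical constants, not from tcp.c: header sizes without options */
enum {
	ETH_HDR_LEN       = 14,		/* 802.3 header */
	IP4_HDR_LEN       = 20,		/* IPv4 header, no options */
	TCP_HDR_LEN       = 20,		/* TCP header, no options */
	IP4_TOTAL_LEN_MAX = 65535,	/* RFC 791: 16-bit Total Length */
};

/* Round x down to a multiple of y (hypothetical stand-in for ROUND_DOWN) */
#define ROUND_DOWN_TO(x, y)	((x) / (y) * (y))

/* Limit from the IPv4 Total Length field alone: it also counts the IP header */
#define MSS4_HEADER_LIMIT \
	(IP4_TOTAL_LEN_MAX - IP4_HDR_LEN - TCP_HDR_LEN)

/* Limit when the whole Layer-2 frame must fit in 65535 bytes, computed the
 * same way as the MSS4 definition quoted above
 */
#define MSS4_L2_LIMIT \
	ROUND_DOWN_TO(IP4_TOTAL_LEN_MAX - ETH_HDR_LEN - \
		      IP4_HDR_LEN - TCP_HDR_LEN, 4)

/* MTU corresponding to that MSS (assumption: MSS plus IP and TCP headers) */
#define MTU4_L2_LIMIT	(MSS4_L2_LIMIT + IP4_HDR_LEN + TCP_HDR_LEN)

/* Largest frame if only the IPv4 MTU is considered: MTU plus 802.3 header */
#define FRAME_NORMATIVE_MAX	(IP4_TOTAL_LEN_MAX + ETH_HDR_LEN)

static_assert(MSS4_HEADER_LIMIT == 65495, "65535 - 20 - 20");
static_assert(MSS4_L2_LIMIT == 65480, "65481 rounded down to 4");
static_assert(MTU4_L2_LIMIT == 65520, "65480 + 20 + 20");
static_assert(FRAME_NORMATIVE_MAX == 65549, "65535 + 14");
```

The assertions only restate the arithmetic above. They say nothing about what the kernel or virtio-net actually accepts, which would still need the experiment mentioned at the end of the message.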