Date: Wed, 27 Nov 2024 11:14:31 +0100
From: Stefano Brivio <sbrivio@redhat.com>
To: Laurent Vivier <lvivier@redhat.com>
Cc: passt-dev@passt.top
Subject: Re: [PATCH v14 7/9] vhost-user: add vhost-user
Message-ID: <20241127111431.4f8ef6ea@elisabeth>
References: <20241122164337.3377854-1-lvivier@redhat.com>
 <20241122164337.3377854-8-lvivier@redhat.com>
 <20241127054749.7f1cfb25@elisabeth>
 <20241127104514.5a09c0d0@elisabeth>
 <83566556-2d9b-42ae-8876-588fe6b02b17@redhat.com>
 <20241127110355.402b1dbe@elisabeth>
Organization: Red Hat
List-Id: Development discussion and patches for passt

On Wed, 27 Nov 2024 11:11:33 +0100
Laurent Vivier wrote:

> On 27/11/2024 11:03, Stefano Brivio wrote:
> > On Wed, 27 Nov 2024 10:48:41 +0100
> > Laurent Vivier wrote:
> > 
> >> On 27/11/2024 10:45, Stefano Brivio wrote:
> >>> On Wed, 27 Nov 2024 10:09:53 +0100
> >>> Laurent Vivier wrote:
> >>> 
> >>>> On 27/11/2024 05:47, Stefano Brivio wrote:
> >>>>> On Fri, 22 Nov 2024 17:43:34 +0100
> >>>>> Laurent Vivier wrote:
> >>>>> 
> >>>>>> +/**
> >>>>>> + * tcp_vu_send_flag() - Send segment with flags to vhost-user (no payload)
> >>>>>> + * @c:		Execution context
> >>>>>> + * @conn:	Connection pointer
> >>>>>> + * @flags:	TCP flags: if not set, send segment only if ACK is due
> >>>>>> + *
> >>>>>> + * Return: negative error code on connection reset, 0 otherwise
> >>>>>> + */
> >>>>>> +int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int flags)
> >>>>>> +{
> >>>>>> +	struct vu_dev *vdev = c->vdev;
> >>>>>> +	struct vu_virtq *vq = &vdev->vq[VHOST_USER_RX_QUEUE];
> >>>>>> +	const struct flowside *tapside = TAPFLOW(conn);
> >>>>>> +	size_t l2len, l4len, optlen, hdrlen;
> >>>>>> +	struct vu_virtq_element flags_elem[2];
> >>>>>> +	struct tcp_payload_t *payload;
> >>>>>> +	struct ipv6hdr *ip6h = NULL;
> >>>>>> +	struct iovec flags_iov[2];
> >>>>>> +	struct iphdr *iph = NULL;
> >>>>>> +	struct ethhdr *eh;
> >>>>>> +	uint32_t seq;
> >>>>>> +	int elem_cnt;
> >>>>>> +	int nb_ack;
> >>>>>> +	int ret;
> >>>>>> +
> >>>>>> +	hdrlen = tcp_vu_hdrlen(CONN_V6(conn));
> >>>>>> +
> >>>>>> +	vu_set_element(&flags_elem[0], NULL, &flags_iov[0]);
> >>>>>> +
> >>>>>> +	elem_cnt = vu_collect(vdev, vq, &flags_elem[0], 1,
> >>>>>> +			      hdrlen + sizeof(struct tcp_syn_opts), NULL);
> >>>>> 
> >>>>> Oops, I made this crash, by starting a number of iperf3 client threads
> >>>>> on the host:
> >>>>> 
> >>>>>   $ iperf3 -c localhost -p 6001 -Z -l 500 -w 256M -t 600 -P20
> >>>>> 
> >>>>> with matching server in the guest, then terminating QEMU while the test
> >>>>> is running.
> >>>>> 
> >>>>> Details (I saw it first, then I reproduced it under gdb):
> >>>>> 
> >>>>> accepted connection from PID 3115463
> >>>>> NDP: received RS, sending RA
> >>>>> DHCP: offer to discover
> >>>>>     from 52:54:00:12:34:56
> >>>>> DHCP: ack to request
> >>>>>     from 52:54:00:12:34:56
> >>>>> NDP: sending unsolicited RA, next in 212s
> >>>>> Client connection closed
> >>>>> 
> >>>>> Program received signal SIGSEGV, Segmentation fault.
> >>>>> 0x00005555555884f5 in vring_avail_idx (vq=0x555559343f10) at virtio.c:138
> >>>>> 138		vq->shadow_avail_idx = le16toh(vq->vring.avail->idx);
> >>>>> (gdb) list
> >>>>> 133	 *
> >>>>> 134	 * Return: the available ring index of the given virtqueue
> >>>>> 135	 */
> >>>>> 136	static inline uint16_t vring_avail_idx(struct vu_virtq *vq)
> >>>>> 137	{
> >>>>> 138		vq->shadow_avail_idx = le16toh(vq->vring.avail->idx);
> >>>>> 139	
> >>>>> 140		return vq->shadow_avail_idx;
> >>>>> 141	}
> >>>>> 142	
> >>>>> (gdb) bt
> >>>>> #0  0x00005555555884f5 in vring_avail_idx (vq=0x555559343f10) at virtio.c:138
> >>>>> #1  vu_queue_empty (vq=vq@entry=0x555559343f10) at virtio.c:290
> >>>>> #2  vu_queue_pop (dev=dev@entry=0x555559343a00, vq=vq@entry=0x555559343f10, elem=elem@entry=0x7ffffff6f510) at virtio.c:505
> >>>>> #3  0x0000555555588c8c in vu_collect (vdev=vdev@entry=0x555559343a00, vq=vq@entry=0x555559343f10, elem=elem@entry=0x7ffffff6f510, max_elem=max_elem@entry=1,
> >>>>>     size=size@entry=74, frame_size=frame_size@entry=0x0) at vu_common.c:86
> >>>>> #4  0x000055555557e00e in tcp_vu_send_flag (c=0x7ffffff6f7a0, conn=0x5555555bd2d0, flags=4) at tcp_vu.c:116
> >>>>> #5  0x0000555555578125 in tcp_send_flag (flags=4, conn=0x5555555bd2d0, c=0x7ffffff6f7a0) at tcp.c:1278
> >>>>> #6  tcp_rst_do (conn=, c=) at tcp.c:1293
> >>>>> #7  tcp_timer_handler (c=c@entry=0x7ffffff6f7a0, ref=..., ref@entry=...) at tcp.c:2266
> >>>>> #8  0x0000555555558f26 in main (argc=, argv=) at passt.c:342
> >>>>> (gdb) p *vq
> >>>>> $1 = {vring = {num = 256, desc = 0x0, avail = 0x0, used = 0x0, log_guest_addr = 4338774592, flags = 0},
> >>>>>   last_avail_idx = 35133, shadow_avail_idx = 35133, used_idx = 35133, signalled_used = 0,
> >>>>>   signalled_used_valid = false, notification = true, inuse = 0, call_fd = -1, kick_fd = -1, err_fd = -1,
> >>>>>   enable = 1, started = false, vra = {index = 0, flags = 0, desc_user_addr = 139660501995520,
> >>>>>     used_user_addr = 139660502000192, avail_user_addr = 139660501999616, log_guest_addr = 4338774592}}
> >>>>> (gdb) p *vq->vring.avail
> >>>>> Cannot access memory at address 0x0
> >>>>> 
> >>>>> ...so we're sending a RST segment to the guest, but the ring doesn't
> >>>>> exist anymore.
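A reduced sketch of the faulting access, for reference: the demo_* names
are made up, with types trimmed to the fields visible in the gdb dump
above; this is not the actual passt code.

	/* Sketch only: once the guest goes away and the rings are unmapped
	 * (avail = 0x0 in the dump above), the unguarded load of avail->idx,
	 * as in vring_avail_idx() at virtio.c:138, dereferences a NULL
	 * pointer.
	 */
	#include <endian.h>
	#include <stdint.h>

	struct demo_avail { uint16_t flags, idx; };

	struct demo_vq {
		struct { unsigned int num; struct demo_avail *avail; } vring;
		uint16_t shadow_avail_idx;
	};

	static uint16_t demo_avail_idx(struct demo_vq *vq)
	{
		/* Nothing checks vq->vring.avail here: this is the crash site */
		vq->shadow_avail_idx = le16toh(vq->vring.avail->idx);

		return vq->shadow_avail_idx;
	}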
> >>>>> By the way, I still have the gdb session running, if you need something
> >>>>> else out of it.
> >>>>> 
> >>>>> Now, I guess we should eventually introduce a more comprehensive
> >>>>> handling of the case where the guest suddenly terminates (not specific
> >>>>> to vhost-user), but given that we have vu_cleanup() working as expected
> >>>>> in this case, I wonder if we shouldn't simply avoid calling
> >>>>> vring_avail_idx() (it has a single caller) by checking for !vring.avail
> >>>>> in the caller, or something like that.
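A minimal sketch of that caller-side check, again with made-up demo_*
types rather than the real vu_virtq: an unmapped ring makes the queue
report empty, so the vring_avail_idx() equivalent is never reached with
a NULL pointer.

	#include <endian.h>
	#include <stdbool.h>
	#include <stdint.h>

	struct demo_avail { uint16_t flags, idx; };

	struct demo_vq {
		struct { unsigned int num; struct demo_avail *avail; } vring;
		uint16_t last_avail_idx, shadow_avail_idx;
	};

	/* Guard in the single caller: an unmapped ring means the guest is
	 * gone, so report the queue as empty instead of dereferencing it.
	 */
	static bool demo_queue_empty(struct demo_vq *vq)
	{
		if (!vq->vring.avail)
			return true;

		if (vq->shadow_avail_idx != vq->last_avail_idx)
			return false;

		/* Inlined equivalent of vring_avail_idx(), safe to reach now */
		vq->shadow_avail_idx = le16toh(vq->vring.avail->idx);

		return vq->shadow_avail_idx == vq->last_avail_idx;
	}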
> >>>> 
> >>>> Yes, I think it's the lines I removed during the reviews:
> >>>> 
> >>>> 	if (!vq->vring.avail)
> >>>> 		return true;
> >>> 
> >>> Ah, right:
> >>> 
> >>> https://archives.passt.top/passt-dev/20241114163859.7eeafa38@elisabeth/
> >>> 
> >>> ...so, at least in our case, it's more than "sanity checks" after all.
> >>> :) Well, I guess it depends on the definition.
> >>> 
> >>>> Could you try to checkout virtio.c from v11?
> >>> 
> >>> That would take a rather lengthy rebase, but I tried to reintroduce all
> >>> the checks you had:
> >>> 
> >>> --
> >>> diff --git a/virtio.c b/virtio.c
> >>> index 6a97435..0598ff4 100644
> >>> --- a/virtio.c
> >>> +++ b/virtio.c
> >>> @@ -284,6 +284,9 @@ static int virtqueue_read_next_desc(const struct vring_desc *desc,
> >>>   */
> >>>  bool vu_queue_empty(struct vu_virtq *vq)
> >>>  {
> >>> +	if (!vq->vring.avail)
> >>> +		return true;
> >>> +
> >>>  	if (vq->shadow_avail_idx != vq->last_avail_idx)
> >>>  		return false;
> >>>  
> >>> @@ -327,6 +330,9 @@ static bool vring_can_notify(const struct vu_dev *dev, struct vu_virtq *vq)
> >>>   */
> >>>  void vu_queue_notify(const struct vu_dev *dev, struct vu_virtq *vq)
> >>>  {
> >>> +	if (!vq->vring.avail)
> >>> +		return;
> >>> +
> >>>  	if (!vring_can_notify(dev, vq)) {
> >>>  		debug("vhost-user: virtqueue can skip notify...");
> >>>  		return;
> >>> @@ -502,6 +508,9 @@ int vu_queue_pop(struct vu_dev *dev, struct vu_virtq *vq, struct vu_virtq_elemen
> >>>  	unsigned int head;
> >>>  	int ret;
> >>>  
> >>> +	if (!vq->vring.avail)
> >>> +		return -1;
> >>> +
> >>>  	if (vu_queue_empty(vq))
> >>>  		return -1;
> >>>  
> >>> @@ -591,6 +600,9 @@ void vu_queue_fill_by_index(struct vu_virtq *vq, unsigned int index,
> >>>  {
> >>>  	struct vring_used_elem uelem;
> >>>  
> >>> +	if (!vq->vring.avail)
> >>> +		return;
> >>> +
> >>>  	idx = (idx + vq->used_idx) % vq->vring.num;
> >>>  
> >>>  	uelem.id = htole32(index);
> >>> @@ -633,6 +645,9 @@ void vu_queue_flush(struct vu_virtq *vq, unsigned int count)
> >>>  {
> >>>  	uint16_t old, new;
> >>>  
> >>> +	if (!vq->vring.avail)
> >>> +		return;
> >>> +
> >>>  	/* Make sure buffer is written before we update index. */
> >>>  	smp_wmb();
> >>>  
> >>> --
> >>> 
> >>> and it's all fine with those: I tried doing a few nasty things and
> >>> didn't observe any issue.
> >>> 
> >>> Any check I missed? Do you want to submit it as a follow-up patch? I
> >>> can also do that. I'd rather (still) avoid a re-post of v14 if
> >>> possible.
> >> 
> >> As you prefer. Let me know.
> > 
> > It would save me some time if you could... it should be based on v14 as
> > it is.
> 
> I can.
> 
> > I didn't have time to take care of the gcc warnings on 32-bit and of
> > the build failure on musl, yet.
> 
> I will do that, too.
> 
> Do you want them before the merge?

Yes, thanks, before the merge would be great.

-- 
Stefano