From mboxrd@z Thu Jan 1 00:00:00 1970 Authentication-Results: passt.top; dmarc=none (p=none dis=none) header.from=gibson.dropbear.id.au Authentication-Results: passt.top; dkim=pass (2048-bit key; secure) header.d=gibson.dropbear.id.au header.i=@gibson.dropbear.id.au header.a=rsa-sha256 header.s=202510 header.b=d6pzduKJ; dkim-atps=neutral Received: from mail.ozlabs.org (gandalf.ozlabs.org [150.107.74.76]) by passt.top (Postfix) with ESMTPS id 3FA985A0BC3 for ; Mon, 10 Nov 2025 06:54:23 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gibson.dropbear.id.au; s=202510; t=1762754059; bh=ou12b2+B1uQoBXHgkm6zs1nwZhAKu+W32HOgiYKpLb0=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=d6pzduKJa31iZ8zRwY1vvn1ffIoDM60qiCIqVq/SxAQ8C+HoAWMQqF5fcKdfemSdL nys5wTirxhR+olTLpErb8ayqQg1mVti5qnGlz/kPPrQdXdFpvYCreAGp84Lemzb7S/ PQd9o9HkSbTgIQP33MLtdbVdSCZ6FYHqM9Z6Px8p7ICJzO0gJfFOXNU5prixf60bG7 OV36eUh6ulxp+46Fok5/9bNIZxsDekrhFtWiri0NuErlwueayNExP/i1hWJoc9qbrd 908FML0eT+JDjtUMlv10UrFD+h6UKLVSY0VEHImqQPbxtPPf759bqU52diKXUcVx+P LPq3dYYjHmcWA== Received: by gandalf.ozlabs.org (Postfix, from userid 1007) id 4d4f4W4p1Bz4wDR; Mon, 10 Nov 2025 16:54:19 +1100 (AEDT) Date: Mon, 10 Nov 2025 16:54:14 +1100 From: David Gibson To: Laurent Vivier Subject: Re: [PATCH 3/4] multiqueue: Add queue-aware flow management for multiqueue support Message-ID: References: <20251107143901.89955-1-lvivier@redhat.com> <20251107143901.89955-4-lvivier@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="mePDtlOii17f5X2C" Content-Disposition: inline In-Reply-To: <20251107143901.89955-4-lvivier@redhat.com> Message-ID-Hash: DQI3DUD3QWFBDXKEL657GVWXO5BBNYJG X-Message-ID-Hash: DQI3DUD3QWFBDXKEL657GVWXO5BBNYJG X-MailFrom: dgibson@gandalf.ozlabs.org X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation; nonmember-moderation; administrivia; implicit-dest; max-recipients; max-size; news-moderation; no-subject; digests; suspicious-header CC: passt-dev@passt.top X-Mailman-Version: 3.3.8 Precedence: list List-Id: Development discussion and patches for passt Archived-At: Archived-At: List-Archive: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --mePDtlOii17f5X2C Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Nov 07, 2025 at 03:39:00PM +0100, Laurent Vivier wrote: > Packets are now routed to the correct RX queue based on which TX queue > they arrived on, rather than always using queue 0. Am I missing something, or is this not quite accurate. The packets are associated by flow with other packets that came on a particular Tx queue, but AIUI the packet itself didn't arrive on a Tx queue, but rather from a socket. > Note: Flows initiated from the host (via sockets, udp_flow_from_sock()) > currently default to queue 0, as they don't have an associated incoming > queue. >=20 > Signed-off-by: Laurent Vivier > --- > flow.c | 32 ++++++++++++++++++++++ > flow.h | 10 +++++++ > icmp.c | 23 +++++++++------- > icmp.h | 2 +- > tap.c | 77 +++++++++++++++++++++++++++------------------------ > tap.h | 5 ++-- > tcp.c | 79 +++++++++++++++++++++++++++++------------------------ > tcp.h | 2 +- > tcp_vu.c | 8 ++++-- > udp.c | 33 ++++++++++++---------- > udp.h | 12 ++++---- > udp_flow.c | 8 +++++- > udp_flow.h | 2 +- > udp_vu.c | 4 ++- > vu_common.c | 4 +-- > 15 files changed, 187 insertions(+), 114 deletions(-) >=20 > diff --git a/flow.c b/flow.c > index 278a9cf0ac6d..8e9d7e5e1847 100644 > --- a/flow.c > +++ b/flow.c > @@ -405,6 +405,37 @@ void flow_epollid_register(int epollid, int epollfd) > epoll_id_to_fd[epollid] =3D epollfd; > } > =20 > +/** > + * flow_rx_virtqueue() - Get RX (receive) queue number for a flow Maybe avoid the name "virtqueue". I believe the tap device can also do multiqueue, and we may want to reuse this queue infrastructure for that. > + * @f: Flow to query (may be NULL) > + * > + * Return: RX queue number for the flow, or 0 if flow is NULL or has no > + * valid queue assignment > + */ > +int flow_rx_virtqueue(const struct flow_common *f) > +{ > + if (f =3D=3D NULL || f->queueid =3D=3D FLOW_QUEUEID_INVALID) Any reason to have the second special case here, rather than initializing f->queueid to 0? > + return 0; > + return f->queueid << 1; I generally think it's clearer to use * 2 rather than << 1 here. I expect the compiler to turn it into the same thing. > +} > + > +/** > + * flow_queue_set() - Set queue pair assignment for a flow > + * @f: Flow to update > + * @queueid: Queue pair number to assign (even number, RX queue; TX is R= X+1) In comments using "from guest" / "to guest" instead of Rx/Tx is probably a good idea. Particularly in code that's not necessarily specific to VU. I think it is a bit confusing that somewhere we have queue pair IDs, sometimes absolute queue numbers. I think we should standardise on one throughout as much of the code as possible. > + */ > +void flow_queue_set(struct flow_common *f, int queueid) > +{ > + queueid >>=3D 1; > + > + ASSERT(queueid < FLOW_QUEUEID_MAX); > + > + flow_trace((union flow *)f, "updating queue from %d to %d", > + f->queueid, queueid); > + > + f->queueid =3D queueid; > +} > + > /** > * flow_initiate_() - Move flow to INI, setting pif[INISIDE] > * @flow: Flow to change state > @@ -609,6 +640,7 @@ union flow *flow_alloc(void) > flow_new_entry =3D flow; > memset(flow, 0, sizeof(*flow)); > flow_epollid_clear(&flow->f); > + flow->f.queueid =3D FLOW_QUEUEID_INVALID; > flow_set_state(&flow->f, FLOW_STATE_NEW); > =20 > return flow; > diff --git a/flow.h b/flow.h > index b43b0b1dd7f2..44ab4ae8fd6a 100644 > --- a/flow.h > +++ b/flow.h > @@ -179,6 +179,8 @@ int flowside_connect(const struct ctx *c, int s, > * @side[]: Information for each side of the flow > * @tap_omac: MAC address of remote endpoint as seen from the guest > * @epollid: epollfd identifier, or EPOLLFD_ID_INVALID > + * @queueid: Queue pair number assigned to this flow > + * (FLOW_QUEUEID_INVALID if not assigned) > */ > struct flow_common { > #ifdef __GNUC__ > @@ -199,6 +201,8 @@ struct flow_common { > =20 > #define EPOLLFD_ID_BITS 8 > unsigned int epollid:EPOLLFD_ID_BITS; > +#define FLOW_QUEUEID_BITS 5 > + unsigned int queueid:FLOW_QUEUEID_BITS; > }; > =20 > #define EPOLLFD_ID_DEFAULT 0 > @@ -206,6 +210,10 @@ struct flow_common { > #define EPOLLFD_ID_MAX (EPOLLFD_ID_SIZE - 1) > #define EPOLLFD_ID_INVALID EPOLLFD_ID_MAX > =20 > +#define FLOW_QUEUEID_SIZE (1 << FLOW_QUEUEID_BITS) "SIZE" is a bit ambiguous here. Maybe "NUM" instead? > +#define FLOW_QUEUEID_MAX (FLOW_QUEUEID_SIZE - 1) > +#define FLOW_QUEUEID_INVALID FLOW_QUEUEID_MAX > + > #define FLOW_INDEX_BITS 17 /* 128k - 1 */ > #define FLOW_MAX MAX_FROM_BITS(FLOW_INDEX_BITS) > =20 > @@ -266,6 +274,8 @@ int flow_epollfd(const struct flow_common *f); > void flow_epollid_set(struct flow_common *f, int epollid); > void flow_epollid_clear(struct flow_common *f); > void flow_epollid_register(int epollid, int epollfd); > +int flow_rx_virtqueue(const struct flow_common *f); > +void flow_queue_set(struct flow_common *f, int queueid); > void flow_defer_handler(const struct ctx *c, const struct timespec *now); > int flow_migrate_source_early(struct ctx *c, const struct migrate_stage = *stage, > int fd); > diff --git a/icmp.c b/icmp.c > index d58499c3bf5c..80e8753072fa 100644 > --- a/icmp.c > +++ b/icmp.c > @@ -132,13 +132,13 @@ void icmp_sock_handler(const struct ctx *c, union e= poll_ref ref) > const struct in_addr *daddr =3D inany_v4(&ini->eaddr); > =20 > ASSERT(saddr && daddr); /* Must have IPv4 addresses */ > - tap_icmp4_send(c, VHOST_USER_RX_QUEUE, *saddr, *daddr, buf, > + tap_icmp4_send(c, flow_rx_virtqueue(&pingf->f), *saddr, *daddr, buf, > pingf->f.tap_omac, n); > } else if (pingf->f.type =3D=3D FLOW_PING6) { > const struct in6_addr *saddr =3D &ini->oaddr.a6; > const struct in6_addr *daddr =3D &ini->eaddr.a6; > =20 > - tap_icmp6_send(c, VHOST_USER_RX_QUEUE, saddr, daddr, buf, > + tap_icmp6_send(c, flow_rx_virtqueue(&pingf->f), saddr, daddr, buf, > pingf->f.tap_omac, n); > } > return; > @@ -238,17 +238,18 @@ cancel: > =20 > /** > * icmp_tap_handler() - Handle packets from tap > - * @c: Execution context > - * @pif: pif on which the packet is arriving > - * @af: Address family, AF_INET or AF_INET6 > - * @saddr: Source address > - * @daddr: Destination address > - * @data: Single packet with ICMP/ICMPv6 header > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @pif: pif on which the packet is arriving > + * @af: Address family, AF_INET or AF_INET6 > + * @saddr: Source address > + * @daddr: Destination address > + * @data: Single packet with ICMP/ICMPv6 header > + * @now: Current timestamp > * > * Return: count of consumed packets (always 1, even if malformed) > */ > -int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, > +int icmp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pi= f, sa_family_t af, > const void *saddr, const void *daddr, > struct iov_tail *data, const struct timespec *now) > { > @@ -309,6 +310,8 @@ int icmp_tap_handler(const struct ctx *c, uint8_t pif= , sa_family_t af, > else if (!(pingf =3D icmp_ping_new(c, af, id, saddr, daddr))) > return 1; > =20 > + flow_queue_set(&pingf->f, incoming_queue); > + I'd kind of like the initial setting of the flow's queue to be tied to one of the existing flow state transitions, to ensure it can't be forgotten. > tgt =3D &pingf->f.side[TGTSIDE]; > =20 > ASSERT(flow_proto[pingf->f.type] =3D=3D proto); > diff --git a/icmp.h b/icmp.h > index 1a0e6205f087..6d6d6358bb33 100644 > --- a/icmp.h > +++ b/icmp.h > @@ -10,7 +10,7 @@ struct ctx; > struct icmp_ping_flow; > =20 > void icmp_sock_handler(const struct ctx *c, union epoll_ref ref); > -int icmp_tap_handler(const struct ctx *c, uint8_t pif, sa_family_t af, > +int icmp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pi= f, sa_family_t af, > const void *saddr, const void *daddr, > struct iov_tail *data, const struct timespec *now); > void icmp_init(void); > diff --git a/tap.c b/tap.c > index 1308d49242e8..9a4399d947a3 100644 > --- a/tap.c > +++ b/tap.c > @@ -702,15 +702,17 @@ static bool tap4_is_fragment(const struct iphdr *ip= h, > =20 > /** > * tap4_handler() - IPv4 and ARP packet handler for tap file descriptor > - * @c: Execution context > - * @in: Ingress packet pool, packets with Ethernet headers > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @in: Ingress packet pool, packets with Ethernet headers > + * @now: Current timestamp > * > * Return: count of packets consumed by handlers > */ > -static int tap4_handler(struct ctx *c, const struct pool *in, > - const struct timespec *now) > +static int tap4_handler(struct ctx *c, int incoming_queue, > + const struct pool *in, const struct timespec *now) > { > + int outgoing_queue =3D incoming_queue & ~1; > unsigned int i, j, seq_count; > struct tap4_l4_t *seq; > =20 > @@ -736,7 +738,7 @@ resume: > if (!eh) > continue; > if (ntohs(eh->h_proto) =3D=3D ETH_P_ARP) { > - arp(c, VHOST_USER_RX_QUEUE, &data); > + arp(c, outgoing_queue, &data); > continue; > } > =20 > @@ -783,7 +785,7 @@ resume: > =20 > tap_packet_debug(iph, NULL, NULL, 0, NULL, 1); > =20 > - icmp_tap_handler(c, PIF_TAP, AF_INET, > + icmp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET, > &iph->saddr, &iph->daddr, > &data, now); > continue; > @@ -797,7 +799,7 @@ resume: > struct iov_tail eh_data; > =20 > packet_get(in, i, &eh_data); > - if (dhcp(c, VHOST_USER_RX_QUEUE, &eh_data)) > + if (dhcp(c, outgoing_queue, &eh_data)) > continue; > } > =20 > @@ -860,14 +862,14 @@ append: > if (c->no_tcp) > continue; > for (k =3D 0; k < p->count; ) > - k +=3D tcp_tap_handler(c, PIF_TAP, AF_INET, > + k +=3D tcp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET, > &seq->saddr, &seq->daddr, > 0, p, k, now); > } else if (seq->protocol =3D=3D IPPROTO_UDP) { > if (c->no_udp) > continue; > for (k =3D 0; k < p->count; ) > - k +=3D udp_tap_handler(c, PIF_TAP, AF_INET, > + k +=3D udp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET, > &seq->saddr, &seq->daddr, > seq->ttl, p, k, now); > } > @@ -881,15 +883,17 @@ append: > =20 > /** > * tap6_handler() - IPv6 packet handler for tap file descriptor > - * @c: Execution context > - * @in: Ingress packet pool, packets with Ethernet headers > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @in: Ingress packet pool, packets with Ethernet headers > + * @now: Current timestamp > * > * Return: count of packets consumed by handlers > */ > -static int tap6_handler(struct ctx *c, const struct pool *in, > - const struct timespec *now) > +static int tap6_handler(struct ctx *c, int incoming_queue, > + const struct pool *in, const struct timespec *now) > { > + int outgoing_queue =3D incoming_queue & ~1; > unsigned int i, j, seq_count =3D 0; > struct tap6_l4_t *seq; > =20 > @@ -965,12 +969,12 @@ resume: > continue; > =20 > ndp_data =3D data; > - if (ndp(c, VHOST_USER_RX_QUEUE, saddr, &ndp_data)) > + if (ndp(c, outgoing_queue, saddr, &ndp_data)) > continue; > =20 > tap_packet_debug(NULL, ip6h, NULL, proto, NULL, 1); > =20 > - icmp_tap_handler(c, PIF_TAP, AF_INET6, > + icmp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET6, > saddr, daddr, &data, now); > continue; > } > @@ -984,8 +988,7 @@ resume: > if (proto =3D=3D IPPROTO_UDP) { > struct iov_tail uh_data =3D data; > =20 > - if (dhcpv6(c, VHOST_USER_RX_QUEUE, &uh_data, saddr, > - daddr)) > + if (dhcpv6(c, outgoing_queue, &uh_data, saddr, daddr)) > continue; > } > =20 > @@ -1053,14 +1056,14 @@ append: > if (c->no_tcp) > continue; > for (k =3D 0; k < p->count; ) > - k +=3D tcp_tap_handler(c, PIF_TAP, AF_INET6, > + k +=3D tcp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET6, > &seq->saddr, &seq->daddr, > seq->flow_lbl, p, k, now); > } else if (seq->protocol =3D=3D IPPROTO_UDP) { > if (c->no_udp) > continue; > for (k =3D 0; k < p->count; ) > - k +=3D udp_tap_handler(c, PIF_TAP, AF_INET6, > + k +=3D udp_tap_handler(c, incoming_queue, PIF_TAP, AF_INET6, > &seq->saddr, &seq->daddr, > seq->hop_limit, p, k, now); > } > @@ -1083,22 +1086,24 @@ void tap_flush_pools(void) > =20 > /** > * tap_handler() - IPv4/IPv6 and ARP packet handler for tap file descrip= tor > - * @c: Execution context > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @now: Current timestamp > */ > -void tap_handler(struct ctx *c, const struct timespec *now) > +void tap_handler(struct ctx *c, int incoming_queue, const struct timespe= c *now) > { > - tap4_handler(c, pool_tap4, now); > - tap6_handler(c, pool_tap6, now); > + tap4_handler(c, incoming_queue, pool_tap4, now); > + tap6_handler(c, incoming_queue, pool_tap6, now); > } > =20 > /** > * tap_add_packet() - Queue/capture packet, update notion of guest MAC a= ddress > - * @c: Execution context > - * @data: Packet to add to the pool > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @data: Packet to add to the pool > + * @now: Current timestamp > */ > -void tap_add_packet(struct ctx *c, struct iov_tail *data, > +void tap_add_packet(struct ctx *c, int incoming_queue, struct iov_tail *= data, > const struct timespec *now) > { > struct ethhdr eh_storage; > @@ -1123,14 +1128,14 @@ void tap_add_packet(struct ctx *c, struct iov_tai= l *data, > case ETH_P_ARP: > case ETH_P_IP: > if (!pool_can_fit(pool_tap4, data)) { > - tap4_handler(c, pool_tap4, now); > + tap4_handler(c, incoming_queue, pool_tap4, now); This is a bit murky. You're using the incoming_queue for the new packet to process all the existiing packets in the pool. I guess that will work out because you'll have a separate pool for each thread/queue. But that's kind of unclear at this point. Maybe the queue number should be more explicitly part of the pool metadata? > pool_flush(pool_tap4); > } > packet_add(pool_tap4, data); > break; > case ETH_P_IPV6: > if (!pool_can_fit(pool_tap6, data)) { > - tap6_handler(c, pool_tap6, now); > + tap6_handler(c, incoming_queue, pool_tap6, now); > pool_flush(pool_tap6); > } > packet_add(pool_tap6, data); > @@ -1217,7 +1222,7 @@ static void tap_passt_input(struct ctx *c, const st= ruct timespec *now) > n -=3D sizeof(uint32_t); > =20 > data =3D IOV_TAIL_FROM_BUF(p, l2len, 0); > - tap_add_packet(c, &data, now); > + tap_add_packet(c, 0, &data, now); > =20 > p +=3D l2len; > n -=3D l2len; > @@ -1226,7 +1231,7 @@ static void tap_passt_input(struct ctx *c, const st= ruct timespec *now) > partial_len =3D n; > partial_frame =3D p; > =20 > - tap_handler(c, now); > + tap_handler(c, 0, now); > } > =20 > /** > @@ -1285,10 +1290,10 @@ static void tap_pasta_input(struct ctx *c, const = struct timespec *now) > continue; > =20 > data =3D IOV_TAIL_FROM_BUF(pkt_buf + n, len, 0); > - tap_add_packet(c, &data, now); > + tap_add_packet(c, 0, &data, now); > } > =20 > - tap_handler(c, now); > + tap_handler(c, 0, now); > } > =20 > /** > diff --git a/tap.h b/tap.h > index 76403a43edbc..fe9455ffcf4b 100644 > --- a/tap.h > +++ b/tap.h > @@ -119,7 +119,8 @@ int tap_sock_unix_open(char *sock_path); > void tap_sock_reset(struct ctx *c); > void tap_backend_init(struct ctx *c); > void tap_flush_pools(void); > -void tap_handler(struct ctx *c, const struct timespec *now); > -void tap_add_packet(struct ctx *c, struct iov_tail *data, > +void tap_handler(struct ctx *c, int incoming_queue, > + const struct timespec *now); > +void tap_add_packet(struct ctx *c, int incoming_queue, struct iov_tail *= data, > const struct timespec *now); > #endif /* TAP_H */ > diff --git a/tcp.c b/tcp.c > index 5ce34baa8a5a..78494a2dc69f 100644 > --- a/tcp.c > +++ b/tcp.c > @@ -1496,21 +1496,23 @@ static void tcp_bind_outbound(const struct ctx *c, > =20 > /** > * tcp_conn_from_tap() - Handle connection request (SYN segment) from tap > - * @c: Execution context > - * @af: Address family, AF_INET or AF_INET6 > - * @saddr: Source address, pointer to in_addr or in6_addr > - * @daddr: Destination address, pointer to in_addr or in6_addr > - * @th: TCP header from tap: caller MUST ensure it's there > - * @opts: Pointer to start of options > - * @optlen: Bytes in options: caller MUST ensure available length > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: TX queue number for the flow > + * @af: Address family, AF_INET or AF_INET6 > + * @saddr: Source address, pointer to in_addr or in6_addr > + * @daddr: Destination address, pointer to in_addr or in6_addr > + * @th: TCP header from tap: caller MUST ensure it's there > + * @opts: Pointer to start of options > + * @optlen: Bytes in options: caller MUST ensure available length > + * @now: Current timestamp > * > * #syscalls:vu getsockname > */ > -static void tcp_conn_from_tap(const struct ctx *c, sa_family_t af, > - const void *saddr, const void *daddr, > - const struct tcphdr *th, const char *opts, > - size_t optlen, const struct timespec *now) > +static void tcp_conn_from_tap(const struct ctx *c, int incoming_queue, > + sa_family_t af, const void *saddr, > + const void *daddr, const struct tcphdr *th, > + const char *opts, size_t optlen, > + const struct timespec *now) > { > in_port_t srcport =3D ntohs(th->source); > in_port_t dstport =3D ntohs(th->dest); > @@ -1622,6 +1624,7 @@ static void tcp_conn_from_tap(const struct ctx *c, = sa_family_t af, > conn_event(c, conn, TAP_SYN_ACK_SENT); > } > =20 > + flow_queue_set(&conn->f, incoming_queue); > tcp_epoll_ctl(c, conn); > =20 > if (c->mode =3D=3D MODE_VU) { /* To rebind to same oport after migratio= n */ > @@ -1983,16 +1986,16 @@ static void tcp_conn_from_sock_finish(const struc= t ctx *c, > =20 > /** > * tcp_rst_no_conn() - Send RST in response to a packet with no connecti= on > - * @c: Execution context > - * @queue: Queue to use to send the reply > - * @af: Address family, AF_INET or AF_INET6 > - * @saddr: Source address of the packet we're responding to > - * @daddr: Destination address of the packet we're responding to > - * @flow_lbl: IPv6 flow label (ignored for IPv4) > - * @th: TCP header of the packet we're responding to > - * @l4len: Packet length, including TCP header > - */ > -static void tcp_rst_no_conn(const struct ctx *c, int queue, int af, > + * @c: Execution context > + * @outgoing_queue: Queue to use to send the reply > + * @af: Address family, AF_INET or AF_INET6 > + * @saddr: Source address of the packet we're responding to > + * @daddr: Destination address of the packet we're responding to > + * @flow_lbl: IPv6 flow label (ignored for IPv4) > + * @th: TCP header of the packet we're responding to > + * @l4len: Packet length, including TCP header > + */ > +static void tcp_rst_no_conn(const struct ctx *c, int outgoing_queue, int= af, > const void *saddr, const void *daddr, > uint32_t flow_lbl, > const struct tcphdr *th, size_t l4len) > @@ -2050,24 +2053,25 @@ static void tcp_rst_no_conn(const struct ctx *c, = int queue, int af, > =20 > tcp_update_csum(psum, rsth, &payload); > rst_l2len =3D ((char *)rsth - buf) + sizeof(*rsth); > - tap_send_single(c, queue, buf, rst_l2len); > + tap_send_single(c, outgoing_queue, buf, rst_l2len); > } > =20 > /** > * tcp_tap_handler() - Handle packets from tap and state transitions > - * @c: Execution context > - * @pif: pif on which the packet is arriving > - * @af: Address family, AF_INET or AF_INET6 > - * @saddr: Source address > - * @daddr: Destination address > - * @flow_lbl: IPv6 flow label (ignored for IPv4) > - * @p: Pool of TCP packets, with TCP headers > - * @idx: Index of first packet in pool to process > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @pif: pif on which the packet is arriving > + * @af: Address family, AF_INET or AF_INET6 > + * @saddr: Source address > + * @daddr: Destination address > + * @flow_lbl: IPv6 flow label (ignored for IPv4) > + * @p: Pool of TCP packets, with TCP headers > + * @idx: Index of first packet in pool to process > + * @now: Current timestamp > * > * Return: count of consumed packets > */ > -int tcp_tap_handler(const struct ctx *c, uint8_t pif, > +int tcp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pif, > sa_family_t af, const void *saddr, const void *daddr, > uint32_t flow_lbl, const struct pool *p, int idx, > const struct timespec *now) > @@ -2107,11 +2111,11 @@ int tcp_tap_handler(const struct ctx *c, uint8_t = pif, > /* New connection from tap */ > if (!flow) { > if (opts && th->syn && !th->ack) > - tcp_conn_from_tap(c, af, saddr, daddr, th, > + tcp_conn_from_tap(c, incoming_queue, af, saddr, daddr, th, > opts, optlen, now); > else > - tcp_rst_no_conn(c, VHOST_USER_RX_QUEUE, af, saddr, > - daddr, flow_lbl, th, l4len); > + tcp_rst_no_conn(c, incoming_queue & ~1, af, saddr, daddr, > + flow_lbl, th, l4len); > return 1; > } > =20 > @@ -2119,6 +2123,9 @@ int tcp_tap_handler(const struct ctx *c, uint8_t pi= f, > ASSERT(pif_at_sidx(sidx) =3D=3D PIF_TAP); > conn =3D &flow->tcp; > =20 > + /* update queue */ > + flow_queue_set(&flow->f, incoming_queue); > + > flow_trace(conn, "packet length %zu from tap", l4len); > =20 > if (th->rst) { > diff --git a/tcp.h b/tcp.h > index 320683ce5679..cddd36cadc97 100644 > --- a/tcp.h > +++ b/tcp.h > @@ -15,7 +15,7 @@ void tcp_listen_handler(const struct ctx *c, union epol= l_ref ref, > const struct timespec *now); > void tcp_sock_handler(const struct ctx *c, > union epoll_ref ref, uint32_t events); > -int tcp_tap_handler(const struct ctx *c, uint8_t pif, > +int tcp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pif, > sa_family_t af, const void *saddr, const void *daddr, > uint32_t flow_lbl, const struct pool *p, int idx, > const struct timespec *now); > diff --git a/tcp_vu.c b/tcp_vu.c > index 1c81ce376dad..40f552087bc5 100644 > --- a/tcp_vu.c > +++ b/tcp_vu.c > @@ -70,15 +70,16 @@ static size_t tcp_vu_hdrlen(bool v6) > */ > int tcp_vu_send_flag(const struct ctx *c, struct tcp_tap_conn *conn, int= flags) > { > + int rx_queue =3D flow_rx_virtqueue(&conn->f); > struct vu_dev *vdev =3D c->vdev; > - struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > - size_t optlen, hdrlen; > + struct vu_virtq *vq =3D &vdev->vq[rx_queue]; > struct vu_virtq_element flags_elem[2]; > struct ipv6hdr *ip6h =3D NULL; > struct iphdr *ip4h =3D NULL; > struct iovec flags_iov[2]; > struct tcp_syn_opts *opts; > struct iov_tail payload; > + size_t optlen, hdrlen; > struct tcphdr *th; > struct ethhdr *eh; > uint32_t seq; > @@ -348,8 +349,9 @@ static void tcp_vu_prepare(const struct ctx *c, struc= t tcp_tap_conn *conn, > int tcp_vu_data_from_sock(const struct ctx *c, struct tcp_tap_conn *conn) > { > uint32_t wnd_scaled =3D conn->wnd_from_tap << conn->ws_from_tap; > + int rx_queue =3D flow_rx_virtqueue(&conn->f); > struct vu_dev *vdev =3D c->vdev; > - struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > + struct vu_virtq *vq =3D &vdev->vq[rx_queue]; > ssize_t len, previous_dlen; > int i, iov_cnt, head_cnt; > size_t hdrlen, fillsize; > diff --git a/udp.c b/udp.c > index 868ffebb5802..a0eb719888cd 100644 > --- a/udp.c > +++ b/udp.c > @@ -636,14 +636,15 @@ static int udp_sock_recverr(const struct ctx *c, in= t s, flow_sidx_t sidx, > if (hdr->cmsg_level =3D=3D IPPROTO_IP && > (o4 =3D inany_v4(&otap)) && inany_v4(&toside->eaddr)) { > dlen =3D MIN(dlen, ICMP4_MAX_DLEN); > - udp_send_tap_icmp4(c, VHOST_USER_RX_QUEUE, ee, toside, *o4, > - data, dlen); > + udp_send_tap_icmp4(c, flow_rx_virtqueue(&uflow->f), ee, toside, Kind of a something for an earlier patch, but given how frequently and where it's used, it's worth having a shorter name for flow_rx_virtqueue(). And probably a wrapper macro so it can be called on uflow directly as well. > + *o4, data, dlen); > return 1; > } > =20 > if (hdr->cmsg_level =3D=3D IPPROTO_IPV6 && !inany_v4(&toside->eaddr)) { > - udp_send_tap_icmp6(c, VHOST_USER_RX_QUEUE, ee, toside, > - &otap.a6, data, dlen, FLOW_IDX(uflow)); > + udp_send_tap_icmp6(c, flow_rx_virtqueue(&uflow->f), ee, > + toside, &otap.a6, data, dlen, > + FLOW_IDX(uflow)); > return 1; > } > =20 > @@ -971,25 +972,27 @@ fail: > =20 > /** > * udp_tap_handler() - Handle packets from tap > - * @c: Execution context > - * @pif: pif on which the packet is arriving > - * @af: Address family, AF_INET or AF_INET6 > - * @saddr: Source address > - * @daddr: Destination address > - * @ttl: TTL or hop limit for packets to be sent in this call > - * @p: Pool of UDP packets, with UDP headers > - * @idx: Index of first packet to process > - * @now: Current timestamp > + * @c: Execution context > + * @incoming_queue: Incoming queue number > + * @pif: pif on which the packet is arriving > + * @af: Address family, AF_INET or AF_INET6 > + * @saddr: Source address > + * @daddr: Destination address > + * @ttl: TTL or hop limit for packets to be sent in this call > + * @p: Pool of UDP packets, with UDP headers > + * @idx: Index of first packet to process > + * @now: Current timestamp > * > * Return: count of consumed packets > * > * #syscalls sendmmsg > */ > -int udp_tap_handler(const struct ctx *c, uint8_t pif, > +int udp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pif, > sa_family_t af, const void *saddr, const void *daddr, > uint8_t ttl, const struct pool *p, int idx, > const struct timespec *now) > { > + int outgoing_queue =3D incoming_queue & ~1; > const struct flowside *toside; > struct mmsghdr mm[UIO_MAXIOV]; > union sockaddr_inany to_sa; > @@ -1019,7 +1022,7 @@ int udp_tap_handler(const struct ctx *c, uint8_t pi= f, > src =3D ntohs(uh->source); > dst =3D ntohs(uh->dest); > =20 > - tosidx =3D udp_flow_from_tap(c, pif, af, saddr, daddr, src, dst, now); > + tosidx =3D udp_flow_from_tap(c, outgoing_queue, pif, af, saddr, daddr, = src, dst, now); > if (!(uflow =3D udp_at_sidx(tosidx))) { > char sstr[INET6_ADDRSTRLEN], dstr[INET6_ADDRSTRLEN]; > =20 > diff --git a/udp.h b/udp.h > index f1d83f380b3f..8ba4ccfe646a 100644 > --- a/udp.h > +++ b/udp.h > @@ -7,11 +7,13 @@ > #define UDP_H > =20 > void udp_portmap_clear(void); > -void udp_listen_sock_handler(const struct ctx *c, union epoll_ref ref, > - uint32_t events, const struct timespec *now); > -void udp_sock_handler(const struct ctx *c, union epoll_ref ref, > - uint32_t events, const struct timespec *now); > -int udp_tap_handler(const struct ctx *c, uint8_t pif, > +void udp_listen_sock_handler(const struct ctx *c, > + union epoll_ref ref, uint32_t events, > + const struct timespec *now); > +void udp_sock_handler(const struct ctx *c, > + union epoll_ref ref, uint32_t events, > + const struct timespec *now); > +int udp_tap_handler(const struct ctx *c, int incoming_queue, uint8_t pif, > sa_family_t af, const void *saddr, const void *daddr, > uint8_t ttl, const struct pool *p, int idx, > const struct timespec *now); > diff --git a/udp_flow.c b/udp_flow.c > index 8907f2f72741..b4a709b8d976 100644 > --- a/udp_flow.c > +++ b/udp_flow.c > @@ -266,17 +266,19 @@ flow_sidx_t udp_flow_from_sock(const struct ctx *c,= uint8_t pif, > /** > * udp_flow_from_tap() - Find or create UDP flow for tap packets > * @c: Execution context > + * @queue: RX queue number for the flow > * @pif: pif on which the packet is arriving > * @af: Address family, AF_INET or AF_INET6 > * @saddr: Source address on guest side > * @daddr: Destination address guest side > * @srcport: Source port on guest side > * @dstport: Destination port on guest side > + * @now: Current timestamp > * > * Return: sidx for the destination side of the flow for this packet, or > * FLOW_SIDX_NONE if we couldn't find or create a flow. > */ > -flow_sidx_t udp_flow_from_tap(const struct ctx *c, > +flow_sidx_t udp_flow_from_tap(const struct ctx *c, int queue, > uint8_t pif, sa_family_t af, > const void *saddr, const void *daddr, > in_port_t srcport, in_port_t dstport, > @@ -293,6 +295,8 @@ flow_sidx_t udp_flow_from_tap(const struct ctx *c, > srcport, dstport); > if ((uflow =3D udp_at_sidx(sidx))) { > uflow->ts =3D now->tv_sec; > + /* update queue */ > + flow_queue_set(&uflow->f, queue); > return flow_sidx_opposite(sidx); > } > =20 > @@ -316,6 +320,8 @@ flow_sidx_t udp_flow_from_tap(const struct ctx *c, > return FLOW_SIDX_NONE; > } > =20 > + flow_queue_set(&flow->f, queue); > + > return udp_flow_new(c, flow, now); > } > =20 > diff --git a/udp_flow.h b/udp_flow.h > index 4c528e95ca66..4a057a9d44a8 100644 > --- a/udp_flow.h > +++ b/udp_flow.h > @@ -36,7 +36,7 @@ flow_sidx_t udp_flow_from_sock(const struct ctx *c, uin= t8_t pif, > const union inany_addr *dst, in_port_t port, > const union sockaddr_inany *s_in, > const struct timespec *now); > -flow_sidx_t udp_flow_from_tap(const struct ctx *c, > +flow_sidx_t udp_flow_from_tap(const struct ctx *c, int queue, > uint8_t pif, sa_family_t af, > const void *saddr, const void *daddr, > in_port_t srcport, in_port_t dstport, > diff --git a/udp_vu.c b/udp_vu.c > index 099677f914e7..cd2c9c516d44 100644 > --- a/udp_vu.c > +++ b/udp_vu.c > @@ -202,9 +202,11 @@ static void udp_vu_csum(const struct flowside *tosid= e, int iov_used) > void udp_vu_sock_to_tap(const struct ctx *c, int s, int n, flow_sidx_t t= osidx) > { > const struct flowside *toside =3D flowside_at_sidx(tosidx); > + const struct udp_flow *uflow =3D udp_at_sidx(tosidx); > bool v6 =3D !(inany_v4(&toside->eaddr) && inany_v4(&toside->oaddr)); > + int rx_queue =3D flow_rx_virtqueue(&uflow->f); > struct vu_dev *vdev =3D c->vdev; > - struct vu_virtq *vq =3D &vdev->vq[VHOST_USER_RX_QUEUE]; > + struct vu_virtq *vq =3D &vdev->vq[rx_queue]; > int i; > =20 > for (i =3D 0; i < n; i++) { > diff --git a/vu_common.c b/vu_common.c > index 8904403e66af..56f26317b192 100644 > --- a/vu_common.c > +++ b/vu_common.c > @@ -196,11 +196,11 @@ static void vu_handle_tx(struct vu_dev *vdev, int i= ndex, > =20 > data =3D IOV_TAIL(elem[count].out_sg, elem[count].out_num, 0); > if (IOV_DROP_HEADER(&data, struct virtio_net_hdr_mrg_rxbuf)) > - tap_add_packet(vdev->context, &data, now); > + tap_add_packet(vdev->context, index, &data, now); > =20 > count++; > } > - tap_handler(vdev->context, now); > + tap_handler(vdev->context, index, now); > =20 > if (count) { > int i; > --=20 > 2.51.0 >=20 --=20 David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson --mePDtlOii17f5X2C Content-Type: application/pgp-signature; name=signature.asc -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEO+dNsU4E3yXUXRK2zQJF27ox2GcFAmkRfgUACgkQzQJF27ox 2GdMSRAAqkg0CvOjFooLLyod2jmq26j/Ngo47BQWQOJ5oJzFmtqRWZLsv4bH4dDP RzgLPbI0i+AkcF0bs709FzivRKnXZ1emSO2k4epTW46YOhwrW0sqZi4kNNAWi8xR c2PxCNVLFivi8Pbd8lbC7yGz2Jg6ieP1pEo3vk2f7ospfIOMmHIHivWl+2+9Fr3K r/gy0KC9A1xjv+f5owXnFPqWlrVQN2pK088RHIO6VF+kqOTFkGFYToLx+QFEHHn0 QX70XlhJcTSk06eeO0l4Jsryqqb+ZWSybveVOyZO57IQxuVOTz/+CowFievHZkDS vKlCHjDOs+inhSUvsCuWhsHiZBEWGlTuQFxtJTes9XaSUNeGryf2kt41qCdci1v6 Zko6nBEEZx7S5Gpcx+HqSCZuNZeA64rQMkBAJKjwS7cHp0n2i4CcC6ich/b6gq4a bDZlKw+ns4sX60qE3R4jvBE8xYj3nZhVdcvBDpm5TRNqjGyLZO4OKGfZDTqVwNen f7bHAKBtDh6xx/JcOgv/xM/Oq8UKOo8AkWxSr4nDVBZqJufgWm95NrE2lwKf16Jf mNMVhNGu7B3Fv1KamrqzKtL2C7dlex3NJFWim2nwPFNkfAUxW7z4TXV6cxX55hvY hbFYMKOewKCHL/tDdS/wlLpteC13xSVbOeMyOadVyrSgqAN4zxY= =3Spo -----END PGP SIGNATURE----- --mePDtlOii17f5X2C--