On Mon, Mar 17, 2025 at 01:40:51PM -0400, Jon Maloy wrote: > > > On 2025-03-16 22:58, David Gibson wrote: > > On Sat, Mar 15, 2025 at 11:32:45AM -0400, Jon Maloy wrote: > > > Now that ICMP pass-through from socket-to-tap is in place, it is > > > easy to support UDP based traceroute functionality in direction > > > tap-to-socket. > > [...] > > > > + uint8_t ttl, const struct pool *p, int idx, > > > + const struct timespec *now) > > > { > > > const struct flowside *toside; > > > struct mmsghdr mm[UIO_MAXIOV]; > > > @@ -933,6 +935,12 @@ int udp_tap_handler(const struct ctx *c, uint8_t pif, > > > mm[i].msg_hdr.msg_controllen = 0; > > > mm[i].msg_hdr.msg_flags = 0; > > > + if (ttl <= 30) { > > > + if (setsockopt(s, IPPROTO_IP, IP_TTL, > > > + &ttl, sizeof(ttl)) < 0) > > > > AFAIK this will set the TTL on every subsequent packet, not just the > > next one, so this isn't quite right. To use this I guess we'd have to > > store the correct TTL in the flow, and do the sockopt if it's > > different from the guest side one. > > I forgot one thing at our discussion about this at our meeting. > Traceroute works by sending three packets with ttl=1, then > three with ttl=2, then three with ttl=3 and so forth. > For each packet sent it steps the destination port number, > starting at 33334, and in the most extreme case (assuming it > stops at ttl=32, which linux traceroute does in practice) > stopping at port number 33524. > The whole port range [33434:33535] is unassigned by IANA, and > no sane developer would use it for his application, for obvious reasons. > > This means that every single traceroute/udp message will become > a separate flow, and there is no need to complicate things by > storing the ttl in the flow or use it as criteria. It also > means that my current approach works fine. I don't even need > to restore the ttl in the outgoing socket, since that, just > like the associated flow, is a single-shot affair. Urgh. I mean, yes it works for (the current implementation of) traceroute, but doing it like this really isn't correct for anything else that manipulates TTL. I mean, yes, traceroute is *probably* the only thing to do that, but I think we should endeavour to be more generally correct if it's not too complicated. And in this case, I don't think it's too complicated. > > Unless there's a way to set TTL > > via cmsg. Maybe there is, but I haven't spotted it in a quick glance. > > I seems I can add it as ancillary data if I add a control buffer to struct > mmsghdr.msghdr. However, it is hard to see any benefit of this, given the > above. It will cost more performance on the critical code path > than the simple test I am doing now. Thinking about this more, I don't actually think setting the TTL per-packet with a cmsg is particularly useful anyway. For traceroute to a small extent - and I suspect in the unlikely case of anything else manipulating TTL - it seems like the guest is likely to use the same TTL for multiple packets in a row. That means just calling the setsockopt() when it changes from the current value should work well. -- David Gibson (he or they) | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you, not the other way | around. http://www.ozlabs.org/~dgibson