On Wed, Feb 28, 2024 at 01:30:05PM +1100, David Gibson wrote: > On Tue, Feb 27, 2024 at 12:56:59PM +0100, Stefano Brivio wrote: > > On Thu, 22 Feb 2024 10:21:12 +1100 > > David Gibson wrote: [snip] > > I'm having a hard time reviewing the rest because of this, I think. > > Earlier, UDP_ACT_TAP meant that the activity timestamp was observed on > > a non-spliced port. > > Ah! I'd read those as "ACTions we needed to take" in the expiry > timers, rather than tying them to specific activity. It amounts to > the same thing pre-patch, but it makes more sense to me with that new > lens. > > > Now, we have UDP_ACT_FLAGS which seems to mean that activity was seen > > in the direction of (to) the "tap" side, and UDP_ACT_TAP_SOCK meaning > > that activity was seen in the direction of (to) the socket. > > > > But that's somehow the opposite of UDP_ACT_SPLICE_NS and > > UDP_ACT_SPLICE_INIT -- even though they don't actually indicate > > activity, rather the fact that activity timestamps refer to sockets that > > originated in (*from*) the target namespace, or in the initial > > namespace. > > > > Anyway, I'm not quite sure: why do we need two separate flags for "tap" > > (from/to) activity? > > We don't, this is an ugly side effect. The problem is that in > scanning for expiries we only have a single port number. But the > socket is associated with a particular forwarding port, while the > flags are associated with a particular endpoint port. Without flow > tracking, we don't have a way to match one to the other. > > The compromise here is - or is supposed to be - that we have > independent timestamps for the socket and the flags. The two > timestamps associated with a single flow are supposed to be updated at > basically the same time - looks like I missed some things to do that > as well. > > > I mean, as long as we observe activity on a non-spliced flow, we'll > > update the related timestamp, and we make sure the activity flag is > > set. Can't it simply be UDP_ACT_TAP? Urgh. So it seems like every time I re-examine this, I find more pre-existing bugs. So, part of what this patch was attempting to fix originally is that previously, although we updated the timestamp host->guest packets, we updated the wrong one for expiring the socket (based on endpoint rather than forwarding port). In trying to fix it up, I additionally realised, we currently don't update the timestamp for host to guest packets at all, except in the when remapping the src to gateway address, which as noted previously itself doesn't honour the no_map_gw flag. *Then* I realised that if we do update the correct timestamp and ACT flag, we could now set an ACT flag for a socket created by -u, and therefore cause a port that should remain open to be expired. I don't think this any longer counts as a "small" fix, so I will remove it from this series and tackle it elsewhere. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson