Previously, every packet we see gets a lock on the conntrack table and updates it. When running with multiple routines, this can cause heavy lock contention and limit our ability for the threads to run independently. This change caches reads from the conntrack table for a very short period of time to reduce this lock contention. This cache will currently default to disabled unless you are running with multiple routines, in which case the default cache delay will be 1 second. This means that entries in the conntrack table may be up to 1 second out of date and remain in a routine local cache for up to 1 second longer than the global table.
Instead of calling time.Now() for every packet, this cache system relies on a tick thread that updates the current cache "version" each tick. Every packet we check if the cache version is out of date, and reset the cache if so.
This change is for Linux only.
Previously, when running with multiple tun.routines, we would only have one file descriptor. This change instead sets IFF_MULTI_QUEUE and opens a file descriptor for each routine. This allows us to process with multiple threads while preventing out of order packet reception issues.
To attempt to distribute the flows across the queues, we try to write to the tun/UDP queue that corresponds with the one we read from. So if we read a packet from tun queue "2", we will write the outgoing encrypted packet to UDP queue "2". Because of the nature of how multi queue works with flows, a given host tunnel will be sticky to a given routine (so if you try to performance benchmark by only using one tunnel between two hosts, you are only going to be using a max of one thread for each direction).
Because this system works much better when we can correlate flows between the tun and udp routines, we are deprecating the undocumented "tun.routines" and "listen.routines" parameters and introducing a new "routines" parameter that sets the value for both. If you use the old undocumented parameters, the max of the values will be used and a warning logged.
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
We noticed that the number of memory allocations LightHouse.HandleRequest creates for each call can seriously impact performance for high traffic lighthouses. This PR introduces a benchmark in the first commit and then optimizes memory usage by creating a LightHouseHandler struct. This struct allows us to re-use memory between each lighthouse request (one instance per UDP listener go-routine).
* Remove unused (*udpConn).Read method
* Align linux UDP performance optimizations with configuration
While attempting to run nebula on an older Synology NAS, it became
apparent that some of the performance optimizations effectively
block support for older kernels. The recvmmsg syscall was added in
linux kernel 2.6.33, and the Synology DS212j (among other models)
is pinned to 2.6.32.12.
Similarly, SO_REUSEPORT was added to the kernel in the 3.9 cycle.
While this option has been backported into some older trees, it
is also missing from the Synology kernel.
This commit allows nebula to be run on linux devices with older
kernels if the config options are set up with a single listener
and a UDP batch size of 1.
* Use golang.org/x/sys/unix for _linux.go sources
To support builds on GOARCH=386 and possibly elsewhere, it's necessary
to use the x/sys/unix package instead of the syscall package. This is
because the syscall package is frozen and does not support
SYS_GETSOCKNAME, SYS_RECVFROM, nor SYS_SENDTO for GOARCH=386.
This commit alone doesn't add support for 386 builds, just gets things
onto x/sys/unix so that it's possible.
The remaining uses of the syscall package relate to signals, which
cannot be switched to the x/sys/unix package at this time. Windows
support breaks, so they can either continue using the syscall package
(it's frozen, this is safe for Go 1.x at minimum), or something can be
written to just use both windows- and unix-compatible signals.
* Add linux-386, ppc64le targets to Makefile
Because 'linux' is linux-amd64 already, just add linux-386 and
linux-ppc64le targets to distinguish them. Would rename the linux
target but that might break existing uses.