net - Fix vlan input packet processing w/ if_bridge, if_carp, etc

* if_bridge does not understand VLAN-tagged packets, so do not try to bridge such packets from the primary interface. If the user wants to bridge such packets, it can be done via the virtual vlan interface, and the vlan tag can be regenerated (or not) with appropriate bridge groupings. This was causing unicast vlan packets to be discarded in the bridge code.
* Unicast VLAN-tagged packets were not being properly bpf tapped on the virtual vlan interface.
* Carp should operate on vlan interfaces, not the original interface, when presented with a VLAN-tagged packet.
* Fix all of this by having ether_input_oncpu() bypass more or less directly to ether_demux_oncpu() when an M_VLANTAG packet is encountered. if_vlan will then issue ether_reinput_oncpu() from the appropriate virtual vlan interface, which ultimately re-enters ether_input_oncpu() without the tag.

This is more along the lines of how we want vlans to be treated. They really are supposed to be virtual LANs.
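The re-entry path described in the commit above can be modelled roughly as follows. This is a simplified sketch, not the kernel code: every `sketch_*` name, the `sketch_pkt` structure, and the interface numbering are hypothetical stand-ins, and the bridge/carp/bpf stage is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interface numbering, for illustration only. */
enum { SKETCH_IF_PHYS = 1, SKETCH_IF_VLAN_BASE = 100 };

/* Hypothetical packet model; vlantag stands in for M_VLANTAG. */
struct sketch_pkt {
    bool vlantag;    /* packet still carries a VLAN tag */
    int  vlan_id;    /* tag value */
    int  rcvif;      /* interface the packet is attributed to */
    int  stage_runs; /* how often the bridge/carp/bpf stage ran */
    int  stage_if;   /* interface seen by that stage */
};

static void sketch_ether_input(struct sketch_pkt *p);

/* Models if_vlan: strip the tag, switch to the virtual interface, re-input. */
static void sketch_vlan_input(struct sketch_pkt *p)
{
    p->vlantag = false;
    p->rcvif = SKETCH_IF_VLAN_BASE + p->vlan_id;
    sketch_ether_input(p);
}

/* Models the demux step: tagged packets are handed to the vlan layer. */
static void sketch_ether_demux(struct sketch_pkt *p)
{
    if (p->vlantag) {
        sketch_vlan_input(p);
        return;
    }
    /* Protocol dispatch (IP, ARP, ...) would happen here. */
}

/* Models the input step: tagged packets skip the bridge/carp/bpf stage. */
static void sketch_ether_input(struct sketch_pkt *p)
{
    if (p->vlantag) {
        sketch_ether_demux(p);
        return;
    }
    p->stage_runs++;          /* bridge/carp/bpf would run here */
    p->stage_if = p->rcvif;
    sketch_ether_demux(p);
}
```

In this model a packet tagged with VLAN 7 arriving on the physical interface reaches the bridge/carp/bpf stage exactly once, and only after it has been re-attributed to the virtual interface.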
inet6: Introduce nd6_resolve, the mirror of arpresolve

nd6_output now just sends the packet; nd6_resolve handles the NUD that nd6_output used to handle. nd6_resolve also returns sensible errors, but we mask out EWOULDBLOCK in the callers. There is no longer a need for nd6_storelladdr, and this makes the code a lot easier to follow.

Heavily inspired by FreeBSD/Git 49332534.
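The split between resolving and sending, with EWOULDBLOCK masked in the caller, might look roughly like this. It is a sketch under stated assumptions: the `sketch_*` functions, the neighbour states, and the choice of EHOSTUNREACH for a failed resolution are all hypothetical and do not reflect the real nd6_resolve signature or error set.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical neighbour states, for illustration only. */
enum sketch_nb_state {
    SKETCH_NB_REACHABLE,   /* link-layer address known */
    SKETCH_NB_INCOMPLETE,  /* resolution in progress, packet is held */
    SKETCH_NB_FAILED       /* resolution gave up */
};

/* Stand-in for the resolve step: returns 0 or an errno value. */
static int sketch_resolve(enum sketch_nb_state st)
{
    switch (st) {
    case SKETCH_NB_REACHABLE:
        return 0;
    case SKETCH_NB_INCOMPLETE:
        return EWOULDBLOCK;   /* not an error from the caller's view */
    default:
        return EHOSTUNREACH;  /* assumed error code for this sketch */
    }
}

/* Stand-in for a caller: sends on success, masks EWOULDBLOCK. */
static int sketch_output(enum sketch_nb_state st, int *sent)
{
    int error = sketch_resolve(st);

    if (error == 0) {
        (*sent)++;            /* the actual transmit would go here */
        return 0;
    }
    if (error == EWOULDBLOCK)
        return 0;             /* packet is queued pending resolution */
    return error;
}
```

The caller only reports genuine failures upward; a pending resolution looks like success because the packet is assumed to be queued rather than lost.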
inet: return EHOSTDOWN if we cannot resolve an address in time

This allows programs to make informed decisions about what to do if anything goes wrong while trying to resolve the address. For example, ping(8) now reports "sendto: Host is down", which is more useful than not reporting anything.

Taken-from: NetBSD
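As an illustration of the kind of decision a program can now make, the sketch below classifies the errno left by a failed send. The `sketch_*` names and the retry policy are assumptions made for this example; only the EHOSTDOWN constant comes from `<errno.h>` as found on the BSDs and Linux.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical actions an application might take after a send attempt. */
enum sketch_send_action {
    SKETCH_SEND_OK,
    SKETCH_SEND_RETRY_LATER,  /* peer did not resolve in time */
    SKETCH_SEND_GIVE_UP
};

/* Map the errno from a failed sendto(2) to an action (assumed policy). */
static enum sketch_send_action sketch_classify_send_errno(int err)
{
    switch (err) {
    case 0:
        return SKETCH_SEND_OK;
    case EHOSTDOWN:
        return SKETCH_SEND_RETRY_LATER;
    default:
        return SKETCH_SEND_GIVE_UP;
    }
}
```

A caller would pass the saved `errno` after `sendto()` returns -1, and 0 otherwise.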
ipfw: Implement state based "redirect", i.e. without using libalias.

Redirection creates two states, i.e. one before the translation (xlat0) and one after the translation (xlat1). If the hash of the translated packet indicates that it is owned by a remote CPU:
- If the packet triggers the state pair creation, the 'xlat1' will be piggybacked on the translated packet, which will be forwarded to the remote CPU for further evaluation. The 'xlat1' will be installed on the remote CPU before the evaluation of the translated packet.
- Else only the translated packet will be forwarded to the remote CPU for further evaluation.

The 'xlat1' is called the slave state, which will be deleted only when the 'xlat0' (the master state) is deleted. The state pair is always deleted on the CPU owning the 'xlat1'; the 'xlat0' will be forwarded there.

The reference counting of the state pair is maintained independently in each state; the memory of the state pair will be freed only after the sum of the counters in the two states reaches 0. This avoids expensive per-packet atomic ops.

As far as I have tested, this implementation of "redirect" does _not_ introduce any noticeable performance reduction, latency increase or latency instability.

This commit makes most of the necessary bits for NAT ready too.
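The "free only when the sum of the counters reaches 0" rule can be sketched in plain C. The structures and `sketch_*` functions below are hypothetical, and the sketch assumes that a reference may be taken through one state and dropped through the other, so an individual counter can go negative while the pair is still live; the commit message does not spell out that detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-state counter; each one is touched by a single CPU. */
struct sketch_xlat_state {
    int crefs;
};

/* Hypothetical state pair: xlat0 before translation, xlat1 after. */
struct sketch_xlat_pair {
    struct sketch_xlat_state xlat0;
    struct sketch_xlat_state xlat1;
    bool freed;
};

/* Plain, non-atomic updates: only the owning CPU modifies its counter. */
static void sketch_ref(struct sketch_xlat_state *s)   { s->crefs++; }
static void sketch_unref(struct sketch_xlat_state *s) { s->crefs--; }

/* Called at deletion time: release memory only if the sum is zero. */
static bool sketch_pair_try_free(struct sketch_xlat_pair *p)
{
    if (p->xlat0.crefs + p->xlat1.crefs != 0)
        return false;
    p->freed = true;   /* the real code would release memory here */
    return true;
}
```

Because each counter is only ever modified by its owning CPU, the per-packet path needs no atomic instructions; the sum is assumed to be evaluated only on the deletion path.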
polling: Implement direct input support.

When "direct input" is enabled by the driver, the driver's RX polling handler will run ethernet/ip/tcp processing directly, which avoids cache misses on the mbufs themselves. Currently it is enabled on ix(4) by default.

The normal IP forwarding performance is improved by 12%, while the fast IP forwarding performance is improved by 10%. 13.2Mpps is achieved for dual side IP forwarding!

1 request/connection HTTP/1.1 performance and avg-latency stay the same, but the latency is further stabilized:
1K   5.20ms ->  4.60ms
8K   6.43ms ->  5.76ms
16K 16.30ms -> 14.90ms
ifnet: Allow drivers to adjust mbuf cluster/jcluster limits

This is mainly for raising the mbuf cluster/jcluster limits to a high enough value for device reception queues, e.g. on modern network devices w/ multiple reception queues, where each reception queue could consume >=512 mbuf clusters.
kernel: Move us to using M_NOWAIT and M_WAITOK for mbuf functions.

The main reason is that our having to use the MB_WAIT and MB_DONTWAIT flags was a recurring issue when porting drivers from FreeBSD because it tended to get forgotten and the code would compile anyway with the wrong constants. And since MB_WAIT and MB_DONTWAIT ended up as ocflags for an objcache_get() or objcache_reclaimlist call (which use M_WAITOK and M_NOWAIT), it was just one big converting back and forth with some sanitization in between.

This commit allows M_* again for the mbuf functions and keeps the sanitizing as it was before: when M_WAITOK is among the passed flags, objcache functions will be called with M_WAITOK and when it is absent, they will be called with M_NOWAIT. All other flags are scrubbed by the MB_OCFLAG() macro which does the same as the former MBTOM().

Approved-by: dillon
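The sanitizing rule described above can be expressed as a one-line macro. The sketch below is a guess at its shape based only on the description in this commit message: the flag values are hypothetical placeholders, not the kernel's constants, and the macro body is an assumption rather than the actual definition of MB_OCFLAG().

```c
#include <assert.h>

/* Hypothetical flag values; the real kernel constants may differ. */
#define M_NOWAIT 0x0001
#define M_WAITOK 0x0002
#define M_ZERO   0x0100   /* stands in for "any other flag" */

/*
 * Assumed behaviour: keep only the wait semantics. M_WAITOK survives if
 * present; everything else collapses to M_NOWAIT.
 */
#define MB_OCFLAG(how) (((how) & M_WAITOK) ? M_WAITOK : M_NOWAIT)
```

With these placeholder values, `MB_OCFLAG(M_WAITOK | M_ZERO)` yields `M_WAITOK`, while `MB_OCFLAG(M_ZERO)` and `MB_OCFLAG(M_NOWAIT)` both yield `M_NOWAIT`.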