mbuf(9): Add 'm_pkthdr.loop_cnt' for loop detection

Extend the 'm_pkthdr' struct to provide the 'loop_cnt' member by using
currently unused space. Drivers (e.g., gif, gre, wg) can then use this
new member to easily implement loop detection.

Bump __DragonFly_version.

Discussed-with: dillon
Referred-to: OpenBSD
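A loop-detection scheme built on such a counter can be sketched in plain C. The struct, limit, and function names below are illustrative stand-ins, not the kernel's; the real counter lives in 'm_pkthdr.loop_cnt'.

```c
#include <stddef.h>

/* Toy stand-in for the mbuf packet header; illustrative only. */
struct toy_pkthdr {
	unsigned int loop_cnt;
};

#define TOY_MAX_NESTING 3	/* illustrative limit, not the kernel's */

/*
 * What a tunnel driver's output path might do: bump the counter carried
 * with the packet and reject the packet once the encapsulation depth
 * exceeds the limit, breaking gif/gre/wg-style encapsulation loops.
 */
static int
toy_tunnel_output(struct toy_pkthdr *ph)
{
	if (++ph->loop_cnt > TOY_MAX_NESTING)
		return -1;	/* a driver would m_freem() and return ELOOP */
	return 0;		/* continue encapsulating */
}
```

Because the counter travels with the packet header, every nested tunnel sees the depth accumulated by the tunnels before it, with no per-driver bookkeeping.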
mbuf(9): Various minor updates and style cleanups

- Fix the comment that MSIZE/MCLBYTES is defined in <sys/param.h>
  instead of <machine/param.h>; update the man page accordingly.
- Adjust some type casts in mtod() to be more consistent.
- Add the '__unused' attribute to actually unused parameters.
- Remove unused NCL_INIT/NMB_INIT macros from 'uipc_mbuf.c'.
- Use '__func__' instead of hard-coding function names.
- Fix several typos.
- Various style cleanups, mainly whitespace adjustments.
mbuf(9): Use 'void *' in several public APIs to save casts in callers

Update the following public APIs to use 'void *' or 'const void *'
instead of 'caddr_t'/'c_caddr_t'/'char *', so that callers no longer
need to do explicit casts:

- m_append()
- m_copyback()
- m_copyback2()
- m_copydata()
- m_devget()
- m_extadd()
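The effect on call sites can be shown with two stand-in prototypes; the names below are illustrative and only mimic the old caddr_t signature versus the new 'void *' one.

```c
#include <string.h>

typedef char *toy_caddr_t;	/* mimics the kernel's caddr_t */

/* Old-style prototype: every caller needs an explicit cast. */
static void
toy_copydata_old(const char *src, size_t len, toy_caddr_t dst)
{
	memcpy(dst, src, len);
}

/* New-style prototype: 'void *' accepts any object pointer as-is. */
static void
toy_copydata_new(const char *src, size_t len, void *dst)
{
	memcpy(dst, src, len);
}
```

A caller filling a header struct previously had to write `toy_copydata_old(buf, sizeof(hdr), (toy_caddr_t)&hdr)`; with the 'void *' prototype, `toy_copydata_new(buf, sizeof(hdr), &hdr)` compiles cleanly without the cast.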
mbuf(9): Add m_copyback2() for a better m_copyback()

The existing m_copyback() may extend the mbuf chain if necessary, but
it doesn't return a value to indicate whether the allocation failed.
In addition, it doesn't allow specifying the M_WAITOK/M_NOWAIT flag
for mbuf allocation.

Extend m_copyback() into m_copyback2(), which takes a 'how' parameter
to specify the M_WAITOK/M_NOWAIT flag and returns an error code to
indicate success or failure. Reimplement the original m_copyback()
using m_copyback2() with how=M_NOWAIT.

Referred-to: OpenBSD
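The relationship between the two functions can be sketched with a toy growable buffer standing in for an mbuf chain. All names are illustrative, and realloc() stands in for the kernel allocator; this is a sketch of the contract, not the kernel implementation.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define TOY_M_NOWAIT	0x1
#define TOY_M_WAITOK	0x2

/* Toy stand-in for an mbuf chain: a growable flat buffer. */
struct toy_chain {
	char  *buf;
	size_t len;
};

/*
 * Sketch of the m_copyback2() contract: copy 'len' bytes at offset
 * 'off', growing the chain if necessary, and report allocation failure
 * with an error code instead of failing silently.
 */
static int
toy_copyback2(struct toy_chain *c, size_t off, size_t len,
    const void *src, int how)
{
	(void)how;	/* the kernel would hand 'how' to the allocator */
	if (off + len > c->len) {
		char *p = realloc(c->buf, off + len);
		if (p == NULL)
			return (ENOBUFS);	/* caller sees the failure */
		memset(p + c->len, 0, off + len - c->len);
		c->buf = p;
		c->len = off + len;
	}
	memcpy(c->buf + off, src, len);
	return (0);
}

/* The original m_copyback() becomes a thin wrapper with how=M_NOWAIT. */
static void
toy_copyback(struct toy_chain *c, size_t off, size_t len, const void *src)
{
	(void)toy_copyback2(c, off, len, src, TOY_M_NOWAIT);
}
```

Callers that care about allocation failure use the two-argument-richer variant and check its return value; legacy callers keep the old void signature unchanged.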
mbuf(9): Remove obsolete and unused 'kern.ipc.mbuf_wait' sysctl

This sysctl MIB has been obsolete and unused since the
re-implementation of mbuf allocation using objcache(9) in commit
7b6f875 (year 2005). Remove this sysctl MIB.

Update the mbuf.9 manpage about the 'how' argument to avoid ambiguity,
i.e., MGET()/m_get() etc. would not fail if how=M_WAITOK.
ip_forward: Migrate cpu if hash doesn't match.

Packet filter re-writes can cause the call to ip_forward to be on the
wrong CPU. Detect this case and correct it.

Check M_HASH at the beginning of ip_input and dispatch to a new CPU if
we aren't in the right place. This mirrors what is done for packets
that are destined to the transport layer. This causes ip_forward and
ip_output to be called on the correct CPU, including any states that
are created by output rules.
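The redispatch decision can be sketched as follows. The hash-to-CPU mapping, flag value, and names here are illustrative stand-ins for the kernel's M_HASH machinery, not its actual code.

```c
#define TOY_NCPUS	4
#define TOY_M_HASH	0x1	/* packet carries a valid flow hash */

struct toy_pkt {
	unsigned int hash;	/* flow hash, possibly stale after pf rewrite */
	int	     flags;
};

/* Illustrative hash->cpu mapping. */
static int
toy_hash_cpu(const struct toy_pkt *p)
{
	return (int)(p->hash % TOY_NCPUS);
}

/*
 * Return the CPU that should continue processing the packet.  If the
 * packet carries a valid hash that maps elsewhere, the caller would
 * forward the packet to that CPU (e.g. via an lwkt message) so that
 * ip_forward()/ip_output() run where the flow's state lives.
 */
static int
toy_dispatch_cpu(const struct toy_pkt *p, int curcpu)
{
	if ((p->flags & TOY_M_HASH) && toy_hash_cpu(p) != curcpu)
		return toy_hash_cpu(p);
	return curcpu;
}
```

Performing this check at the top of input processing means any state created later by output rules is created on the CPU that will also see the return traffic for the flow.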
Remove IPsec and related code from the system.

It was unmaintained ever since we inherited it from FreeBSD 4.8. In
fact, we had two implementations from that time: IPSEC and FAST_IPSEC.
FAST_IPSEC is the implementation to which FreeBSD has since moved, but
it didn't even build in DragonFly.

Fixes for dports have been committed to DeltaPorts.

Requested-by: dillon
Dports-testing-and-fixing: zrj
ipfw: Implement state based "redirect", i.e. without using libalias.

Redirection creates two states: one before the translation (xlat0) and
one after the translation (xlat1).

If the hash of the translated packet indicates that it is owned by a
remote CPU:
- If the packet triggers the state pair creation, the 'xlat1' will be
  piggybacked by the translated packet, which will be forwarded to the
  remote CPU for further evaluation. The 'xlat1' will be installed on
  the remote CPU before the evaluation of the translated packet.
- Otherwise, only the translated packet will be forwarded to the
  remote CPU for further evaluation.

The 'xlat1' is called the slave state; it will be deleted only when
the 'xlat0' (the master state) is deleted. The state pair is always
deleted on the CPU owning the 'xlat1'; the 'xlat0' will be forwarded
there.

The reference counting of the state pair is maintained independently
in each state; the memory of the state pair is freed only after the
sum of the counters in the two states reaches 0. This avoids expensive
per-packet atomic ops.

As far as I have tested, this implementation of "redirect" does _not_
introduce any noticeable performance reduction, latency increase or
latency destabilization.

This commit also makes most of the bits necessary for NAT ready.
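The split reference count can be modeled with a toy pair in which each half is only ever touched by the CPU owning that half, so the hot path needs no atomic ops. The names and layout below are illustrative, not the ipfw structures.

```c
#include <stdlib.h>

/* Toy model of the xlat0/xlat1 state pair.  Each half keeps its own
 * reference counter, touched only by the CPU owning that half. */
struct toy_xlat_pair {
	int ref0;	/* references held through xlat0 (master) */
	int ref1;	/* references held through xlat1 (slave) */
};

/*
 * Drop one reference through the given half.  The pair's memory is
 * freed only once the sum of both counters reaches 0.  Returns 1 when
 * the pair was freed, 0 otherwise.
 */
static int
toy_xlat_unref(struct toy_xlat_pair *p, int slave)
{
	if (slave)
		p->ref1--;
	else
		p->ref0--;
	if (p->ref0 + p->ref1 == 0) {
		free(p);
		return 1;
	}
	return 0;
}
```

Since only the final teardown examines both counters (and that happens on a single CPU, the one owning 'xlat1'), the per-packet reference manipulation stays a plain non-atomic increment/decrement.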
kernel - Remove PG_ZERO and zeroidle (page-zeroing) entirely

* Remove the PG_ZERO flag and remove all page-zeroing optimizations,
  entirely. After doing a substantial amount of testing, these
  optimizations, which existed all the way back to CSRG BSD, no longer
  provide any benefit on a modern system.

  - Pre-zeroing a page only takes 80ns on a modern cpu. vm_fault
    overhead in general is at least ~1 microsecond.

  - Pre-zeroing a page leads to a cold-cache case on use, forcing the
    fault source (e.g. a userland program) to actually get the data
    from main memory in its likely immediate use of the faulted page,
    reducing performance.

  - Zeroing the page at fault-time is actually more optimal because it
    does not require any reading of dynamic ram and leaves the cache
    hot.

  - Multiple synth and build tests show that active idle-time zeroing
    of pages actually reduces performance somewhat, and incidental
    allocations of already-zeroed pages (from page-table tear-downs)
    do not affect performance in any meaningful way.

* Remove bcopyi() and obbcopy() -> collapse into bcopy(). These other
  versions existed because bcopy() used to be specially-optimized and
  could not be used in all situations. That is no longer true.

* Remove the bcopy function pointer argument to m_devget(). It is no
  longer used. This function existed to help support ancient drivers
  which might have needed a special memory copy to read and write
  mapped data. It has long been supplanted by BUSDMA.