kernel - Misc adjustments used by the vkernel and VMM, misc optimizations

* This section committed separately because it is basically independent
  of VMM.

* Improve pfind().  Don't get proc_token if the process being looked up
  is the current process.

* Improve kern_kill().  Do not obtain proc_token any more.  p->p_token
  is sufficient and the process group has its own lock now.

* Call pthread_yield() when spinning on various things.

  x Spinlocks
  x Tokens (spinning in lwkt_switch)
  x cpusync (ipiq)

* Rewrite sched_yield() -> dfly_yield().  dfly_yield() will
  unconditionally round-robin the LWP, ignoring estcpu.  It isn't
  perfect but it works fairly well.

  The dfly scheduler will also no longer attempt to migrate threads
  across cpus when handling yields.  They migrate normally in all
  other circumstances.

  This fixes situations where the vkernel is spinning waiting for
  multiple events from other cpus and in particular when it is doing
  a global IPI for pmap synchronization of the kernel_pmap.

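The "yield while spinning" idea above can be illustrated in userland with a minimal sketch. This is not the vkernel code: the function names are hypothetical, and the portable POSIX sched_yield() is swapped in for pthread_yield(). The lock is a plain C11 atomic_flag, and the attempt limit exists only so the example terminates.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper: try to take a test-and-set lock, giving up the
 * cpu between failed attempts instead of burning it in a tight loop.
 * Returns true once the lock is acquired, false if max_attempts
 * attempts all found the lock held.
 */
bool spin_lock_yielding(atomic_flag *lock, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        /* test_and_set returns the previous value: false means we won */
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;
        /* Lock is held elsewhere: let the holder run (POSIX call). */
        sched_yield();
    }
    return false;
}

/* Hypothetical helper: release the lock taken above. */
void spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

Yielding matters most when the lock holder may be descheduled on the same physical cpu, which is the situation a vkernel running as a host process can find itself in.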
tcp: Use mbuf jcluster (4K) in sosendtcp(); improve 10Ge TSO performance

For the Myricom PCIE-8AL-C I have tested, this gives a ~900Mbps
performance boost using 1500 MTU when TSO is enabled (from 7.7Gbps to
8.6Gbps).

Using mbuf jcluster could:
- Reduce the number of TX descriptors needed for one TSO packet.
- Let the NIC chip perform longer data bursts.
I believe these are the main reasons for the 10Ge performance boost.

Reduce sosend_agglim from 3 to 2, which means 8K aggregation (was 6K
aggregation) before the user thread dispatches the sending buf to
netisr.

net.inet.tcp.sosend_jcluster is added to enable this feature; it is
enabled by default.

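The aggregation and descriptor arithmetic above can be checked with a short sketch. The 2K and 4K buffer sizes follow from the figures in the message (3 x 2K = 6K, 2 x 4K = 8K). The 65535-byte TSO payload and the helper names are assumptions made for illustration, and one descriptor per buffer is a simplification.

```c
#include <stddef.h>

/* Buffer sizes implied by the commit text. */
#define CLUSTER_BYTES   2048u   /* normal mbuf cluster */
#define JCLUSTER_BYTES  4096u   /* mbuf jcluster */

/* Assumption: upper bound on the payload of one TSO packet. */
#define TSO_MAX_BYTES   65535u

/* Bytes aggregated before the send is dispatched to netisr. */
size_t agg_bytes(unsigned agglim, size_t cluster_bytes)
{
    return (size_t)agglim * cluster_bytes;
}

/* Buffers needed to hold a payload, rounding up to whole buffers. */
size_t clusters_needed(size_t payload, size_t cluster_bytes)
{
    return (payload + cluster_bytes - 1) / cluster_bytes;
}
```

Under these assumptions, agg_bytes(3, CLUSTER_BYTES) is 6144 and agg_bytes(2, JCLUSTER_BYTES) is 8192, while a TSO_MAX_BYTES payload spans 32 normal clusters but only 16 jclusters. Real descriptor counts also depend on headers and DMA segmentation.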
mbuf: Fix jcluster support

- Free the jcluster to the correct objcache, which solves "jcluster
  mbuf" objcache exhaustion.
- By default, set the number of jclusters to half the number of
  normal clusters.  jcluster will be used on the TCP sending path in a
  later commit to improve TSO performance for 10Ge.
- Add mbuf stat for jcluster; adjust netstat(1) to show it.
- Add m_getlj(), which will be used on the TCP sending path in a later
  commit to improve TSO performance for 10Ge.

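Why freeing to the wrong cache exhausts a pool can be seen in a toy model. This sketch does not use the objcache API: the structures and function names are hypothetical, and each pool is reduced to a counter of available buffers. It also applies the "half as many jclusters" default described above.

```c
#include <stddef.h>

#define CLUSTER_BYTES   2048u
#define JCLUSTER_BYTES  4096u

/* Hypothetical stand-in for one cache: a buffer size and a free count. */
struct pool {
    size_t buf_size;
    int    avail;
};

struct pools {
    struct pool cluster;
    struct pool jcluster;
};

/* Size the jcluster pool at half the normal cluster pool. */
void pools_init(struct pools *p, int nclusters)
{
    p->cluster.buf_size = CLUSTER_BYTES;
    p->cluster.avail = nclusters;
    p->jcluster.buf_size = JCLUSTER_BYTES;
    p->jcluster.avail = nclusters / 2;
}

/* Take one buffer; returns its size, or 0 if the pool is exhausted. */
size_t pool_get(struct pool *pl)
{
    if (pl->avail == 0)
        return 0;
    pl->avail--;
    return pl->buf_size;
}

/* Return a buffer to the pool that matches its size. */
void pools_put(struct pools *p, size_t buf_size)
{
    if (buf_size == JCLUSTER_BYTES)
        p->jcluster.avail++;
    else
        p->cluster.avail++;
}
```

If pools_put() always credited the normal cluster pool, every get/put cycle on a jcluster would drain the jcluster pool by one until pool_get() returned 0, which is the exhaustion pattern the fix addresses.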
drm: Replace the i915 driver with i915kms

i915kms was already an updated version of i915; there's no need to keep
maintaining two separate instances of the same driver.

mxge: Sync w/ FreeBSD

- Update firmware
- Sync w/ FreeBSD if_mxge.c rev 254263

Local changes:
- Adjust output path a little bit:
  o Utilize bus_dmamap_load_mbuf_defrag()
  o Utilize the header length fields in mbuf pkthdr instead of peeking
    at the packet data
- Change TSO mode to NDIS mode
- Minor white space changes
- Nuke mxge_lro.c; we won't need it
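Using header lengths recorded in the packet header, rather than parsing the frame, amounts to a simple sum. The sketch below is only an illustration: the structure and its field names are hypothetical stand-ins, not the actual mbuf pkthdr layout.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-packet header lengths filled in by
 * the stack before the packet reaches the driver. */
struct pkt_hdr_lens {
    uint8_t l2_len;    /* link-layer header bytes */
    uint8_t ip_len;    /* IP header bytes */
    uint8_t tcp_len;   /* TCP header bytes, options included */
};

/* Total header bytes preceding the TCP payload; no packet data is
 * read to compute this. */
unsigned tso_header_bytes(const struct pkt_hdr_lens *h)
{
    return (unsigned)h->l2_len + h->ip_len + h->tcp_len;
}
```

For an untagged Ethernet frame with option-less IPv4 and TCP headers (14, 20 and 20 bytes) this yields 54. The driver gets the figure without touching the first cluster of the chain.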