kernel: Move the semicolon from the definition of SYSINIT() to its invocations. This affected around 70 of our (more or less) 270 SYSINIT() calls. style(9) also advocates that the invocation supply the terminating semicolon, because it can make life easier for editors and other source code parsing programs.
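A minimal sketch of the pattern described above, not the real sys/kernel.h macro: DEMO_SYSINIT, struct demo_sysinit, demo_init and demo_run are hypothetical names invented for illustration. The macro body ends without a semicolon, so every invocation reads like an ordinary statement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a SYSINIT() record; not the kernel's layout. */
typedef void (*demo_init_fn)(void *);

struct demo_sysinit {
    int          order;   /* relative init order (illustrative only) */
    demo_init_fn func;    /* routine to call */
    void        *arg;     /* argument handed to the routine */
};

/* The definition deliberately ends WITHOUT ';' so the caller supplies it. */
#define DEMO_SYSINIT(name, ord, fn, argp)              \
    static struct demo_sysinit name##_sys_init = {     \
        (ord), (fn), (argp)                            \
    }

static int demo_counter;

static void demo_init(void *arg)
{
    assert(arg != NULL);
    *(int *)arg += 1;
}

/* The invocation carries the terminating semicolon. */
DEMO_SYSINIT(demo, 10, demo_init, &demo_counter);

/* Run the single registered routine and report the counter value. */
int demo_run(void)
{
    demo_sys_init.func(demo_sys_init.arg);
    return demo_counter;
}
```

Had the macro kept its own semicolon, an invocation written with one would expand to a stray empty declaration at file scope.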
kernel - Optimize the x86-64 lwbuf API * Change lwbuf_alloc(m) to lwbuf_alloc(m, &lwb_cache), passing a pointer to a struct lwb which lwbuf_alloc() may use if it desires. * The x86-64 lwbuf_alloc() now just fills in the passed lwb and returns it. The i386 lwbuf_alloc() still uses the objcache w/ its kva mappings. This removes objcache calls from the critical path on x86-64. * The x86-64 lwbuf_alloc()/lwbuf_free() functions are now inlines (ALL x86-64 lwbuf functions are now inlines).
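The caller-supplied storage idea can be sketched as follows. This is an illustration under assumptions, not the DragonFly code: struct demo_page, struct demo_lwbuf, DEMO_DMAP_BASE and the demo_* functions are hypothetical, and the address arithmetic merely stands in for whatever the real inline does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; the real struct lwb and vm_page differ. */
struct demo_page {
    uint64_t phys_addr;          /* physical address of the page */
};

struct demo_lwbuf {
    struct demo_page *m;         /* page this buffer maps */
    uint64_t          kva;       /* kernel virtual address for the page */
};

/* Assumed base of a direct map; the value is made up for the sketch. */
#define DEMO_DMAP_BASE 0x100000000ULL

/*
 * Fill in the storage the caller passed and return it.
 * No allocator is called, so nothing is allocated on the critical path.
 */
static inline struct demo_lwbuf *
demo_lwbuf_alloc(struct demo_page *m, struct demo_lwbuf *lwb_cache)
{
    assert(m != NULL && lwb_cache != NULL);
    lwb_cache->m = m;
    lwb_cache->kva = DEMO_DMAP_BASE + m->phys_addr;
    return lwb_cache;
}

/* Nothing to release in this variant; just clear the fields. */
static inline void
demo_lwbuf_free(struct demo_lwbuf *lwb)
{
    lwb->m = NULL;
    lwb->kva = 0;
}
```

A caller would typically keep the struct on its own stack, which is why the allocation path needs no cache lookup in this variant.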
kernel - Introduce lightweight buffers * Summary: The lightweight buffer (lwbuf) subsystem is effectively a reimplementation of the sfbuf (sendfile buffers) implementation. It was designed to be lighter weight than the sfbuf implementation when possible. On x86_64 we use the DMAP and the implementation is -very- simple. It was also designed to be more SMP friendly. * Replace all consumption of sfbuf with lwbuf. * Refactor sfbuf to act as an external refcount mechanism for sendfile(2). This will probably go away eventually as well.
kernel - pmap (i386) - Reduce kmem use for foreign pmap mapping * We've been having problems running out of KVA on i386 systems for numerous reasons. KVA use by the kernel is just too tight. * Reserve space for foreign pmap page table mappings on a cpu-by-cpu basis instead of for SMP_MAXCPU. This reduces KVM use from 68MB to (ncpu*4MB). Use the APT entry for cpu0 and use kmem_alloc_nofault() for the APs. This frees up 52MB of KVA, which doesn't sound like a lot but actually is. * Add an alignment argument to kmem_alloc_nofault() and vm_map_find(). * vm_map_findspace() already had an alignment argument, but adjust the value passed to be at least PAGE_SIZE (this has no operational effect but is more correct).
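The arithmetic above can be checked with a short sketch. The 68MB and 4MB-per-cpu figures come from the message; the helper names are hypothetical, and reading the 52MB saving as the 4-cpu case is an inference from those numbers, not something the message states.

```c
#include <assert.h>

#define DEMO_OLD_RESERVED_MB 68   /* fixed reservation quoted in the message */
#define DEMO_MB_PER_CPU       4   /* per-cpu reservation quoted in the message */

/* KVA reserved when space is set aside per running cpu. */
int demo_kva_reserved_mb(int ncpu)
{
    assert(ncpu > 0);
    return ncpu * DEMO_MB_PER_CPU;
}

/* KVA returned to the kernel compared with the old fixed reservation. */
int demo_kva_saved_mb(int ncpu)
{
    return DEMO_OLD_RESERVED_MB - demo_kva_reserved_mb(ncpu);
}
```

With ncpu = 4 the reservation is 16MB and the saving is 68 - 16 = 52MB, matching the quoted figure; other cpu counts give a different saving.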
Revamp SYSINIT ordering. Relabel sysinit IDs (SI_* in sys/kernel.h) to make them less confusing, particularly with regard to the relative order init routines are called in. Reorder many sysinits. Reorder the SMP and CLOCK code to bring all the cpus up far earlier in the boot sequence and to make the full threading and clocking subsystems available for device config.
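Why the numeric IDs matter can be sketched with a toy dispatcher that sorts init records by ID before calling them. Everything here is hypothetical: the ID values, the demo_* names and the use of qsort() illustrate the idea and are not the actual SI_* constants or the kernel's sysinit walker.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical init record: a numeric order ID plus the routine to call. */
struct demo_init {
    unsigned order;
    void   (*func)(void);
};

static int demo_log[8];   /* records the order in which routines ran */
static int demo_nlog;

static void demo_smp(void)    { demo_log[demo_nlog++] = 1; }
static void demo_clock(void)  { demo_log[demo_nlog++] = 2; }
static void demo_devcfg(void) { demo_log[demo_nlog++] = 3; }

/* Compare two records by order ID, ascending. */
static int demo_cmp(const void *a, const void *b)
{
    const struct demo_init *x = a;
    const struct demo_init *y = b;
    return (x->order > y->order) - (x->order < y->order);
}

/* Sort the table by ID, then call each routine; returns the call count. */
int demo_boot(void)
{
    /* Deliberately listed out of order; the made-up IDs decide the sequence. */
    struct demo_init table[] = {
        { 0x300, demo_devcfg },
        { 0x100, demo_smp },
        { 0x200, demo_clock },
    };
    size_t n = sizeof(table) / sizeof(table[0]);

    assert(n <= sizeof(demo_log) / sizeof(demo_log[0]));
    demo_nlog = 0;
    qsort(table, n, sizeof(table[0]), demo_cmp);
    for (size_t i = 0; i < n; i++)
        table[i].func();
    return demo_nlog;
}
```

Giving the SMP and clock records lower IDs than device config is what makes them run first in this sketch, whatever order the records are declared in.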
Make kernel_map, buffer_map, clean_map, exec_map, and pager_map direct structural declarations instead of pointers. Clean up all related code, in particular kmem_suballoc(). Remove the offset calculation for kernel_object. kernel_object's page indices used to be relative to the start of kernel virtual memory in order to improve the performance of VM page scanning algorithms. The optimization is no longer needed now that VM objects use Red-Black trees. Removal of the offset simplifies a number of calculations and makes the code more readable.
Fix an SFBUF memory leak in sendfile(). We were not properly tracking references, which would lead to SFBUFs not getting freed when two or more sendfile() calls operate on the same file at the same time (e.g. parallel ftp downloads of the same file). Get rid of the sf_buf->aux1 and aux2 hacks for sendfile. Add a sysctl to allow the number of free SFBUFs to be monitored.
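The kind of reference tracking this fix depends on can be sketched as below. This is a generic refcount illustration with hypothetical names (struct demo_sfbuf, demo_sfbuf_ref, demo_sfbuf_unref), not the actual sendfile() code, and it ignores the locking a real kernel would need.

```c
#include <assert.h>

/* Hypothetical shared buffer; 'freed' only marks that release happened. */
struct demo_sfbuf {
    int refcnt;
    int freed;
};

/* Each concurrent user of the buffer takes one reference. */
static void demo_sfbuf_ref(struct demo_sfbuf *sf)
{
    assert(sf->refcnt >= 0 && !sf->freed);
    sf->refcnt++;
}

/*
 * Drop one reference. The buffer is released only when the last
 * reference goes away; returns 1 in that case, 0 otherwise.
 */
static int demo_sfbuf_unref(struct demo_sfbuf *sf)
{
    assert(sf->refcnt > 0);
    if (--sf->refcnt == 0) {
        sf->freed = 1;
        return 1;
    }
    return 0;
}
```

With two users of the same buffer, the first release leaves it alive and the second frees it; a missed decrement on either path would leave it allocated forever, which is the shape of the leak described.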