kernel - Implement segment pmap optimizations for x86-64 * Implement 2MB segment optimizations for x86-64. Any shared read-only or read-write VM object mapped into memory, including physical objects (so both sysv_shm and mmap), which is a multiple of the segment size and segment-aligned can be optimized. * Enable with sysctl machdep.pmap_mmu_optimize=1. Default is off for now. This is an experimental feature. * It works as follows: A VM object which is large enough will, when VM faults are generated, store a truncated pmap (PD, PT, and PTEs) in the VM object itself. VM faults whose vm_map_entry can be optimized will cause the PTE, PT, and also the PD (for now) to be stored in a pmap embedded in the VM_OBJECT, instead of in the process pmap. The process pmap then creates a PT entry in the PD page table that points to the PT page table page stored in the VM_OBJECT's pmap. * This removes nearly all page table overhead from fork()'d processes or even unrelated processes which massively share data via mmap() or sysv_shm. We still recommend using sysctl kern.ipc.shm_use_phys=1 (which is now the default), which also removes the PV entries associated with the shared pmap. However, with this optimization PV entries are no longer a big issue since they will not be replicated in each process, only in the common pmap stored in the VM_OBJECT. * Features of this optimization: * Number of PV entries is reduced to approximately the number of live pages and no longer multiplied by the number of processes separately mapping the shared memory. * One process faulting in a page naturally makes the PTE available to all other processes mapping the same shared memory. The other processes do not have to fault that same page in. * Page tables survive process exit and restart. * Once page tables are populated and cached, any new process that maps the shared memory will take far fewer faults because each fault will bring in an ENTIRE page table.
With Postgres and 64 clients, the VM fault rate was observed to drop from 1M faults/sec to less than 500 at startup, and during the run the fault rate changed from a steady decline into the hundreds of thousands to an instant decline to virtually zero VM faults. * We no longer have to depend on sysv_shm to optimize the MMU. * CPU caches will do a better job caching page tables since most of them are now themselves shared. Even when we invltlb, more of the page tables will be in the L1, L2, and L3 caches. * EXPERIMENTAL!!!!!
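The eligibility rule described in this commit (segment-aligned and a multiple of the segment size) can be sketched in C. This is an illustration only, not the kernel's code: SEG_SIZE, SEG_MASK, seg_can_optimize() and seg_pt_pages() are hypothetical names, and the sketch assumes 4KB base pages, so that one page table page (512 PTEs) covers exactly one 2MB segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 512 PTEs * 4KB pages = one 2MB segment per page table page. */
#define SEG_SIZE	(2ULL * 1024 * 1024)
#define SEG_MASK	(SEG_SIZE - 1)

/*
 * Hypothetical helper: true when the mapping [start, start + size) is
 * non-empty, starts on a segment boundary, and is a whole number of
 * segments long.
 */
static bool
seg_can_optimize(uint64_t start, uint64_t size)
{
	if (size == 0)
		return false;
	return (start & SEG_MASK) == 0 && (size & SEG_MASK) == 0;
}

/*
 * Hypothetical helper: number of page table pages that would be shared
 * for a mapping of the given size (one per 2MB segment).
 */
static uint64_t
seg_pt_pages(uint64_t size)
{
	assert((size & SEG_MASK) == 0);	/* caller must pass whole segments */
	return size / SEG_SIZE;
}
```

Under these assumptions a 1GB shared mapping needs 512 page table pages, held once rather than once per process mapping it, which is where the savings described above would come from.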
Revamp SYSINIT ordering. Relabel sysinit IDs (SI_* in sys/kernel.h) to make them less confusing, particularly with regard to the relative order init routines are called in. Reorder many sysinits. Reorder the SMP and CLOCK code to bring all the cpus up far earlier in the boot sequence and to make the full threading and clocking subsystems available for device config.
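The idea of ordered init routines can be sketched as a table sorted by subsystem ID and then by order within the subsystem. This is a rough model, not the kernel's implementation: struct sysinit_sketch, the SI_SKETCH_* values, sysinit_sort() and sysinit_run() are hypothetical, and the numeric IDs are made up (the real SI_* values live in sys/kernel.h). The only point taken from the commit is that SMP bring-up sorts ahead of device configuration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical subsystem IDs; lower values run earlier. */
enum {
	SI_SKETCH_EARLY   = 0x1000,
	SI_SKETCH_SMP     = 0x2000,	/* cpus are brought up here ... */
	SI_SKETCH_DEVCONF = 0x3000	/* ... before device config runs */
};

struct sysinit_sketch {
	unsigned int	subsystem;	/* major ordering key */
	unsigned int	order;		/* minor key within a subsystem */
	void		(*func)(void *);
	void		*arg;
};

/* Compare by subsystem first, then by order within the subsystem. */
static int
sysinit_cmp(const void *a, const void *b)
{
	const struct sysinit_sketch *x = a;
	const struct sysinit_sketch *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem > y->subsystem) - (x->subsystem < y->subsystem);
	return (x->order > y->order) - (x->order < y->order);
}

/* Sort the table into call order. */
static void
sysinit_sort(struct sysinit_sketch *tab, size_t n)
{
	qsort(tab, n, sizeof(tab[0]), sysinit_cmp);
}

/* Sort, then call every init routine in the resulting order. */
static void
sysinit_run(struct sysinit_sketch *tab, size_t n)
{
	size_t i;

	sysinit_sort(tab, n);
	for (i = 0; i < n; i++) {
		assert(tab[i].func != NULL);
		tab[i].func(tab[i].arg);
	}
}
```

With IDs laid out like this, moving a subsystem earlier in boot amounts to giving it a smaller ID, which is why clearer labels make the relative order easier to follow.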
Major kernel build infrastructure changes, part 1/2 (sys). These changes are primarily designed to create a 2-layer machine and cpu build hierarchy in order to support virtual kernel builds in the near term and future porting efforts in the long term. * Split arch/ into a set of platform architectures under machine/ and a set of cpu architectures under cpu/. All platform and cpu header files will be accessible via <machine/*.h>. Platform header files may override cpu header files (the platform header file then typically #include's the cpu header file). * Any cpu header files that are not overridden will be copied directly into /usr/include/machine/, allowing the platform to omit those header files (not have to create degenerate forwarding header files). * All source files access platform and cpu architecture files via the <machine/*.h> path. The <cpu/*.h> path should only be used by platform header files when including the lower level cpu header files. * Require both the 'machine' and the 'machine_arch' directives in the kernel config file. * When building modules in the presence of a kernel config, use the IF files, use*.h files, and opt*.h files provided by the kernel config and do not generate them in each module's object directory. This streamlines the module build considerably.
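The override rule for <machine/*.h> can be modelled as a two-step lookup: a platform header wins if it exists, otherwise the cpu header is used. The following is a toy model of that rule only, not the build system's actual mechanism; enum hdr_layer, has_header() and resolve_machine_header() are hypothetical names, and the header lists passed in are made-up examples.

```c
#include <stddef.h>
#include <string.h>

/* Which layer ends up providing <machine/NAME>. */
enum hdr_layer { HDR_NONE, HDR_PLATFORM, HDR_CPU };

/* Return nonzero if 'name' appears in the given list of header names. */
static int
has_header(const char *const *set, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(set[i], name) == 0)
			return 1;
	}
	return 0;
}

/*
 * Model of the override rule: the platform layer is consulted first;
 * a cpu header is used only when the platform does not override it.
 */
static enum hdr_layer
resolve_machine_header(const char *const *platform, size_t nplatform,
    const char *const *cpu, size_t ncpu, const char *name)
{
	if (has_header(platform, nplatform, name))
		return HDR_PLATFORM;
	if (has_header(cpu, ncpu, name))
		return HDR_CPU;
	return HDR_NONE;
}
```

In this model a platform that ships only atomic.h still gets every other cpu header through <machine/*.h>, which corresponds to not having to write forwarding headers.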
Do a major clean-up of the BUSDMA architecture. A large number of essentially machine-independent drivers use the structures and definitions in machine-dependent directories that are really machine-independent in nature. Split <machine/bus_dma.h> into machine-dependent and machine-independent parts and make the primary access run through <sys/bus_dma.h>. Remove <machine/bus.h>, <machine/bus_memio.h> and <machine/bus_pio.h>. The optimizations related to bus_memio.h and bus_pio.h made a huge mess, introduced machine-specific knowledge into essentially machine-independent drivers, and required specific #include file orderings to do their job. They may be reintroduced in some other form later on. Move <machine/resource.h> to <sys/bus_resource.h>. The contents of the file are machine-independent or can be made a superset across many platforms. Make <sys/bus.h> include <sys/bus_dma.h> and <sys/bus_resource.h> and include <sys/bus.h> where necessary. Remove all #include's of <machine/resource.h> and <machine/bus.h>. That is, make the BUSDMA infrastructure integral to I/O-mapped and memory-mapped accesses to devices and remove a large chunk of machine-specific dependencies from drivers. bus_if.h and device_if.h are now required to be present when using <sys/bus.h>.
Further normalize the _XXX_H_ symbols used to conditionalize header file inclusion. Use _MACHINE_BLAH_H_ for headers found in "/usr/src/sys/arch/<arch>/include". Most headers already did this, but some did not. Use _ARCH_SUBDIR_BLAH_H_ for headers found in "/usr/src/sys/arch/<arch>/subdir" instead of _I386_SUBDIR_BLAH_H_. Change #include's made in architecture-specific directories to use <machine/blah.h> instead of "blah.h", allowing the included header files to be overridden by another architecture. For example, a virtual kernel architecture might include a header from arch/i386/include which then includes some other header in arch/i386/include. But really we want that other header to also go via arch/vkernel/include, so the header files in arch/i386/include must use <machine/blah.h> instead of "blah.h" for most of their sub-includes. Change most architecture-specific includes such as <i386/icu/icu.h> to use a generic path through the "arch" softlink, such as <arch/icu/icu.h>. Remove the temporary -I@/arch shim made in a recent commit; the <arch/...> mechanism replaces it. These changes allow us to implement hierarchical architectural overrides, primarily intended for virtual kernel support. A virtual kernel uses an architecture of 'vkernel' but must be able to access actual cpu-specific header files such as those found in arch/i386. It does this using a "cpu" softlink. For example, someone including <machine/atomic.h> in a vkernel build would hit the "arch/vkernel/include/atomic.h" header, and this header could then #include <cpu/atomic.h> to access the actual cpu's atomic.h file: "arch/i386/include/atomic.h". The ultimate effect is that an architecture can build on another architecture's header and source files.
acpica5 update part 2/3: Fix a bug introduced in the original acpica5 port from FreeBSD-5. FreeBSD-5 no longer uses a VM object for the page table but we still do. This fixes a panic that occurs when waking up from a sleep mode. Submitted-by: YONETANI Tomokazu <qhwt+dragonfly-submit@les.ath.cx>