kernel - extend cpus past 64 - fixes and adjustments
* Reorder the SMP cpu boot code to remove a great deal of lock contention.
The APs must still loop waiting for the BSP to adjust the stage, but
they no longer need to hold a token or spinlock, so startup under emulation
is considerably faster.
* Do not initialize our systimer periodics on each target cpu from the
idle thread bootstrap. Previously, with the MP lock held, the locks
acquired during this initialization were serialized and could not block. Now
that cpu startup runs mostly concurrently, that is no longer the
case.
Instead, systimer periodics are now initialized by process 0 in a
post-smp-startup call.
* statclock() now uses sys_cputimer->count() directly to calculate the
delta time.
* The TSC is now installed as sys_cputimer before any systimer periodics
(particularly statclock()) are set up, allowing the system to take control
away from the i8254 earlier.
* Clean up struct lwkt_ipiq. Remove the 'lwkt_ipiq' typedef. Calculate
allocation sizes separately.
* Add a new loader.conf tunable, hw.tsc_cputimer_force. If set to 1 and
a TSC is present, the system will force invariant and mpsync operation
and always use the TSC as the cputimer (primarily useful for qemu).
* Remove the unnecessary kmem_alloc() of the globaldata structure. We are
using a static array now, so this allocation had been wasting memory for
a long time.
* Make the boot stack bigger for the APs.
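
For illustration only, the delta-time calculation mentioned for statclock()
can be sketched in userland C as below. This is a minimal sketch, not the
kernel code: struct demo_cputimer, demo_cputimer_init() and
demo_delta_usec() are hypothetical names, and the 32.32 fixed-point
scaling is an assumption about how a counter delta might be converted to
microseconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a cputimer: a free-running counter described
 * by its frequency. Names are illustrative, not the kernel's definitions.
 */
struct demo_cputimer {
	uint64_t freq;             /* counter ticks per second */
	uint64_t usec_per_tick_fp; /* microseconds per tick, 32.32 fixed point */
};

/* Precompute the fixed-point scale factor for a given counter frequency. */
static void
demo_cputimer_init(struct demo_cputimer *t, uint64_t freq)
{
	assert(freq > 0);
	t->freq = freq;
	t->usec_per_tick_fp = ((uint64_t)1000000 << 32) / freq;
}

/*
 * Convert the distance between two counter samples to microseconds.
 * Unsigned subtraction keeps the delta correct across counter wraparound.
 * The multiply can overflow for very large deltas, so this is only meant
 * for short intervals such as the time between two clock ticks.
 */
static uint64_t
demo_delta_usec(const struct demo_cputimer *t, uint64_t prev, uint64_t now)
{
	uint64_t ticks = now - prev;

	return (t->usec_per_tick_fp * ticks) >> 32;
}
```

A caller would keep the previous sample per cpu, pass it along with the
current counter value on each tick, and then store the current value as
the new previous sample.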