kernel - Add support for up to 63 cpus & 512G of ram for 64-bit builds.
* Increase SMP_MAXCPU to 63 for 64-bit builds.
* cpumask_t is 64 bits on 64-bit builds now. It remains 32 bits on 32-bit
builds.
* Add #defines for atomic_set_cpumask(), atomic_clear_cpumask(), and
atomic_cmpset_cpumask(). Replace all use cases on cpu masks with
these functions.
* Add CPUMASK(), BSRCPUMASK(), and BSFCPUMASK() macros. Replace all
use cases on cpu masks with these macros.
In particular note that (1 << cpu) just doesn't work with a 64-bit
cpumask.
Numerous bits of assembly also had to be adjusted to use e.g. btq instead
of btl, etc.
* Change __uint32_t declarations that were meant to be cpu masks to use
cpumask_t (most already do).
Also change other bits of code which work on cpu masks to be more agnostic.
For example, poll_cpumask0 and lwp_cpumask.
* 64-bit atomic ops cannot use the "iq" constraint, they must use "r",
because most x86-64 instructions do NOT accept 64-bit immediate values.
* Rearrange initial kernel memory allocations to start from KvaStart and
not KERNBASE, because only 2GB of KVM is available after KERNBASE.
With > 32G of ram certain VM allocations, for example vm_page_array[],
can exceed 2GB, so 2GB was not enough.
* Remove numerous mdglobaldata fields that are not used.
* Align CPU_prvspace[] for now. Eventually it will be moved into a
mapped area. Reserve sufficient space at MPPTDI now, but it is
still unused.
* When pre-allocating kernel page table PD entries calculate the number
of page table pages at KvaStart and at KERNBASE separately, since
the KVA space starting at KERNBASE caps out at 2GB.
* Change kmem_init() and vm_page_startup() to not take memory range
arguments. Instead the globals (virtual_start and virtual_end) are
manipulated directly.
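
Illustration of the CPUMASK() point above. This is a minimal sketch, not
the kernel's actual definitions: the typedef and macro bodies are
assumptions written for this note, and the bit-scan macros assume the
GCC/Clang __builtin_ctzll() and __builtin_clzll() builtins. It shows why
the constant must be widened to cpumask_t before shifting. In (1 << cpu)
the 1 is a 32-bit int, so a shift count of 32 or more is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the 64-bit build; the real typedef may differ. */
typedef uint64_t cpumask_t;

/* Widen the constant first so that cpu numbers 32..63 shift correctly. */
#define CPUMASK(cpu)      ((cpumask_t)1 << (cpu))

/* Bit scan forward: index of the lowest set bit. mask must be non-zero. */
#define BSFCPUMASK(mask)  (__builtin_ctzll(mask))

/* Bit scan reverse: index of the highest set bit. mask must be non-zero. */
#define BSRCPUMASK(mask)  (63 - __builtin_clzll(mask))

/* Hypothetical helper: returns non-zero if cpu is present in mask. */
static inline int
cpumask_test(cpumask_t mask, int cpu)
{
	return ((mask & CPUMASK(cpu)) != 0);
}
```

With these definitions CPUMASK(40) yields a mask with only bit 40 set,
which a plain int shift cannot represent.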