Fix a number of SMP issues.
* Add required pause instructions in code paths that skip running "HLT". This
occurs when, e.g., cpu #1 is running code while holding the BGL and cpu #2's
only runnable thread requires the BGL. cpu #2's LWKT scheduler spins in that
case. Similar situations can occur when the only runnable threads on a
cpu are waiting for a token.
* Add required pause instructions to spin loops. DragonFly has very few spin
locks (e.g. com_lock()), but it's a good idea anyway to avoid
known livelock issues on Intel cpus.
* Fix a pending interrupt / HLT race. We were not atomically retiring
pending interrupts prior to potentially HLTing the cpu. This could
result in an SMP machine's network locking up until a key is hit on
the console, then magically resuming.
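
For illustration only (this is not the code in this commit): a minimal
userland sketch of a spin loop that issues a pause hint on every failed
attempt, assuming C11 atomics and GCC/Clang inline assembly. The names
cpu_pause(), spin_t, spin_lock(), spin_trylock() and spin_unlock() are
hypothetical and do not correspond to kernel functions.

```c
#include <stdatomic.h>

/* Hypothetical helper: emit the x86 PAUSE hint; a no-op on other CPUs. */
static inline void cpu_pause(void)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("pause" ::: "memory");
#endif
}

/* Hypothetical spin lock type built on a C11 atomic flag. */
typedef struct {
    atomic_flag locked;
} spin_t;

/* Try once; returns 1 if the lock was acquired, 0 if it was already held. */
static inline int spin_trylock(spin_t *s)
{
    return !atomic_flag_test_and_set_explicit(&s->locked,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired, pausing between attempts. */
static inline void spin_lock(spin_t *s)
{
    while (!spin_trylock(s))
        cpu_pause();
}

/* Release the lock so that a spinning cpu can acquire it. */
static inline void spin_unlock(spin_t *s)
{
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
}
```

The pause hint sits inside the retry loop, so it costs nothing on the
uncontended path and is executed only while the cpu is actually spinning.
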
Lockups-reported-by: Peter Avalos <pavalos@theshell.com>