kernel - Add usched_dfly algorithm, set as default for now (7)
* Reenable weight2 (the process pairing heuristic) and fix the
edge cases associated with it.
* Change the process pulling behavior. Now we pull the 'worst' thread
from some other cpu instead of the best (duh!). We only pull when a
cpu winds up with no designated user threads, or we pull via a
schedulerclock-implemented rover.
The schedulerclock-implemented rover will allow ONE cpu to pull the
'worst' thread across all cpus (with some locality) once per
round-robin interval (4 scheduler ticks).
The rover is responsible for taking excess processes that are unbalancing
one or more cpus (for example, you have 6 running batch processes and
only 4 cpus) and slowly moving them between cpus. If we did not do this,
the 'good' processes running on the unbalanced cpus would be put at an
unfair disadvantage.
* This should fix all known edge cases, including ramp-down edge cases.
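
The rover behavior described above can be illustrated with a small
user-space sketch. This is not the usched_dfly code. Every name here
(cpu_queue, pull_worst, sched_clock_tick, NCPU, MAXQ) is hypothetical,
a larger prio value is assumed to mean a 'worse' thread, locality is
ignored, and the rule that a source cpu must hold at least two threads
and more threads than the puller is an assumption made to keep the
example simple.

```c
#include <assert.h>

#define NCPU     4   /* hypothetical cpu count */
#define MAXQ     8   /* hypothetical per-cpu queue capacity */
#define RR_TICKS 4   /* round-robin interval: 4 scheduler ticks */

/* Hypothetical per-cpu run queue; larger prio means a 'worse' thread. */
struct cpu_queue {
    int prio[MAXQ];
    int count;
};

struct cpu_queue cpus[NCPU];
int rover_cpu;       /* the ONE cpu allowed to pull this interval */

void enqueue(int cpu, int prio)
{
    assert(cpu >= 0 && cpu < NCPU && cpus[cpu].count < MAXQ);
    cpus[cpu].prio[cpus[cpu].count++] = prio;
}

/* Index of the worst (largest prio) thread on a non-empty queue. */
int worst_index(const struct cpu_queue *q)
{
    int best = 0;
    for (int i = 1; i < q->count; i++)
        if (q->prio[i] > q->prio[best])
            best = i;
    return best;
}

/*
 * Pull the worst thread found on any other cpu onto 'dst'.
 * Assumption: a source must keep at least one thread and must be
 * more loaded than 'dst'. Returns the pulled prio, or -1 if none.
 */
int pull_worst(int dst)
{
    int src = -1, src_idx = -1, worst = -1;

    assert(dst >= 0 && dst < NCPU);
    for (int c = 0; c < NCPU; c++) {
        struct cpu_queue *q = &cpus[c];
        if (c == dst || q->count < 2 || q->count <= cpus[dst].count)
            continue;
        int i = worst_index(q);
        if (q->prio[i] > worst) {
            worst = q->prio[i];
            src = c;
            src_idx = i;
        }
    }
    if (src < 0 || cpus[dst].count == MAXQ)
        return -1;

    /* Remove from the source by moving its last entry into the hole. */
    cpus[src].count--;
    cpus[src].prio[src_idx] = cpus[src].prio[cpus[src].count];
    cpus[dst].prio[cpus[dst].count++] = worst;
    return worst;
}

/*
 * Called once per scheduler tick. Every RR_TICKS ticks the current
 * rover cpu gets one chance to pull, then the rover advances.
 */
int sched_clock_tick(int tick)
{
    if (tick % RR_TICKS != 0)
        return -1;
    int pulled = pull_worst(rover_cpu);
    rover_cpu = (rover_cpu + 1) % NCPU;
    return pulled;
}
```

With 6 threads queued on 4 cpus as 2/2/1/1, the cpus holding two threads
pull nothing when the rover reaches them, while each cpu holding one
thread pulls the worst excess thread in turn, so the excess load
migrates between cpus instead of staying pinned on the same two. The
same pull_worst() helper also covers the idle case: a cpu with no
threads can call it directly rather than waiting for the rover.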