kernel - more kmalloc and nlookup performance optimizations
* Give the pcpu counters in struct malloc_type their own cache line per
cpu. This removes a large kmalloc/kfree bottleneck on multi-socket
systems.
* Avoid having to ref, lock, and GETATTR intermediate directory components
in nlookup() by adding the NCF_WXOK flag. This flag is set in the ncp
when the directory mode is at least 0555, i.e. every read and search bit
is set for owner, group, and other. This saves significant overhead in
all situations, including single-threaded workloads.
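
For illustration only, here is a minimal userland sketch of the two ideas
above. It is not the committed code. The names pcpu_counters,
malloc_type_sketch, SKETCH_NCPUS, SKETCH_WXOK, and dir_perm_flags are
hypothetical, as are the 64-byte cache line size and the flag value.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumption: typical x86-64 cache line */
#define SKETCH_NCPUS    4   /* hypothetical cpu count for the sketch */

/*
 * Hypothetical per-cpu counter block. alignas() on the first member
 * raises the struct's alignment, so sizeof() is padded to a full
 * cache line and no two cpus share a line (no false sharing).
 */
struct pcpu_counters {
	alignas(CACHE_LINE_SIZE) int64_t memuse;
	int64_t inuse;
};

/* Stand-in for the per-cpu portion of struct malloc_type. */
struct malloc_type_sketch {
	struct pcpu_counters ks_use[SKETCH_NCPUS];
};

/* Stand-in for NCF_WXOK; the real flag value is not given here. */
#define SKETCH_WXOK 0x0001

/*
 * Return SKETCH_WXOK when the mode is "at least 0555", meaning every
 * read and search bit is set for owner, group, and other. A lookup
 * could then skip the per-component permission check.
 */
static int
dir_perm_flags(unsigned int mode)
{
	return ((mode & 0555) == 0555) ? SKETCH_WXOK : 0;
}
```

With this layout each cpu updates only its own ks_use[] slot, and a mode
such as 0755 qualifies for the shortcut while 0750 does not.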
Discussed-with: Mateusz Guzik (mjg_)