kernel - Optimize bcopy, bzero, memset
* Use __builtin_memset() for bzero() and __builtin_memmove()
for bcopy().
- Must use _bcopy() directly in a few places where GCC complains
about structural punning. Even casting doesn't help.
- GCC's __builtin_memset() and __builtin_memmove() have a side
effect: GCC assumes that their pointer arguments cannot be
NULL. In fact, they can be NULL when the byte count is 0.
This assumption causes later, unrelated comparisons of those
pointers against NULL to be improperly optimized out.
We had to fix one place where this blew the system up.
* Implement memset() in assembly (remove from libkern).
* Implement memmove() in assembly (remove from libkern).
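
The NULL-pointer hazard noted above can be illustrated with a
minimal sketch. The names safe_bzero() and reset_buffer() are
hypothetical and are not taken from this commit; the zero-length
guard is one possible way to avoid the problem, not necessarily
the fix that was applied here. Without the guard, the compiler
may treat buf as non-NULL after the builtin call and drop the
later NULL test.

```c
#include <stddef.h>

/*
 * Hypothetical wrapper (assumption, not the kernel's bzero()).
 * The builtin is only reached when len != 0, so the compiler
 * cannot conclude that buf is non-NULL on the len == 0 path.
 */
static inline void
safe_bzero(void *buf, size_t len)
{
	if (len != 0)
		__builtin_memset(buf, 0, len);
}

/*
 * Hypothetical caller: zero the buffer, then test the pointer.
 * If __builtin_memset() were called unconditionally, the
 * comparison below is the kind of later check that could be
 * optimized out.
 */
static int
reset_buffer(char *buf, size_t len)
{
	safe_bzero(buf, len);
	if (buf == NULL)
		return (-1);
	return (0);
}
```

With this guard, reset_buffer(NULL, 0) still reaches the NULL
test and returns -1, while a valid buffer is zeroed and the call
returns 0.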