polling: Implement direct input support.
When "direct input" is enabled by a driver, the driver's RX polling
handler runs Ethernet/IP/TCP processing directly, which avoids cache
misses on the mbufs themselves. It is currently enabled by default on
ix(4).
Normal IP forwarding performance is improved by 12%, while fast IP
forwarding performance is improved by 10%. 13.2Mpps is achieved for
dual-side IP forwarding!
Performance and average latency for 1 request/connection HTTP/1.1 stay
the same, but the latency is further stabilized:
1K 5.20ms -> 4.60ms
8K 6.43ms -> 5.76ms
16K 16.30ms -> 14.90ms
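
The direct input idea described above can be sketched in user-space C.
This is only an illustration under stated assumptions, not the code in
this commit: struct pkt stands in for an mbuf, and rx_ring, rx_poll,
input_fn and count_input are hypothetical names invented for the sketch.
With direct_input set, each packet is handed to the protocol input
routine from inside the polling loop, while it is still likely to be
cache-hot; otherwise it is queued for processing at a later time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet buffer; stands in for a kernel mbuf. */
struct pkt {
	struct pkt *next;
	int len;
};

/* Hypothetical protocol entry point (Ethernet/IP/TCP input). */
typedef void (*input_fn)(struct pkt *);

/* Hypothetical RX ring state. */
struct rx_ring {
	struct pkt *head;		/* packets received by the NIC */
	int direct_input;		/* nonzero: process in the poll handler */
	struct pkt *deferred_head;	/* queue used when direct input is off */
	struct pkt *deferred_tail;
};

/*
 * Take up to `budget` packets off the ring and return how many were
 * taken. With direct input the protocol routine runs immediately;
 * without it the packet is appended to the deferred queue.
 */
int
rx_poll(struct rx_ring *r, int budget, input_fn input)
{
	int n = 0;

	assert(budget >= 0);
	while (n < budget && r->head != NULL) {
		struct pkt *p = r->head;

		r->head = p->next;
		p->next = NULL;
		if (r->direct_input) {
			input(p);	/* processed right away */
		} else {
			if (r->deferred_tail != NULL)
				r->deferred_tail->next = p;
			else
				r->deferred_head = p;
			r->deferred_tail = p;
		}
		n++;
	}
	return n;
}

/* Example input routine: just accumulates the bytes it was given. */
static int bytes_seen;

static void
count_input(struct pkt *p)
{
	bytes_seen += p->len;
}
```

To try it, build a short chain of struct pkt, point rx_ring.head at it,
and call rx_poll() with count_input, once with direct_input set and once
with it cleared.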