em(4): Avoid allocating a csum offloading TX desc whenever possible.
According to Intel's PCIe GbE Controllers Open Source Software
Developer's Manual Revision 1.8, a csum offloading TX desc prevents
TX data read requests from being pipelined, which reduces TX
performance. The loss of pipelining is hardly noticeable when
transmitting bulk data (e.g. 1472-byte UDP datagrams), but it can
dominate when transmitting tiny packets. So we should avoid
allocating a csum offloading TX desc whenever possible to take
advantage of the pipelining.
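A minimal sketch of the idea, not the actual driver code: remember the
last csum context handed to the hardware and only spend a new csum
offloading TX desc when the packet's offload parameters differ. All
names here (sk_csum_cache, sk_csum_ctx_needed, the SK_CSUM_* bits) are
hypothetical, and the sketch assumes the hardware keeps using the last
programmed context until a new one is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload flag bits; not the driver's real CSUM_* values. */
#define SK_CSUM_IP	0x1u
#define SK_CSUM_TCP	0x2u
#define SK_CSUM_UDP	0x4u

/* Hypothetical per-TX-ring record of the last csum context emitted. */
struct sk_csum_cache {
	bool		valid;	/* false until a context desc was emitted */
	uint32_t	flags;	/* offload flags of the last context */
	uint16_t	ehlen;	/* Ethernet header length of the last context */
	uint16_t	iphlen;	/* IP header length of the last context */
};

/*
 * Return how many csum offloading TX descs this packet needs: 0 or 1.
 * The cache is updated only when a new context desc is required.
 */
static int
sk_csum_ctx_needed(struct sk_csum_cache *cache, uint32_t flags,
    uint16_t ehlen, uint16_t iphlen)
{
	assert(cache != NULL);

	/* No offload requested: no context desc, cache left untouched. */
	if (flags == 0)
		return 0;

	/* Same context as last time (assumed still live in hardware). */
	if (cache->valid && cache->flags == flags &&
	    cache->ehlen == ehlen && cache->iphlen == iphlen)
		return 0;

	/* Context changed: remember it and pay for one context desc. */
	cache->valid = true;
	cache->flags = flags;
	cache->ehlen = ehlen;
	cache->iphlen = iphlen;
	return 1;
}
```

With this kind of check, a stream of tiny UDP packets with identical
header layout pays for a context desc on the first packet only; every
later packet goes out with data descs alone.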
On 82573E_IAMT:
    Before this commit: ~700Kpps
    After this commit:  ~990Kpps
The funny thing about this commit is that the old code in Intel's
FreeBSD driver 6.2.9 did roughly what we are doing here, while
Intel's FreeBSD driver 6.9.6 simply follows Linux's way and flushes
the performance down the toilet ...