tcp: Let sosendtcp() call tcp_usrreq.pru_send asynchronously
- Embed netmsg_pru_send into mbuf.m_hdr, which shares the space with
netmsg_pru_packet.
- Use the newly added netmsg_pru_send in mbuf to perform asynchronous
pru_send. For asynchronous pru_send, PRUS_NOREPLY is added, which
prevents pru_send from replying to the message.
- In sosendtcp(), if more data remains to be passed to pru_send, the
current piece is sent asynchronously. The last piece of data and any
OOB data are still passed to pru_send synchronously.
On Phenom 9550 (4 cores, 2.2GHz):
8 parallel netperf -H 127.0.0.1 (4 runs, unit: Mbps)
old 5863.85 5773.13 5534.14 5506.72
new 6735.98 6903.13 6971.89 7056.66
This gives a ~20% performance improvement.
It has no obvious impact on 1000baseT or 100baseTX network performance.