tcp: Aggregate mbufs in sosendtcp() a little bit
This greatly reduces the number of IPIs caused by IPI sending (both
domsg and sendmsg), and thus improves overall performance a bit.
net.inet.tcp.sosnd_agglim is added to tune how many mbufs are
aggregated; it defaults to 2. Setting this sysctl to 1 restores
the old behaviour: one full mbuf at a time.
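The effect of the limit can be pictured with a small userland model.
This is only a sketch, not the kernel code: send_dispatches() is a
hypothetical helper, and it assumes that every flush costs exactly one
message dispatch (and so at most one IPI).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not the sosendtcp() implementation: count how
 * many send messages are dispatched for nmbufs full mbufs when up to
 * agglim mbufs are batched into each message.
 */
static size_t
send_dispatches(size_t nmbufs, size_t agglim)
{
	size_t dispatches = 0;
	size_t pending = 0;
	size_t i;

	assert(agglim >= 1);
	for (i = 0; i < nmbufs; i++) {
		pending++;			/* queue one more full mbuf */
		if (pending == agglim) {	/* limit reached: flush */
			dispatches++;
			pending = 0;
		}
	}
	if (pending > 0)			/* flush the partial tail */
		dispatches++;
	return dispatches;
}
```

In this model a limit of 1 gives one dispatch per mbuf (the old
behaviour) and a limit of 2 halves the dispatch count. The measured
IPI rate falls less than proportionally, so the model is best read as
an upper bound on the saving.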
On Phenom 9550 (4 cores, 2.2GHz):
8 parallel netperf -H 127.0.0.1 -P0 (4 runs, unit: Mbps)
           run 1    run 2    run 3    run 4    IPIs/s (sum of 4 cores)
 1 mbuf   6735.98  6903.13  6971.89  7056.66   ~400K
 2 mbuf   7675.47  7757.28  7815.45  7514.50   ~240K
 4 mbuf   7895.33  7584.22  7704.12  7723.33   ~180K
 8 mbuf   8006.94  8077.87  7701.23  8061.12   ~120K
16 mbuf   8151.68  8023.03  7972.42  8046.13   ~100K
The default value (2) for sosnd_agglim improves overall performance
by ~10%. The IPI rate is also greatly reduced (from ~400K/s to
~240K/s).
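The ~10% figure can be checked against the netperf runs with a few
lines of C. This is a sketch for illustration only: mean(),
gain_percent() and the agg1/agg2 array names are made up here, and the
values are the 1 mbuf and 2 mbuf rows of the measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Throughput in Mbps, copied from the 1 mbuf and 2 mbuf runs. */
static const double agg1[4] = { 6735.98, 6903.13, 6971.89, 7056.66 };
static const double agg2[4] = { 7675.47, 7757.28, 7815.45, 7514.50 };

/* Arithmetic mean of n samples. */
static double
mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative gain of the tuned runs over the baseline runs, in percent. */
static double
gain_percent(const double *base, const double *tuned, size_t n)
{
	double b = mean(base, n);

	return (mean(tuned, n) - b) / b * 100.0;
}
```

gain_percent(agg1, agg2, 4) comes out at roughly 11%, in line with the
~10% quoted for the default setting.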
It has no obvious impact on 1000baseT or 100baseTX network performance.