ipfw: Rework states and tracks.
- Use RB trees for states and tracks, and put each into its own RB
tree. This avoids worst-case hash collisions.
- Make states per-cpu. The upper limit is still shared, and is managed
in the same fashion as our slab allocator's upper limit, i.e. it is
loosely updated, which allows at most 5% over-allocation.
- Use two tiers for tracks. The top tier is shared and maintains
the counter. The second tier is per-cpu; most track lookups
should be covered by this tier. Track counters are updated by
atomic ops, since the per-track upper limit is usually too small for
loose updating.
- Implement progressive state/track expiration and keepalive. This is
mainly intended to smooth out packet processing latency.
- Fix the fast TCP state recycling issue by tracking the SEQs in
addition to the ACKs.
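
The loosely updated shared limit described above can be illustrated with
a small user-space sketch. It is not the code from this commit: NCPU,
struct cpu_ctx, state_alloc(), state_free(), state_total() and the
threshold formula are all hypothetical names and choices made for
illustration. Each CPU folds its changes into the shared count only once
a per-cpu threshold is reached, so the shared count can lag the true
total by at most about 5% of the limit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NCPU 4                  /* hypothetical CPU count */

struct cpu_ctx {
    int state_cnt;              /* states owned by this CPU (exact) */
    int delta;                  /* changes not yet folded into loose_cnt */
};

static struct cpu_ctx cpu_ctx[NCPU];
static atomic_int loose_cnt;    /* shared, approximate total */
static int state_max = 1000;    /* shared upper limit */

/* Per-CPU flush threshold: all CPUs together hide at most 5% of the limit. */
static int flush_thresh(void)
{
    int t = state_max / 20 / NCPU;

    return t > 0 ? t : 1;
}

/* Publish this CPU's pending changes to the shared counter. */
static void fold_delta(struct cpu_ctx *ctx)
{
    atomic_fetch_add(&loose_cnt, ctx->delta);
    ctx->delta = 0;
}

/* Allocate one state on 'cpu'; fails once the loose total hits the limit. */
bool state_alloc(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cpu_ctx *ctx = &cpu_ctx[cpu];

    if (atomic_load(&loose_cnt) >= state_max)
        return false;
    ctx->state_cnt++;
    if (++ctx->delta >= flush_thresh())
        fold_delta(ctx);
    return true;
}

/* Free one state on 'cpu'; the shared counter is again updated lazily. */
void state_free(int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cpu_ctx *ctx = &cpu_ctx[cpu];

    assert(ctx->state_cnt > 0);
    ctx->state_cnt--;
    if (--ctx->delta <= -flush_thresh())
        fold_delta(ctx);
}

/* Exact total, summed over all CPUs (the slow path). */
int state_total(void)
{
    int total = 0;

    for (int i = 0; i < NCPU; i++)
        total += cpu_ctx[i].state_cnt;
    return total;
}
```

The common path touches only per-cpu data. The price is that the exact
total may exceed state_max by a bounded amount before allocation starts
failing. A per-track limit is typically far smaller than the threshold
granularity used here, which is consistent with the choice above to
update track counters with atomic ops instead.
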
This drastically improves performance, and reduces/stabilizes latency.
For example, nginx, 1KB web object, 30K concurrent connections,
1 request/connection. ipfw is running on the server side.
ipfw non-default settings:
- Max # of states for new-ipfw is 100K (~14MB memory).
- Max # of states for old-ipfw is 500K, and # of hash buckets is 64K.
ipfw rules:
ipfw add 1 check-state
ipfw add allow tcp from any to me 80 setup keep-state
(default deny)
         | perf-avg  | lat-avg | lat-stdev | lat-99% | lat-max
         |   (tps)   |  (ms)   |   (ms)    |  (ms)   |  (ms)
---------+-----------+---------+-----------+---------+---------
no-ipfw  | 210658.80 |   58.01 |      5.20 |   68.73 |  146.46
---------+-----------+---------+-----------+---------+---------
new-ipfw | 191626.58 |   64.74 |      5.69 |   75.87 |  166.08
---------+-----------+---------+-----------+---------+---------
old-ipfw |  43481.19 |  153.76 |     47.32 |  296.61 |  425.09
Compared with the no-ipfw case, the performance and latency impact of
ipfw after this commit is small: about 9% lower throughput and about
7 ms higher average latency.