kernel - Refactor smp collision statistics * Add an indefinite wait timing API (sys/indefinite.h, sys/indefinite2.h). This interface uses the TSC and will record lock latencies to our pcpu stats in microseconds. The systat -pv 1 display shows this under smpcoll. Note that latencies generated by tokens, lockmgr, and mutex locks do not necessarily reflect actual lost cpu time as the kernel will schedule other threads while those are blocked, if other threads are available. * Formalize TSC operations more, supply a type (tsc_uclock_t and tsc_sclock_t). * Reinstrument lockmgr, mutex, token, and spinlocks to use the new indefinite timing interface.
ifnet: Make ifnet and ifindex2ifnet MPSAFE - Accessing to these two global variables from non-netisr threads uses ifnet lock. This kind of accessing is from - Accessing to ifindex2ifnet from netisrs are lockless MPSAFE. - Netisrs no longer access ifnet, instead they access ifnet array as of this commit, which is lockless MPSAFE. Rules for accessing ifnet and ifindex2ifnet is commented near the declaration of the related global variables/functions in net/if_var.h.
ifsq: Let ifaltq_subque know its related hardware TX queue's serializer This avoids following operations on packet transmission hot path: - Dereferening device driver supplied serialize function pointers - Locating hardware TX queue's serializer Comparing to the lwkt_serialize functions, the above two operations are costful. Driver changes: - For device drivers which use the default ifnet serializer, no additional code will be needed, if_attach() will assign ifnet serializer to ifaltq_subque. - For device drivers which use independent serializers for main function, RX queues and TX queues, ifsq_set_hw_serialize() must be called to properly assign the hardware TX queue's serializer to ifaltq_subque. Drivers in this category are bce(4), emx(4), igb(4) and jme(4).
if: Multiple TX queue support step 3 of 3; map CPUID to subqueue Add CPUID to subqueue mapping method to ifaltq. Driver could provide its own CPUID to subqueue mapping method through ifnet.if_mapsubq, which is used when ALTQ's packet scheduler is not enabled. ALTQ's packet schedulers always map CPUID to the default subqueue.
if: Multiple TX queue support step 1 of many; introduce ifaltq subqueue Put the plain queue information, e.g. queue header and tail, serializer, packet staging scoreboard and ifnet.if_start schedule netmsg etc. into its own structure (subqueue). ifaltq structure could have multiple of subqueues based on the count that drivers can specify. Subqueue's enqueue, dequeue, purging and states updating are protected by the subqueue's serializer, so for hardwares supporting multiple TX queues, contention on queuing operation could be greatly reduced. The subqueue is passed to if_start to let the driver know which hardware TX queue to work on. Only the related driver's TX queue serializer will be held, so for hardwares supporting multiple TX queues, contention on driver's TX queue serializer could be greatly reduced. Bunch of ifsq_ prefixed functions are added, which is used to perform various operations on subqueues. Commonly used ifq_ prefixed functions are still kept mainly for the drivers which do not support multiple TX queues (well, these functions also ease the netif/ convertion in this step :). All of the pseudo network devices under sys/net are converted to use the new subqueue operation. netproto/802_11 is converted too. igb(4) is converted to use the new subqueue operation, the rest of the network drivers are only changed for the if_start interface modification. For ALTQs which have packet scheduler enabled, only the first subqueue is used (*). (*) Whether we should utilize multiple TX queues if ALTQ's packet scheduler is enabled is quite questionable. Mainly because hardware's multiple TX queue packet dequeue mechanism could have negative impact on ALTQ's packet scheduler's decision.