dragonfly.git
5 years agokernel/iscsi: Do not conditionalize on undefined names.
Sascha Wildner [Sat, 22 Dec 2012 19:27:36 +0000 (20:27 +0100)]
kernel/iscsi: Do not conditionalize on undefined names.

5 years agokernel/procfs: Silence gcc47's whining.
Sascha Wildner [Sat, 22 Dec 2012 12:19:56 +0000 (13:19 +0100)]
kernel/procfs: Silence gcc47's whining.

5 years agoifq_dispatch: Avoid accessing the mbuf after it has been queued to if_snd
Sepherosa Ziehau [Sat, 22 Dec 2012 15:32:20 +0000 (23:32 +0800)]
ifq_dispatch: Avoid accessing the mbuf after it has been queued to if_snd

The enqueued mbuf could have been freed when it is used to update the stats.

5 years agoarcmsr(4): Add missing D_MPSAFE (forgot when porting).
Sascha Wildner [Fri, 21 Dec 2012 07:49:09 +0000 (08:49 +0100)]
arcmsr(4): Add missing D_MPSAFE (forgot when porting).

5 years agokernel/mmc: Remove an unused variable.
Sascha Wildner [Sat, 22 Dec 2012 11:40:20 +0000 (12:40 +0100)]
kernel/mmc: Remove an unused variable.

5 years agoarcmsr(4): Remove some dead code and an unused variable.
Sascha Wildner [Fri, 21 Dec 2012 07:20:11 +0000 (08:20 +0100)]
arcmsr(4): Remove some dead code and an unused variable.

Interrupts won't be enabled again in arcmsr_shutdown().

5 years agoifq_dispatch: If mbuf can't be enqueued and ifq has data; kick if_start
Sepherosa Ziehau [Fri, 21 Dec 2012 05:50:20 +0000 (13:50 +0800)]
ifq_dispatch: If mbuf can't be enqueued and ifq has data; kick if_start

5 years agoip_forward: Optimize out the mbuf allocation for ICMP messages
Sepherosa Ziehau [Fri, 21 Dec 2012 05:00:12 +0000 (13:00 +0800)]
ip_forward: Optimize out the mbuf allocation for ICMP messages

A per-netisr mbuf template is used to save the necessary information
for later ICMP messages; this avoids unnecessary mbuf allocation
on forwarding path.  The mbufs for ICMP messages are allocated only
when ICMP messages do need to be sent.

Inspired-by: OpenBSD ip_forward
5 years agoip_output: Don't drop packet based on if_snd queue length
Sepherosa Ziehau [Thu, 20 Dec 2012 08:29:19 +0000 (16:29 +0800)]
ip_output: Don't drop packet based on if_snd queue length

Later if_snd queue enqueuing should make this decision

5 years agoinstaller: Remove CAPS remains.
Sascha Wildner [Thu, 20 Dec 2012 04:54:44 +0000 (05:54 +0100)]
installer: Remove CAPS remains.

5 years agoshare/Makefile: Break at initial letter boundary and sort terminfo.
Sascha Wildner [Mon, 17 Dec 2012 19:05:03 +0000 (20:05 +0100)]
share/Makefile: Break at initial letter boundary and sort terminfo.

5 years agoinstaller: Use the LiveDVD's pfi.conf generally (works for LiveCD too).
Sascha Wildner [Mon, 17 Dec 2012 17:23:13 +0000 (18:23 +0100)]
installer: Use the LiveDVD's pfi.conf generally (works for LiveCD too).

5 years agoRevert "mknod(2): Restrict functionality to creating FIFOs."
Sascha Wildner [Thu, 20 Dec 2012 03:54:32 +0000 (04:54 +0100)]
Revert "mknod(2): Restrict functionality to creating FIFOs."

This reverts commit d5056fe0532f6e09c1c52b6384f3ef6e6db77a68.

After the commit, stuff like cpdup, tar, etc. used on dev would
start whining when before they would create the nodes, even
though they were not actually usable.

Since the potential breakage in external software is unknown, make
up my mind and go with the lower risk approach, even if it is kind
of pointless.

The mknod(8) utility is left deleted.

5 years agoRemove the mknod(8) utility.
Sascha Wildner [Thu, 20 Dec 2012 03:09:25 +0000 (04:09 +0100)]
Remove the mknod(8) utility.

Now that we have devfs(5), it's of no use anymore.

5 years agomknod(2): Restrict functionality to creating FIFOs.
Sascha Wildner [Thu, 20 Dec 2012 02:57:05 +0000 (03:57 +0100)]
mknod(2): Restrict functionality to creating FIFOs.

Now that we have devfs(5) for handling our device nodes, we can retire
part of mknod(2) functionality and restrict it to what POSIX requires:

"The only portable use of mknod() is to create a FIFO-special file.
 If mode is not S_IFIFO or dev is not 0, the behavior of mknod() is
 unspecified."

In-discussion-with: beket
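
The portable use quoted above can be sketched as follows. This is a minimal
illustration, assuming a POSIX system; make_fifo() and is_fifo() are
hypothetical helper names, and the 0600 mode is an arbitrary choice.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a FIFO through mknod(2): mode carries S_IFIFO and dev is 0,
 * which is the only use POSIX describes as portable.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
make_fifo(const char *path)
{
    assert(path != NULL);
    return mknod(path, S_IFIFO | 0600, 0);
}

/* Return nonzero if path exists and is a FIFO. */
int
is_fifo(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return S_ISFIFO(st.st_mode);
}
```

Any other file type passed in mode falls under the "unspecified" clause of
the quote.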

5 years agostandards.7: Add URLs for a couple of standards.
Sascha Wildner [Thu, 20 Dec 2012 02:26:49 +0000 (03:26 +0100)]
standards.7: Add URLs for a couple of standards.

5 years agomdoc: Add definition for XSH, Issue 4, Version 2.
Sascha Wildner [Thu, 20 Dec 2012 02:26:25 +0000 (03:26 +0100)]
mdoc: Add definition for XSH, Issue 4, Version 2.

5 years agotools/toeplitz: Force 0 padding in result printing
Sepherosa Ziehau [Wed, 19 Dec 2012 10:00:32 +0000 (18:00 +0800)]
tools/toeplitz: Force 0 padding in result printing

5 years agopktgenctl: Allow pktgen device to be specified
Sepherosa Ziehau [Wed, 19 Dec 2012 09:12:49 +0000 (17:12 +0800)]
pktgenctl: Allow pktgen device to be specified

5 years agopktgen: Create 4 device nodes by default
Sepherosa Ziehau [Wed, 19 Dec 2012 09:12:23 +0000 (17:12 +0800)]
pktgen: Create 4 device nodes by default

5 years agopktgen: This module is MPSAFE
Sepherosa Ziehau [Wed, 19 Dec 2012 09:07:21 +0000 (17:07 +0800)]
pktgen: This module is MPSAFE

While I'm here, nuke no longer needed CDEV_MAJOR

5 years agonetisr: Remove unused macros
Sepherosa Ziehau [Tue, 18 Dec 2012 13:53:12 +0000 (21:53 +0800)]
netisr: Remove unused macros

5 years agonetisr: Add priority for netisr "rollup" functions
Sepherosa Ziehau [Tue, 18 Dec 2012 13:47:39 +0000 (21:47 +0800)]
netisr: Add priority for netisr "rollup" functions

Higher priority "rollup" will be run first.  This mechanism augments
the original "rollup" functionality which now could be used to implement
things like transmission packets aggregation and software TCP LRO.
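
A rough sketch of priority-ordered rollups, assuming a small fixed table kept
sorted at registration time.  All names here (rollup_register, rollup_run_all,
the two example callbacks and the trace buffer) are hypothetical and are not
the netisr API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ROLLUPS 8

typedef void (*rollup_fn)(void);

struct rollup {
    rollup_fn fn;
    int prio;
};

struct rollup rollups[MAX_ROLLUPS];
size_t nrollups;

/* Trace buffer used by the example callbacks below. */
char rollup_trace[16];
size_t rollup_trace_len;

/* Insert keeping the table sorted by descending priority. */
int
rollup_register(rollup_fn fn, int prio)
{
    size_t i;

    assert(fn != NULL);
    if (nrollups == MAX_ROLLUPS)
        return -1;
    for (i = nrollups; i > 0 && rollups[i - 1].prio < prio; i--)
        rollups[i] = rollups[i - 1];
    rollups[i].fn = fn;
    rollups[i].prio = prio;
    nrollups++;
    return 0;
}

/* Run every registered rollup, highest priority first. */
void
rollup_run_all(void)
{
    size_t i;

    for (i = 0; i < nrollups; i++)
        rollups[i].fn();
}

/* Example callbacks standing in for LRO and TX aggregation. */
void
rollup_lro(void)
{
    if (rollup_trace_len < sizeof(rollup_trace))
        rollup_trace[rollup_trace_len++] = 'L';
}

void
rollup_txagg(void)
{
    if (rollup_trace_len < sizeof(rollup_trace))
        rollup_trace[rollup_trace_len++] = 'T';
}
```

Sorting once at registration keeps the per-run loop a plain array walk.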

5 years agopolling: Increase default polling rate to 6000Hz
Sepherosa Ziehau [Tue, 18 Dec 2012 09:38:51 +0000 (17:38 +0800)]
polling: Increase default polling rate to 6000Hz

This increases the bidirectional normal IP forwarding rate by 30~40Kpps

5 years agoif_start: Fix indentation
Sepherosa Ziehau [Tue, 18 Dec 2012 08:26:46 +0000 (16:26 +0800)]
if_start: Fix indentation

5 years agoinstaller: Always take the root directory's /dev.
Sascha Wildner [Mon, 17 Dec 2012 21:13:38 +0000 (22:13 +0100)]
installer: Always take the root directory's /dev.

Taking /dev relative to the directory we want to copy from was fine
until we got devfs, because we shipped actual device nodes in /dev
until then.

It only continued working because the directory we copy from is always
the distribution media's root directory currently.

5 years agokernel/atm: Fix wrong rt_tables[] access.
Sascha Wildner [Mon, 17 Dec 2012 08:25:33 +0000 (09:25 +0100)]
kernel/atm: Fix wrong rt_tables[] access.

5 years agoif_start: Fix a race that could delay the packets transmission
Sepherosa Ziehau [Mon, 17 Dec 2012 04:18:08 +0000 (12:18 +0800)]
if_start: Fix a race that could delay the packets transmission

Since if_start_need_schedule is called w/ the cached IFF_OACTIVE
outside of ifnet's TX serializer, there could be a race where IFF_OACTIVE
is cleared before if_start_need_schedule but after releasing
ifnet's TX serializer.  This could delay transmission of already queued
packets until a new packet arrives.  Fix this race by calling
if_start_need_schedule inside ifnet's TX serializer.

5 years agoRemove VFS_INIT(9) manpage via 'make upgrade'.
Sascha Wildner [Sun, 16 Dec 2012 21:07:37 +0000 (22:07 +0100)]
Remove VFS_INIT(9) manpage via 'make upgrade'.

5 years agosglist.9: Add a missing include to the SYNOPSIS.
Sascha Wildner [Sun, 16 Dec 2012 20:37:55 +0000 (21:37 +0100)]
sglist.9: Add a missing include to the SYNOPSIS.

5 years agoRemove VFS_INIT.9 manual page, there is no such macro.
Sascha Wildner [Sun, 16 Dec 2012 18:38:39 +0000 (19:38 +0100)]
Remove VFS_INIT.9 manual page, there is no such macro.

5 years agoVFS_MOUNT.9: Adjust to the current state in /usr/src.
Sascha Wildner [Sun, 16 Dec 2012 18:30:57 +0000 (19:30 +0100)]
VFS_MOUNT.9: Adjust to the current state in /usr/src.

5 years agoUpdate the pciconf(8) database.
Sascha Wildner [Sat, 15 Dec 2012 10:31:44 +0000 (11:31 +0100)]
Update the pciconf(8) database.

December 13, 2012 snapshot from http://pciids.sourceforge.net/

5 years agohammer2 - Split flush code out into its own source file
Matthew Dillon [Sat, 15 Dec 2012 07:26:08 +0000 (23:26 -0800)]
hammer2 - Split flush code out into its own source file

* Add hammer2_flush.c, move flush code into its own source file.

5 years agokernel - Fix buffer cache mismatch assertion (hammer)
Matthew Dillon [Fri, 14 Dec 2012 22:42:52 +0000 (14:42 -0800)]
kernel - Fix buffer cache mismatch assertion (hammer)

* Fix an issue where cluster_write() could instantiate buffers with
  the wrong buffer size.  Only affects HAMMER1 which uses two different
  buffer sizes for files.

* Bug could cause a mismatched buffer size assertion in the kernel.

5 years agolibkiconv: Remove unneeded SHLIBDIR in the Makefile.
Sascha Wildner [Fri, 14 Dec 2012 21:33:56 +0000 (22:33 +0100)]
libkiconv: Remove unneeded SHLIBDIR in the Makefile.

5 years agokernel - ufs softdep fix under heavy load
Matthew Dillon [Fri, 14 Dec 2012 21:05:00 +0000 (13:05 -0800)]
kernel - ufs softdep fix under heavy load

Fix is from OpenBSD ffs_softdep.c v1.79, originally from FreeBSD
ffs_softdep.c 1.196. Commit message from OpenBSD:

"due to ffs_sync not be able to sync some buffers here is another
instance of softdep code that must ensure proper syncing.
try harder to flush MKDIR_BODY dependancy if such still exists
during pagedep flush (that is by syncing first block of the dir)."

5 years agowlandebug.8: Don't reference manual pages which we don't have.
Sascha Wildner [Fri, 14 Dec 2012 16:59:38 +0000 (17:59 +0100)]
wlandebug.8: Don't reference manual pages which we don't have.

Instead, point at the tools directories in our tree.

5 years agopolling: Increase default rx.each_burst to 50
Sepherosa Ziehau [Fri, 14 Dec 2012 09:15:49 +0000 (17:15 +0800)]
polling: Increase default rx.each_burst to 50

With this default the CPU usage can still be throttled to the desired
value (rx.user_frac); it gives a reasonable burst for modern systems, and
the number of empty RX polls is reduced.

5 years agopolling: Diverge each CPU's polling frequency a little bit (within 50Hz)
Sepherosa Ziehau [Fri, 14 Dec 2012 08:50:04 +0000 (16:50 +0800)]
polling: Diverge each CPU's polling frequency a little bit (within 50Hz)

This avoids a possible thundering herd effect on ifnet.if_snd's serializer.

5 years agohammer2 - bigger stabilization & performance pass
Matthew Dillon [Fri, 14 Dec 2012 07:23:21 +0000 (23:23 -0800)]
hammer2 - bigger stabilization & performance pass

* Fix additional namecache bogons that could result in a crash.

* Fix volume header synchronization.  It was possible for the voldata
  structure to be modified after its crc had been calculated but before
  its write.  A mount after a crash would then refuse to use the volume
  header.

* Each flush now iterates available volume header backups instead of
  just writing to block 0.  (The mount code selects the most recent
  valid volume header from available backups.  There is nothing special
  about the volume header in block 0).

* Fix volume header flush staging, the fsync of the device buffers
  was not ensuring a complete flush before synchronizing the volume
  header.
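
The mount-time selection described above (most recent valid header among the
backups) could look roughly like this.  The struct layout, the tid field and
the crc_ok flag are assumptions made for the sketch, not the hammer2 on-media
format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory summary of one volume header copy. */
struct volhdr_sketch {
    uint64_t tid;   /* assumed monotonically increasing flush id */
    int crc_ok;     /* nonzero if the stored crc matched */
};

/*
 * Return the index of the valid header with the highest tid,
 * or -1 if no copy passes its crc check.  Index 0 gets no
 * special treatment.
 */
int
pick_volhdr(const struct volhdr_sketch *h, size_t n)
{
    int best = -1;
    size_t i;

    assert(h != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!h[i].crc_ok)
            continue;
        if (best < 0 || h[i].tid > h[best].tid)
            best = (int)i;
    }
    return best;
}
```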

5 years agoipfw: Don't spam the log if dynamic rules allocation failed
Sepherosa Ziehau [Fri, 14 Dec 2012 05:05:43 +0000 (13:05 +0800)]
ipfw: Don't spam the log if dynamic rules allocation failed

There is high chance that kmalloc w/ M_NOWAIT fails; no need to bark

5 years agohammer2 - small stabilization & performance pass
Matthew Dillon [Fri, 14 Dec 2012 03:18:33 +0000 (19:18 -0800)]
hammer2 - small stabilization & performance pass

* Change a few bawrite()'s to bdwrite()'s so the bufdaemon can cluster
  the meta-data.  This greatly improves continuous flushes (e.g. when
  running blogbench).

* Fix a number of improper cache_setvp() calls which can cause the
  system to panic later.  Basically the vnode was being cleared without
  the namecache error getting set (to ENOENT).  Easiest solution is to
  just not call cache_setvp() for those cases in the first place and
  just leave the ncp unresolved.

5 years agopktgen: Clear ip.ip_sum before calling in_cksum_hdr()
Sepherosa Ziehau [Fri, 14 Dec 2012 01:45:51 +0000 (09:45 +0800)]
pktgen: Clear ip.ip_sum before calling in_cksum_hdr()

5 years agohammer2 - redo the flush collision handling
Matthew Dillon [Thu, 13 Dec 2012 23:26:28 +0000 (15:26 -0800)]
hammer2 - redo the flush collision handling

* Add HAMMER2_CHAIN_ONRBTREE and change HAMMER2_CHAIN_DELETED.  The DELETED
  flag now indicates a chain has been deleted (as in unlink, rmdir, truncate,
  etc) but could not be removed due to a conflicting flush.

  DELETED means something different from dropped chains with 0 refs which
  wind up sticking around due to the lastdrop code not being able to
  acquire a lock on the parent or colliding with a flush.  0-ref chains
  can be considered to cache clean media state when it comes down to it.
  (modified chains have a ref and so don't hit the lastdrop code).

* The flush code is now able to reliably unlock the chain parent when
  processing a child.

* Clean up a number of flush cases.

* Began to add error handling in the hammer2_chain_create() path.  The path
  now handles EAGAIN errors when insertions would collide with a flush.
  (The wait/retry code is currently just a sleep/retry).

* Removed the MAYDELETE junk.  It was too junky.

5 years agokernel: Remove USERFS.
Sascha Wildner [Thu, 13 Dec 2012 18:05:39 +0000 (19:05 +0100)]
kernel: Remove USERFS.

It's an old project of Matt, but won't be continued anymore.

Approved-by: dillon
5 years agohammer2 - Cleanup various races, better flush
Matthew Dillon [Thu, 13 Dec 2012 08:08:05 +0000 (00:08 -0800)]
hammer2 - Cleanup various races, better flush

* Cleanup various topological scan races

* Temporarily release the parent lock when diving a child.

* Start work on a chain movement (disconnect from parent) interlock,
  which we need for stability during flushes, renames, and hardlink
  operations.  This will likely be rewritten.

5 years agokernel - Fix sync() system call
Matthew Dillon [Thu, 13 Dec 2012 07:19:43 +0000 (23:19 -0800)]
kernel - Fix sync() system call

* The sync() system call was syncing the filesystems MNT_NOWAIT | MNT_LAZY.
  We need the MNT_NOWAIT to avoid an endless sync on a busy filesystem, but
  MNT_LAZY is another issue entirely.

* Remove the MNT_LAZY from the sync() system call, it can cause whole
  files to not be synced.  It is meant only to be used by the automatic
  kernel 30-second sync (which eventually gets everything flushed out).

5 years agothread: Add td_type field; this avoids blowout td_flags w/ type flags
Sepherosa Ziehau [Thu, 13 Dec 2012 06:12:16 +0000 (14:12 +0800)]
thread: Add td_type field; this avoids blowout td_flags w/ type flags

- Replace TDF_CRYPTO with TD_TYPE_CRYPTO
- All lwkt threads are created as TD_TYPE_GENERIC
- Netisr threads now mark themselves as TD_TYPE_NETISR

Discussed-with: sjg@ and dillon@
Approved-by: dillon@
5 years agokernel - Fix missing B_ORDERED inheritance
Matthew Dillon [Thu, 13 Dec 2012 04:09:45 +0000 (20:09 -0800)]
kernel - Fix missing B_ORDERED inheritance

* The cluster code was not inheriting B_ORDERED on buffers when constructing
  the rollup buffer due to a coding error.

* I don't think anything uses B_ORDERED so this shouldn't matter, but fix
  it anyway.

Reported-by: vsrinivas
5 years agokernel -- ffs: ufs_ihash may not match for vnode recycling reasons.
Venkatesh Srinivas [Wed, 12 Dec 2012 20:17:50 +0000 (12:17 -0800)]
kernel -- ffs: ufs_ihash may not match for vnode recycling reasons.

* vnode recycling may cause VGET to not find an inode; handle the
  race as before, it was not a bug.

* print diradd pointer in ffs MKDIR_BODY panics; will help finding
  the panic, it is masked by the inode hash panic.

Reviewed-by: sjg
5 years agoRemove upc_{control,register} syscalls and everything that has to do with it.
Sascha Wildner [Wed, 12 Dec 2012 20:40:16 +0000 (21:40 +0100)]
Remove upc_{control,register} syscalls and everything that has to do with it.

It's no longer used for anything.

Requested-by: vsrinivas
Approved-by: dillon
5 years agobce: Disable RX max BDs based interrupt moderation
Sepherosa Ziehau [Tue, 11 Dec 2012 11:20:17 +0000 (19:20 +0800)]
bce: Disable RX max BDs based interrupt moderation

The RX max coalesce BDs is limited to 255, which means that the chip will
generate ~5800 interrupts/s when it sinks 1.48Mpps tiny packets.  However,
interrupt rate at 4500Hz is already enough for the chip to sink 1.48Mpps
tiny packets, so ticks based RX interrupt moderation should be preferred.
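
The ~5800 figure follows from dividing the packet rate by the coalescing
limit.  A minimal sketch of that arithmetic, assuming one buffer descriptor
per tiny packet; intr_rate() is a hypothetical helper, not a bce(4) function.

```c
#include <assert.h>

/*
 * Interrupts per second when the chip raises one interrupt for
 * every max_bds received buffer descriptors.
 */
unsigned long
intr_rate(unsigned long pkts_per_sec, unsigned int max_bds)
{
    assert(max_bds > 0);
    return pkts_per_sec / max_bds;
}
```

With 1480000 packets/s and a limit of 255 BDs this gives 5803 interrupts/s,
well above the 4500Hz that the message says is already sufficient.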

5 years agokernel/makesyscalls.sh: Fix copy/paste error.
Sascha Wildner [Wed, 12 Dec 2012 06:01:08 +0000 (07:01 +0100)]
kernel/makesyscalls.sh: Fix copy/paste error.

Reported-by: vsrinivas
5 years agokernel/makesyscalls.sh: Improve comment and regenerate all affected files.
Sascha Wildner [Tue, 11 Dec 2012 22:29:52 +0000 (23:29 +0100)]
kernel/makesyscalls.sh: Improve comment and regenerate all affected files.

5 years agokernel/makesyscalls.sh: Output a friendlier comment about how to regenerate.
Sascha Wildner [Tue, 11 Dec 2012 22:19:43 +0000 (23:19 +0100)]
kernel/makesyscalls.sh: Output a friendlier comment about how to regenerate.

5 years agokernel - Reduce the size of the callout wheel
Matthew Dillon [Mon, 10 Dec 2012 23:11:46 +0000 (15:11 -0800)]
kernel - Reduce the size of the callout wheel

* The callout wheel is per-cpu but ncallout is calculated based on memory.
  A system with many cpus tended to allocate an excessive amount of memory
  in aggregate for the callout wheels.

* Reduce the size of the per-cpu callout wheel by approximately a factor
  of (ncpus).  On a 16G machine with 8 cores, the aggregate callout wheel
  allocation is reduced from 128MB to 16MB.
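
The numbers above can be checked with a small calculation.  This sketch
assumes a per-cpu wheel of 16MB before the change (128MB spread over 8 cpus);
that per-cpu figure is derived here and is not stated in the commit, and
aggregate_wheel_bytes() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Aggregate memory used by all per-cpu callout wheels.
 * percpu_bytes is the memory-derived wheel size; when
 * scale_by_ncpus is nonzero each wheel is shrunk by ncpus.
 */
unsigned long
aggregate_wheel_bytes(unsigned long percpu_bytes, unsigned int ncpus,
    int scale_by_ncpus)
{
    assert(ncpus > 0);
    if (scale_by_ncpus)
        percpu_bytes /= ncpus;
    return percpu_bytes * ncpus;
}
```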

5 years agokernel - Fix softupdates panic with UFS
Matthew Dillon [Mon, 10 Dec 2012 23:02:01 +0000 (15:02 -0800)]
kernel - Fix softupdates panic with UFS

* If getdirtybuf() was unable to lookup a dirty buffer from the
  flush_pagedep_deps path, we need to retry the lookup, rather than
  proceeding through processing the diradd.

Reported-by: marino@
Submitted-by: vsrinivas
5 years agokernel - Remove unnecessary mplock from ata I/O path
Matthew Dillon [Mon, 10 Dec 2012 22:59:38 +0000 (14:59 -0800)]
kernel - Remove unnecessary mplock from ata I/O path

* ata_finish() doesn't need the MP Lock; moving it from taskqueue_swi
  to taskqueue_swi_mp will make sure it's not taken.

* Note that taskqueue_swi(_mp) will be processed when this ithread
  attempts to switch back to the thread it preempted. This means that
  taskqueue_swi(_mp) processing excludes processing further interrupts
  on the ithread while they're running. This may not be desirable and
  is different than taskqueue_swi / swi_* / setsoft* in FreeBSD 4.x.

Submitted-by: vsrinivas
5 years agokernel - Make UFS ihash table per-mount
Matthew Dillon [Mon, 10 Dec 2012 22:55:59 +0000 (14:55 -0800)]
kernel - Make UFS ihash table per-mount

* Make the UFS ihash table per-mount.

* Scale down the size of the hash table a bit so we have ~4 inodes per
  bucket instead of ~1.  Works fine for a single mount and this way
  multiple UFS mounts don't make [as] bloated kmalloc calls.

Submitted-by: vsrinivas
5 years agokernel - Fix bug (not reached in normal operation) in vm_map_set_wired_quick()
Matthew Dillon [Mon, 10 Dec 2012 22:37:28 +0000 (14:37 -0800)]
kernel - Fix bug (not reached in normal operation) in vm_map_set_wired_quick()

* Fix a bug where vm_map_set_wired_quick() only operated on the first
  vm_map_entry of a slab, and would panic if there were more.

* Slab allocations are only going to have one vm_map_entry anyway so the
  bug was never hit.  But we fix it anyway.

5 years agokernel - Fix debug output label
Matthew Dillon [Mon, 10 Dec 2012 22:35:39 +0000 (14:35 -0800)]
kernel - Fix debug output label

* Fix "rflags" to "eflags" in i386 kprintf() for smp_invltlb() debugging.

Reported-by: swildner
5 years agoFix buildkernel with 'options KTR' in the config.
Sascha Wildner [Mon, 10 Dec 2012 18:51:12 +0000 (19:51 +0100)]
Fix buildkernel with 'options KTR' in the config.

5 years agobge: Obey the RX polling count
Sepherosa Ziehau [Mon, 10 Dec 2012 12:31:28 +0000 (20:31 +0800)]
bge: Obey the RX polling count

5 years agobge: Avoid unnecessary local scope stack variable resetting
Sepherosa Ziehau [Mon, 10 Dec 2012 12:16:36 +0000 (20:16 +0800)]
bge: Avoid unnecessary local scope stack variable resetting

5 years agopolling: Add tunable for net.ifpoll.X.rx.user_frac
Sepherosa Ziehau [Mon, 10 Dec 2012 09:09:01 +0000 (17:09 +0800)]
polling: Add tunable for net.ifpoll.X.rx.user_frac

5 years agobnx: Avoid unnecessary local scope stack variable resetting
Sepherosa Ziehau [Mon, 10 Dec 2012 08:45:34 +0000 (16:45 +0800)]
bnx: Avoid unnecessary local scope stack variable resetting

5 years agoagp: Do not limit attachment to primary devices
François Tigeot [Sat, 8 Dec 2012 08:55:02 +0000 (09:55 +0100)]
agp: Do not limit attachment to primary devices

* PCIS_DISPLAY_VGA really corresponds to the first graphic device
  initialized by the BIOS at boot time.

* Recent Intel chips contain both AGP and graphic hardware, identified by
  the same PCI ids

* The agp device thus has no associated PCIS_DISPLAY_VGA flag when the
  Intel graphic device is not set as primary display in BIOS

* Tested with:
  - ATI Radeon X550 (primary graphic card)
  - Intel Xeon E3-1245v2 (agp device)

5 years agoagp: Add PCI ID for Ivy Bridge GT2 server variant
François Tigeot [Sat, 8 Dec 2012 07:40:29 +0000 (08:40 +0100)]
agp: Add PCI ID for Ivy Bridge GT2 server variant

5 years agoagp: A rewrite of the i810 bits of the driver
François Tigeot [Sat, 8 Dec 2012 22:31:50 +0000 (23:31 +0100)]
agp: A rewrite of the i810 bits of the driver

New driver supports operations required by GEMified i915.ko. It
also attaches to SandyBridge and IvyBridge CPU northbridges now.

Obtained-from: FreeBSD

5 years agobnx: Obey the RX polling count
Sepherosa Ziehau [Sun, 9 Dec 2012 13:12:09 +0000 (21:12 +0800)]
bnx: Obey the RX polling count

5 years agoigb: The RDT writing thresh should be tested w/ ">=" instead of ">"
Sepherosa Ziehau [Sun, 9 Dec 2012 13:08:08 +0000 (21:08 +0800)]
igb: The RDT writing thresh should be tested w/ ">=" instead of ">"

5 years agobnx: Remove redundant TX/RX index reading from polling code
Sepherosa Ziehau [Sun, 9 Dec 2012 12:15:49 +0000 (20:15 +0800)]
bnx: Remove redundant TX/RX index reading from polling code

5 years agokernel: Add cpu_wbinvd_on_all_cpus()
François Tigeot [Sun, 9 Dec 2012 08:38:08 +0000 (09:38 +0100)]
kernel: Add cpu_wbinvd_on_all_cpus()

This function invalidates cache on all active cpus

With-help-from: vsrinivas

5 years agopktgen: Fix csum_data setting for in_delayed_csum()
Sepherosa Ziehau [Sun, 9 Dec 2012 10:42:25 +0000 (18:42 +0800)]
pktgen: Fix csum_data setting for in_delayed_csum()

5 years agoigb: Improve tiny packets reception performance on low frequency CPU
Sepherosa Ziehau [Sun, 9 Dec 2012 06:36:20 +0000 (14:36 +0800)]
igb: Improve tiny packets reception performance on low frequency CPU

Update the RDT register a little more often, so the RX descriptors are made
available to the NIC chip on a more regular basis:
The RDT register is updated after a certain amount of RX descriptors are
added to the hardware RX ring.  The default value of the amount of RX
descriptors is 32.  This value could be further tuned by per-device
sysctl node hw.igbX.rxY_wreg.

The default value improves tiny packets reception performance w/ 82576
on AMD970@800Mhz under interrupt mode for single stream (1.28Mpps ->
1.48Mpps) and it does not increase CPU usage on AMD970@3500Mhz (CPU
usage stays @36%).

This commit does _not_ seem to affect the tiny packet reception
performance when the workload is evenly distributed to all CPUs.
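
A sketch of the batched tail-register update described above, assuming a
software model of the ring.  struct rx_ring_sketch and rx_refill_one() are
hypothetical; the tail field stands in for the hardware RDT register, and the
threshold is tested with ">=".

```c
#include <assert.h>
#include <stddef.h>

struct rx_ring_sketch {
    unsigned int ndesc;        /* descriptors in the ring */
    unsigned int wreg_thresh;  /* refills between tail writes, e.g. 32 */
    unsigned int next;         /* next descriptor index to refill */
    unsigned int pending;      /* refills not yet published */
    unsigned int tail;         /* models the RDT register value */
    unsigned int tail_writes;  /* number of modeled register writes */
};

/* Refill one descriptor; publish the tail once enough have piled up. */
void
rx_refill_one(struct rx_ring_sketch *r)
{
    assert(r != NULL && r->ndesc > 0 && r->wreg_thresh > 0);
    r->next = (r->next + 1) % r->ndesc;
    if (++r->pending >= r->wreg_thresh) {
        r->tail = r->next;     /* stands in for the register write */
        r->tail_writes++;
        r->pending = 0;
    }
}
```

A smaller threshold hands descriptors to the chip sooner at the cost of more
register writes.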

5 years agokernel - Fix improper assertion panic in vinvalbuf()
Matthew Dillon [Sat, 8 Dec 2012 22:22:15 +0000 (14:22 -0800)]
kernel - Fix improper assertion panic in vinvalbuf()

* Related to the removal of vhold/vdrop from buffer cache buffers, the
  state of a vnode being cleaned can now contain more active buffers and
  I/O's at the time of the vinvalbuf() call.

* Remove the 'vinvalbuf: dirty bufs' assertion and panic.  It is no longer
  a correct assertion.  Note that we've also had sporadic reports of this
  panic even prior to the work so it might not have been a completely
  correct assertion before either.

* Rework the vinvalbuf buffer-flushing and I/O-waiting code a bit.  We
  have to wait for I/O at least once and it is probably a good idea to
  wait for I/O after each buffer flush pass too, to avoid live-locks.

Reported-by: vsrinivas
5 years agokernel: add VM_OBJECT_LOCK/UNLOCK macros
François Tigeot [Mon, 6 Aug 2012 05:41:50 +0000 (07:41 +0200)]
kernel: add VM_OBJECT_LOCK/UNLOCK macros

5 years agoDocumentation: MPASS conversion
François Tigeot [Sat, 8 Dec 2012 21:09:49 +0000 (22:09 +0100)]
Documentation: MPASS conversion

5 years agokernel: Remove MPASS{,4} entirely and use KKASSERT instead.
Sascha Wildner [Sat, 8 Dec 2012 20:58:34 +0000 (21:58 +0100)]
kernel: Remove MPASS{,4} entirely and use KKASSERT instead.

In-discussion-with: ftigeot

5 years agokernel: Move MPASS and MPASS4 definitions around
François Tigeot [Mon, 6 Aug 2012 07:04:58 +0000 (09:04 +0200)]
kernel: Move MPASS and MPASS4 definitions around

They are generic enough to be used in the entire kernel, no need to
keep them in an acpica-specific file.

5 years agokernel: Import sglist subsystem from FreeBSD
François Tigeot [Mon, 6 Aug 2012 08:37:47 +0000 (10:37 +0200)]
kernel: Import sglist subsystem from FreeBSD

Fixes-from: swildner
Blessed-by: vsrinivas
5 years agokernel - Adjust NFS server for new allocvnode() code
Matthew Dillon [Sat, 8 Dec 2012 05:21:55 +0000 (21:21 -0800)]
kernel - Adjust NFS server for new allocvnode() code

* Adjust the NFS server to check for LWP_MP_VNLRU garbage collection
  requests and act on them.

  This prevents excessive allocation of vnodes by the nfsd's.

5 years agokernel - Change allocvnode() to not recursively block freeing vnodes
Matthew Dillon [Sat, 8 Dec 2012 02:52:30 +0000 (18:52 -0800)]
kernel - Change allocvnode() to not recursively block freeing vnodes

allocvnode() has caused many deadlock issues over the years, including
recent issues with softupdates, because it is often called from deep
within VFS modules and attempts to clean and free unrelated vnodes when
the vnode limit is reached to make room for the new one.

* numvnodes is not protected by any locks and needs atomic ops.

* Change allocvnode() to always allocate and not attempt to free
  other vnodes.

* allocvnode() now flags the LWP to handle reducing the number of vnodes
  in the system when it returns to userland instead.  Consolidate
  several flags into a single conditional function call, lwpuserret().

  When triggered, this code will do a limited scan of the free list to
  try to find vnodes to free.

* The vnlru_proc_wait() code existed to handle a separate algorithm
  related to vnodes with cached buffers and VM pages but represented
  a major bottleneck in the system.

  Remove vnlru_proc_wait() and allow vnodes with buffers and/or non-empty
  VM objects to be placed on the free list.

  This also requires not vhold()ing the vnode for related buffer cache
  buffer since the vnode will not go away until related buffers have been
  cleaned out.  We shouldn't need those holds.

Testing-by: vsrinivas
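
The deferral described above can be sketched in userland C, assuming C11
atomics in place of the kernel's atomic ops.  Every name below
(alloc_vnode_sketch, userret_sketch, struct lwp_sketch, the limit of 4) is
hypothetical, and the reclaim loop is a simplification of the limited free
list scan.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

atomic_int numvnodes_sketch;             /* shared counter, no lock */
const int maxvnodes_sketch = 4;          /* tiny limit for illustration */

struct lwp_sketch {
    bool need_reclaim;                   /* set by alloc, consumed at userret */
};

int
vnode_count(void)
{
    return atomic_load(&numvnodes_sketch);
}

/* Always allocates; only flags the caller when over the limit. */
void
alloc_vnode_sketch(struct lwp_sketch *lp)
{
    assert(lp != NULL);
    if (atomic_fetch_add(&numvnodes_sketch, 1) + 1 > maxvnodes_sketch)
        lp->need_reclaim = true;
}

/* Called on return to userland; frees vnodes outside any VFS path. */
int
userret_sketch(struct lwp_sketch *lp)
{
    int freed = 0;

    assert(lp != NULL);
    if (!lp->need_reclaim)
        return 0;
    lp->need_reclaim = false;
    while (atomic_load(&numvnodes_sketch) > maxvnodes_sketch) {
        atomic_fetch_sub(&numvnodes_sketch, 1);
        freed++;
    }
    return freed;
}
```

The allocation path never blocks on freeing unrelated vnodes, which is the
property the commit is after.
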
5 years agokernel - Fix filesystem lookup error due to parent directory recyclement race
Matthew Dillon [Fri, 7 Dec 2012 22:44:26 +0000 (14:44 -0800)]
kernel - Fix filesystem lookup error due to parent directory recyclement race

* When looking up a path the parent ncp's vnode is needed to pass into
  the VFS code as the directory vnode (dvp) for the element being looked up.

* Fix a timing race whereby a system under extreme vnode pressure (such as
  when kern.maxvnodes is set to a very low value) can squeak in recyclement
  of this directory vnode when there are no children under it in the
  namecache.

  We fix the problem by holding the directory vnode during the nlookup() and
  cache_resolve().

5 years agokernel - Fix issues where tmpfs loses file data
Matthew Dillon [Fri, 7 Dec 2012 22:20:31 +0000 (14:20 -0800)]
kernel - Fix issues where tmpfs loses file data

* For TMPFS, UIO_NOCOPY writes must use bawrite() or bwrite() and must NEVER
  use buwrite() because these operations are being called via the VM page
  cleaning code via the pageout daemon or the vnode termination code
  (when maxvnodes is reached), and the underlying pages are about to be
  destroyed.

* vm_object_terminate() must call vinvalbuf() both before (for normal
  filesystems) and also after (for tmpfs style filesystems).  Otherwise
  buffers potentially not disposed of during the page cleaning might
  get left hanging.  This is a safety feature.

* Remove post-flush test code from vm_object_page_collect_flush() entirely.
  The IO's are in progress at this point so it makes no sense to set
  PG_CLEANCHK here.

* vm_page_need_commit() must make the object writeable and dirty, I think.

* Fix multiple places where m->dirty is tested and PG_NEED_COMMIT is not.

Reported-by: vsrinivas, others
5 years agoetc/Makefile: Some cleanup.
Sascha Wildner [Fri, 7 Dec 2012 20:31:05 +0000 (21:31 +0100)]
etc/Makefile: Some cleanup.

* Stop creating .profile and .cshrc symlinks in the / directory.

* Silence a bmake warning (the check only applies to i386).

Reported-by: ftigeot, sephe
While here, simplify the way /COPYRIGHT is installed.

5 years agorc.conf.5: Document rc.conf.d/
Sascha Wildner [Fri, 7 Dec 2012 18:18:59 +0000 (19:18 +0100)]
rc.conf.5: Document rc.conf.d/

Taken-from:     FreeBSD
Pointed-out-by: fgudin
5 years agoiso639: Sync with Library of Congress list.
Sascha Wildner [Fri, 7 Dec 2012 11:54:14 +0000 (12:54 +0100)]
iso639: Sync with Library of Congress list.

5 years agoRemove MFILES from kernel module Makefiles. It should not be needed.
Sascha Wildner [Thu, 6 Dec 2012 11:06:06 +0000 (12:06 +0100)]
Remove MFILES from kernel module Makefiles. It should not be needed.

5 years agokern.fwd.mk: Move comments to the start of the line.
Sascha Wildner [Thu, 6 Dec 2012 10:54:31 +0000 (11:54 +0100)]
kern.fwd.mk: Move comments to the start of the line.

Else they would get output to the screen.

Reported-by: tuxillo
5 years agoagp(4): Fix some minor issues.
Sascha Wildner [Thu, 6 Dec 2012 09:36:21 +0000 (10:36 +0100)]
agp(4): Fix some minor issues.

* Move AGP_DEBUG to the global 'options' file. No need to split it by
  platform.

* Add missing file to the Makefile.

* Fix a cast.

5 years agotwe(4): Sync with FreeBSD.
Sascha Wildner [Wed, 5 Dec 2012 22:28:33 +0000 (23:28 +0100)]
twe(4): Sync with FreeBSD.

Main change is making it MPSAFE. There's also some cleanup and misc
fixes.

I tested it with an Escalade 8506-8.

5 years agoRemove some unneeded semicolons across the tree.
Sascha Wildner [Wed, 5 Dec 2012 20:21:22 +0000 (21:21 +0100)]
Remove some unneeded semicolons across the tree.

5 years agokernel - Fix memory starvation issue w/tmpfs
Matthew Dillon [Wed, 5 Dec 2012 19:40:01 +0000 (11:40 -0800)]
kernel - Fix memory starvation issue w/tmpfs

* TMPFS relies on the pagedaemon to retire dirty pages to swap.  The normal
  buffer cache flushing won't do the job (nor do we want it to).  To avoid
  starving the system we change bio_page_alloc() to not dig into the
  system reserve when allocating pages for TMPFS.

Reported-by: tuxillo (Antonio Huete)
5 years agoagp: Fix a kernel panic on boot issue
François Tigeot [Wed, 5 Dec 2012 07:19:40 +0000 (08:19 +0100)]
agp: Fix a kernel panic on boot issue

* A pointer wasn't correctly initialized, leading to a
  Fatal trap 12: page fault while in kernel mode panic

* This commit fixes bug report #2467

Tested-by: Eric Christeson, David Shao
5 years agocluster - Stabilization
Matthew Dillon [Wed, 5 Dec 2012 07:32:30 +0000 (23:32 -0800)]
cluster - Stabilization

* Fix disconnect/reconnect sequence for autoconn (/etc/hammer2/autoconn).
  The pipe used to signal termination of the iocom_core() was not supposed
  to be closed by iocom_done().

* The shutdown code now simply sets DMSG_IOCOMF_EOF instead of trying to
  shutdown() the socket.

* Fix double mutex lock in dmsg_msg_alloc().

5 years agocluster - xdisk automatic BIO restart
Matthew Dillon [Wed, 5 Dec 2012 03:24:54 +0000 (19:24 -0800)]
cluster - xdisk automatic BIO restart

* The xdisk driver now detects failed transactions due to failed circuits
  and will restart the BIOs on another circuit or hold onto them until
  connectivity is restored and a new circuit is reforged.

  Automatic restarts only occur if the xa* disk is open()'d (i.e. mounted
  or being accessed by userland).  Kernel disk subsystem probes on attach
  will be failed normally and not stall on lost connectivity.

* subr_diskiocom now reports the correct DMSG error code for failed BIOs
  instead of reporting a kernel error code.

5 years agocluster - misc cleanup
Matthew Dillon [Wed, 5 Dec 2012 03:23:54 +0000 (19:23 -0800)]
cluster - misc cleanup

* Rename the routing function to something that I remember.

5 years agocluster - misc work
Matthew Dillon [Tue, 4 Dec 2012 22:30:42 +0000 (14:30 -0800)]
cluster - misc work

* Use a different API function for state-based reply in the volconf code.