8 hours agopktgen: Allow building w/o INVARIANTS master
Sepherosa Ziehau [Tue, 26 Sep 2017 00:49:01 +0000 (08:49 +0800)]
pktgen: Allow building w/o INVARIANTS

9 hours agoethernet: Restructure vlan check.
Sepherosa Ziehau [Mon, 25 Sep 2017 23:40:33 +0000 (07:40 +0800)]
ethernet: Restructure vlan check.

9 hours agoipflow: Use netisr APIs
Sepherosa Ziehau [Mon, 25 Sep 2017 23:14:06 +0000 (07:14 +0800)]
ipflow: Use netisr APIs

10 hours agoipflow: Remove compat macro
Sepherosa Ziehau [Mon, 25 Sep 2017 22:17:38 +0000 (06:17 +0800)]
ipflow: Remove compat macro

11 hours agosbin/hammer: Use uuid_compare(3) instead of bcmp(3)
Tomohiro Kusumi [Mon, 25 Sep 2017 20:19:48 +0000 (23:19 +0300)]
sbin/hammer: Use uuid_compare(3) instead of bcmp(3)

(missed ones from 118205ce)

11 hours agokcollect - Final dbm support code
Matthew Dillon [Mon, 25 Sep 2017 21:12:59 +0000 (14:12 -0700)]
kcollect - Final dbm support code

* Fix time conversion issues and memory leaks

* Code cleanup

* Documentation cleanup (from swildner)

Submitted-by: htse (Harald Brinkhof)
25 hours agoipflow: Utilize netisr_domsg_global
Sepherosa Ziehau [Mon, 25 Sep 2017 07:25:20 +0000 (15:25 +0800)]
ipflow: Utilize netisr_domsg_global

26 hours agoipflow: Allocate ipflow context on its owner cpu.
Sepherosa Ziehau [Mon, 25 Sep 2017 06:51:41 +0000 (14:51 +0800)]
ipflow: Allocate ipflow context on its owner cpu.

26 hours agoipflow: Use INTWAIT | NULLOK for kmalloc
Sepherosa Ziehau [Mon, 25 Sep 2017 06:02:33 +0000 (14:02 +0800)]
ipflow: Use INTWAIT | NULLOK for kmalloc

26 hours agoipflow: No need to mark it cachealign.
Sepherosa Ziehau [Mon, 25 Sep 2017 06:01:21 +0000 (14:01 +0800)]
ipflow: No need to mark it cachealign.

ipflow is allocated on the owner cpu.

26 hours agoipflow: Remove reference counting, which no longer applies.
Sepherosa Ziehau [Mon, 25 Sep 2017 05:58:25 +0000 (13:58 +0800)]
ipflow: Remove reference counting, which no longer applies.

27 hours agoipflow: Stringent assertion.
Sepherosa Ziehau [Mon, 25 Sep 2017 05:07:02 +0000 (13:07 +0800)]
ipflow: Stringent assertion.

27 hours agoroute: Minor style change.
Sepherosa Ziehau [Mon, 25 Sep 2017 05:05:31 +0000 (13:05 +0800)]
route: Minor style change.

28 hours agopolling: Utilize netisr_domsg_global
Sepherosa Ziehau [Mon, 25 Sep 2017 04:47:22 +0000 (12:47 +0800)]
polling: Utilize netisr_domsg_global

28 hours agopolling: No need to explicitly align io context and systimer context
Sepherosa Ziehau [Mon, 25 Sep 2017 04:41:37 +0000 (12:41 +0800)]
polling: No need to explicitly align io context and systimer context

28 hours agopolling: Adjust comment
Sepherosa Ziehau [Mon, 25 Sep 2017 04:34:07 +0000 (12:34 +0800)]
polling: Adjust comment

31 hours agopolling: Don't do direct input in critical section.
Sepherosa Ziehau [Mon, 25 Sep 2017 00:42:56 +0000 (08:42 +0800)]
polling: Don't do direct input in critical section.

38 hours agokcollect - Add initial dbm support
Matthew Dillon [Sun, 24 Sep 2017 18:17:03 +0000 (11:17 -0700)]
kcollect - Add initial dbm support

* Fully implement the -b and -d options to allow a dbm file to
  be recorded / appended, and played back.

* Still needs a little fleshing out for scaling info and

Submitted-by: htse (Harald Brinkhof)
40 hours agosbin/hammer: Fix strncpy(3) length
Tomohiro Kusumi [Sun, 24 Sep 2017 15:22:28 +0000 (18:22 +0300)]
sbin/hammer: Fix strncpy(3) length

The last one is ok, but HAMMER userspace doesn't use strl variants
except for this, so just use strncpy(3) for better portability.
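
As an illustration of the length handling involved, here is a minimal
sketch of a bounded copy that relies only on strncpy(3).  The helper name
copy_label() is hypothetical and is not taken from the HAMMER sources.

```c
#include <string.h>

/*
 * Copy src into a fixed-size buffer using only strncpy(3).
 * strncpy() does not NUL-terminate when src fills the buffer,
 * so copy at most size - 1 bytes and terminate explicitly.
 * copy_label() is a hypothetical helper, not a HAMMER function.
 */
static void
copy_label(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

Passing sizeof(buf) as the size keeps the length tied to the destination
buffer rather than to the source string.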

44 hours agokernel: Remove no longer used FFS_ROOT option.
Sascha Wildner [Sun, 24 Sep 2017 12:39:15 +0000 (14:39 +0200)]
kernel: Remove no longer used FFS_ROOT option.

Last used in code removed in 8840cec90a57df5e7c0f84c3c3c1e9abea7f2632.

While here, remove some no longer necessary opt_ffs.h #includes.

44 hours ago<machine/stdint.h>: Add __suseconds_t for suseconds_t definitions.
Sascha Wildner [Sun, 24 Sep 2017 11:30:30 +0000 (13:30 +0200)]
<machine/stdint.h>: Add __suseconds_t for suseconds_t definitions.

2 days agopolling: Implement direct input support.
Sepherosa Ziehau [Sat, 23 Sep 2017 03:19:26 +0000 (11:19 +0800)]
polling: Implement direct input support.

When "direct input" is enabled by a driver, the driver's RX polling handler
will run ethernet/ip/tcp processing directly, which avoids cache misses
on the mbufs themselves.  Currently it is enabled on ix(4) by default.

The normal IP forwarding performance is improved by 12%, while the fast
IP forwarding performance is improved by 10%.  13.2Mpps is achieved for
dual side IP forwarding!

1 request/connection HTTP/1.1 performance and avg-latency stay the same,
but the latency is further stabilized:
1K  5.20ms  -> 4.60ms
8K  6.43ms  -> 5.76ms
16K 16.30ms -> 14.90ms

45 hours agosbin/hammer: Cleanup header includes
Tomohiro Kusumi [Sat, 23 Sep 2017 20:06:17 +0000 (23:06 +0300)]
sbin/hammer: Cleanup header includes

47 hours agosys/vfs/hammer: Use kuuid_compare() instead of bcmp()
Tomohiro Kusumi [Sat, 23 Sep 2017 20:43:52 +0000 (23:43 +0300)]
sys/vfs/hammer: Use kuuid_compare() instead of bcmp()

though kuuid_compare() is probably slower than bcmp() in most cases.
It's not a performance-critical part anyway.

2 days agohammer2 - Fix bug in hammer2_chain_indkey_dir()
Matthew Dillon [Sun, 24 Sep 2017 04:12:22 +0000 (21:12 -0700)]
hammer2 - Fix bug in hammer2_chain_indkey_dir()

* The shortcut in hammer2_chain_indkey_dir() ignores the
  possibility that the key breakdown chosen may not result
  in the clearing out of any elements in the parent.  If this
  occurs, an insertion operation following the function
  will assert on too many elements.

* Remove the shortcut.

2 days agoUpdate files for file-5.32 import.
Sascha Wildner [Sat, 23 Sep 2017 19:15:53 +0000 (21:15 +0200)]
Update files for file-5.32 import.

2 days agoMerge branch 'vendor/FILE'
Sascha Wildner [Sat, 23 Sep 2017 19:23:05 +0000 (21:23 +0200)]
Merge branch 'vendor/FILE'

2 days agoRevert "Import file-5.22."
Sascha Wildner [Sat, 23 Sep 2017 19:22:44 +0000 (21:22 +0200)]
Revert "Import file-5.22."

This reverts commit 89a9c80e537ed7238142c9af2cdc03401742a18a.

For some reason the 5.22 upgrade was not git-merged; it looks like it
was copied instead.  This caused merge conflicts with 5.32.

2 days agoImport file-5.32. vendor/FILE
Sascha Wildner [Sat, 23 Sep 2017 19:13:08 +0000 (21:13 +0200)]
Import file-5.32.

See ChangeLog for details.

2 days agomicrouptime.9 microtime.9: Fix documentation of the get* function versions.
Imre Vadász [Sat, 23 Sep 2017 15:04:38 +0000 (17:04 +0200)]
microuptime.9 microtime.9: Fix documentation of the get* function versions.

The kern.timecounter sysctl tree no longer exists; the getmicrotime(),
getnanotime(), getmicrouptime() and getnanouptime() functions always
return the less precise time.

2 days agosbin/newfs_hammer2: Fix typo in newfs_hammer2(8)
Tomohiro Kusumi [Sat, 23 Sep 2017 11:27:20 +0000 (14:27 +0300)]
sbin/newfs_hammer2: Fix typo in newfs_hammer2(8)

of of

2 days agousr.sbin/fstyp: Add initial HAMMER2 support
Tomohiro Kusumi [Fri, 22 Sep 2017 22:17:20 +0000 (01:17 +0300)]
usr.sbin/fstyp: Add initial HAMMER2 support

-l option and multiple/partial volumes are not supported yet.

2 days agosys/vfs/hammer: Add typedef hammer_uuid_t
Tomohiro Kusumi [Thu, 21 Sep 2017 16:06:37 +0000 (19:06 +0300)]
sys/vfs/hammer: Add typedef hammer_uuid_t

Add typedef for uuid_t for better portability,
similar to hammer_crc_t and other hammer_xxx_t.
(Some platforms have char[16] for uuid_t instead of struct value)

No functional changes.

2 days agosbin/hammer: Add uuid.c
Tomohiro Kusumi [Thu, 21 Sep 2017 16:06:16 +0000 (19:06 +0300)]
sbin/hammer: Add uuid.c

Add a simple wrapper over uuid functions for better portability,
similar to sys/vfs/hammer/hammer_crc.h (which helped implement
version 7 CRC).

No functional changes.

2 days agopsm: Drop bpsm%d device files. Instead support non-blocking reads on psm%d.
Imre Vadász [Sat, 23 Sep 2017 11:12:34 +0000 (13:12 +0200)]
psm: Drop bpsm%d device files. Instead support non-blocking reads on psm%d.

The /dev/psm%d vs. /dev/bpsm%d separation doesn't serve any clear purpose
nowadays. Userland can just use fcntl(2) to switch the fd to non-blocking
or blocking mode as needed.
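
A minimal sketch of how a userland program might toggle the mode with
fcntl(2).  The helper name set_nonblocking() is hypothetical; only the
standard F_GETFL/F_SETFL and O_NONBLOCK interfaces are assumed.

```c
#include <fcntl.h>

/*
 * Switch an open descriptor between blocking and non-blocking mode.
 * set_nonblocking() is a hypothetical helper, not part of psm(4).
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
static int
set_nonblocking(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    if (enable)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;
    return (fcntl(fd, F_SETFL, flags) == -1) ? -1 : 0;
}
```

A caller would open the psm device once and call set_nonblocking(fd, 1)
or set_nonblocking(fd, 0) as needed, instead of opening a separate
device node.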

3 days agohammer2 - Fix hammer2 snapshot operation
Matthew Dillon [Fri, 22 Sep 2017 16:27:04 +0000 (09:27 -0700)]
hammer2 - Fix hammer2 snapshot operation

* Bring the hammer2 snapshot code up-to-date with the pfs-create

* Fix initial inode number assignment for hammer2 snapshot code (it
  was starting at 1024 which obviously won't work).

* Correct hammer2_vop_ncreate() error code - it was not converting
  the hammer2 error code to a system error code.

3 days agopsm: Get rid of PSM_LEVEL_NATIVE, and the psmwrite method used with that.
Imre Vadász [Fri, 22 Sep 2017 15:46:45 +0000 (17:46 +0200)]
psm: Get rid of PSM_LEVEL_NATIVE, and the psmwrite method used with that.

* Nothing in userspace ever uses this feature. This apparently was intended
  to allow implementing the complete mouse packet parsing in userspace.

3 days agopsm: Remove dead unused code: psmpoll(), enable_lordless(), is_a_mouse().
Imre Vadász [Fri, 22 Sep 2017 12:21:52 +0000 (14:21 +0200)]
psm: Remove dead unused code: psmpoll(), enable_lordless(), is_a_mouse().

* The is_a_mouse() check method was already disabled in the original
  FreeBSD commit, which added the psm(4) driver
  (git b3062b5d6a9d9631bf6a1612e27107ea9aa6801d ).

4 days agoinet/inet6: Randomize local port
Sepherosa Ziehau [Fri, 22 Sep 2017 01:09:10 +0000 (09:09 +0800)]
inet/inet6: Randomize local port

Because this avoids a lock instruction, it also improves connect(2)
performance a bit.

4 days agoarc4random: Make arc4random context per-cpu.
Sepherosa Ziehau [Thu, 21 Sep 2017 23:35:21 +0000 (07:35 +0800)]
arc4random: Make arc4random context per-cpu.

Critical section is commented out, no consumers from ISRs/ithreads.

4 days agohammer2 - Fix panic related to the accounting for pfs-create
Matthew Dillon [Fri, 22 Sep 2017 05:01:03 +0000 (22:01 -0700)]
hammer2 - Fix panic related to the accounting for pfs-create

* Properly disconnect the inode created by pfs-create from the spmp so it
  can be reassociated with the pmp.

* Do not allow the newly created inode to be emplaced on the spmp's sideq,
  which will cause a duplicate inode structure to be created if the
  pfs is then mounted.

Reported-by: Romick
4 days agohammer2 - Fix flush issues with unmounted PFSs and shutdown panic
Matthew Dillon [Fri, 22 Sep 2017 00:35:56 +0000 (17:35 -0700)]
hammer2 - Fix flush issues with unmounted PFSs and shutdown panic

* Fix flush and shutdown issues when unmounted PFS's are present.
  These PFSs do not get flushed by the filesystem sync code because
  they haven't been mounted, but may still contain modified or
  referenced chains, as well as sideq'd inodes.

* Fix some other cleanup issues when unmounting.  Clean out vchain.pmp
  and fchain.pmp for the spmp during the unmount scan, which fixes a
  hammer2 pfs_memory_*() panic.

Reported-by: yellowrabbit2010
5 days agoarc4random: Minor style changes.
Sepherosa Ziehau [Thu, 21 Sep 2017 07:04:18 +0000 (15:04 +0800)]
arc4random: Minor style changes.

Use uintX_t instead of u_intX_t.

5 days agox86: Use kmem_alloc3 for cpu0's ipiq
Sepherosa Ziehau [Thu, 21 Sep 2017 05:46:41 +0000 (13:46 +0800)]
x86: Use kmem_alloc3 for cpu0's ipiq

5 days agohammer2 - performance pass
Matthew Dillon [Thu, 21 Sep 2017 06:49:51 +0000 (23:49 -0700)]
hammer2 - performance pass

* Get rid of vfs.hammer2.cluster_write and stop using cluster_write()
  for the block device I/O.  This coupled into common unlock/lock
  situations on chains which would acquire and retire the DIO, and
  usually thus also the underlying buffer, many times before it
  really needed to be committed.

  This greatly reduces unnecessary writes to disk.

* Increase HAMMER2_FLUSH_DEPTH_LIMIT to 60.  It was set to 10 for
  debugging purposes.  This created an O(N^2) overhead situation
  in hammer2_flush().  20,000 dirty inodes would translate to
  30 million chain scans, resulting in cpu-bound stalls for long
  periods of time.

  Fixing this value reduces a 20,000 dirty inode flush to around
  200,000 chain scans (100x faster).

* Use hammer2_chain_ref_hold() and hammer2_chain_drop_unhold()
  to reduce the amount of buffer cache buffer cycling that occurs
  during a flush, by retaining the DIO associated with a parent
  chain across its unlock/recurse/relock sequence.

  The number of buffers held locked is limited by the flush recursion

6 days agoipfw: Factor out function to set up local variables.
Sepherosa Ziehau [Wed, 20 Sep 2017 05:40:08 +0000 (13:40 +0800)]
ipfw: Factor out function to set up local variables.

6 days agoipfw: Add ipfrag filter.
Sepherosa Ziehau [Wed, 20 Sep 2017 00:21:58 +0000 (08:21 +0800)]
ipfw: Add ipfrag filter.

Unlike 'frag' filter, which only matches non-first IP fragments,
this filter matches all IP fragments.
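
The difference between the two filters can be sketched in terms of the
IPv4 flags/offset field.  This is an illustration of the semantics
described above, not the ipfw implementation; the EX_* constants and the
function names are hypothetical, and ip_off is assumed to be in host
byte order.

```c
#include <stdint.h>

/* Same bit layout as the IPv4 flags/fragment-offset field. */
#define EX_IP_MF        0x2000  /* "more fragments" flag */
#define EX_IP_OFFMASK   0x1fff  /* fragment offset, in 8-byte units */

/* 'frag': matches only fragments with a non-zero offset (non-first). */
static int
ex_match_frag(uint16_t ip_off)
{
    return (ip_off & EX_IP_OFFMASK) != 0;
}

/* 'ipfrag': matches every fragment, including the first one. */
static int
ex_match_ipfrag(uint16_t ip_off)
{
    return (ip_off & (EX_IP_MF | EX_IP_OFFMASK)) != 0;
}
```

The first fragment has offset 0 with the MF flag set, so only the
second predicate matches it; an unfragmented packet matches neither.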

6 days agoipfw: Remove unnecessary complexity
Sepherosa Ziehau [Wed, 20 Sep 2017 00:13:57 +0000 (08:13 +0800)]
ipfw: Remove unnecessary complexity

6 days agohammer2 - Remove debugging, adjust iocom
Matthew Dillon [Wed, 20 Sep 2017 00:31:03 +0000 (17:31 -0700)]
hammer2 - Remove debugging, adjust iocom

* Call hammer2_iocom_uninit() before we start cleaning up the hmp.

* Remove numerous debug messages.

6 days agokernel - Fix races in kern_dmsg.c (hammer2)
Matthew Dillon [Wed, 20 Sep 2017 00:29:42 +0000 (17:29 -0700)]
kernel - Fix races in kern_dmsg.c (hammer2)

* Fix kdmsg races during shutdown which can assert or panic

* Fixes numerous hammer2 assertions or panics related to unmounting,
  including mount failures due to missing labels.

6 days agokernel - Remove some kdmsg debugging
Matthew Dillon [Tue, 19 Sep 2017 21:20:25 +0000 (14:20 -0700)]
kernel - Remove some kdmsg debugging

* Remove '<blah> thread terminating' kdmsg debug messages.

6 days agokernel - support dummy reallocblks in devfs
Matthew Dillon [Tue, 19 Sep 2017 21:13:57 +0000 (14:13 -0700)]
kernel - support dummy reallocblks in devfs

* cluster_write() calls VOP_REALLOCBLKS() in certain situations.

* Supply a dummy for devfs's .vop_reallocblks to avoid a panic.

Reported-by: tuxillo
6 days agogpt(8): Add HAMMER and HAMMER2 support
François Tigeot [Tue, 19 Sep 2017 20:15:35 +0000 (22:15 +0200)]
gpt(8): Add HAMMER and HAMMER2 support

This makes it possible to create HAMMER or HAMMER2 partitions
with simple commands such as:

  gpt add -t hammer2 /dev/device

6 days agoboot/loader: Fix the 'crc' command to the intended code.
Sascha Wildner [Tue, 19 Sep 2017 18:24:03 +0000 (20:24 +0200)]
boot/loader: Fix the 'crc' command to the intended code.

It doesn't change the result, but fixes a cppcheck warning.

Reported-by: dcb
Fix-submitted-by: Lubos Boucek
Dragonfly-bug:    <https://bugs.dragonflybsd.org/issues/3060>

6 days agosbin/hammer: Use uuid_compare(3) instead of bcmp(3)
Tomohiro Kusumi [Sat, 16 Sep 2017 17:53:35 +0000 (20:53 +0300)]
sbin/hammer: Use uuid_compare(3) instead of bcmp(3)

6 days agosbin/newfs_hammer: Use uuid_create(3) instead of uuidgen(2)
Tomohiro Kusumi [Sat, 16 Sep 2017 16:02:55 +0000 (19:02 +0300)]
sbin/newfs_hammer: Use uuid_create(3) instead of uuidgen(2)

HAMMER userspace uses uuid_create(3) except for this one.
uuidgen(2) syscall isn't part of the specification.

6 days agosbin/newfs_hammer: Use hwarnx() instead of hwarn()
Tomohiro Kusumi [Sat, 16 Sep 2017 12:23:36 +0000 (15:23 +0300)]
sbin/newfs_hammer: Use hwarnx() instead of hwarn()

This one should be with x.

6 days agohammer2(8): Fix printf.
Sascha Wildner [Tue, 19 Sep 2017 14:23:12 +0000 (16:23 +0200)]
hammer2(8): Fix printf.

7 days agoipfw: Add defrag action.
Sepherosa Ziehau [Sat, 16 Sep 2017 06:17:52 +0000 (14:17 +0800)]
ipfw: Add defrag action.

IP fragment reassembly is almost required for a stateful firewall,
and will be needed for in-kernel NAT.

NOTE: Reassembled IP packets will be passed to the next rule for
further evaluation.

6 days agohammer2 - Fix corruption on sync (2)
Matthew Dillon [Tue, 19 Sep 2017 09:07:51 +0000 (02:07 -0700)]
hammer2 - Fix corruption on sync (2)

* Looping on ONFLUSH to call RB_SCAN() can be endless due to deferrals.
  Just do it twice to catch the indirect block maintenance issue.

7 days agohammer2 - Fix corruption on sync, fix excessive stall, optimize sideq
Matthew Dillon [Tue, 19 Sep 2017 08:35:41 +0000 (01:35 -0700)]
hammer2 - Fix corruption on sync, fix excessive stall, optimize sideq

* Fix topology corruption which can occur due to the new
  hammer2_chain_indirect_maintenance() code.  This code can make
  modifications to the parent from inside the flush code itself.
  This can cause the flush code's RB_SCAN() recursion to miss
  mandatory chains during the flush, resulting in some of the
  topology missing from the synchronized flush.

  This bug could cause corruption due to a crash, but not due to
  a normal unmount, shutdown, or reboot, because that code always
  runs extra sync() calls which correct the problem.

  Fix the bug by detecting that UPDATE was again set in the parent
  and run the RB_SCAN() again.

* Fix an excessive stall that can occur in the XOP code due to a
  sleep/wakeup race.  This race could cause a VOP operation to stall
  for 60 seconds (it then hit some failsafe code and continued running

  Fix this issue by removing hammer2_xop_head->check_counter and
  integrating its flagging functions into run_mask.  Increase run_mask
  to 64 bits to accommodate the counter in the upper 32 bits.

* Optimize hammer2_inode_run_sideq().  Avoid running the sideq if the
  number of detached inodes is not significant, except when flushing
  in which case we always want to run the entire sideq.

7 days agohammer2 - augment freemap directive
Matthew Dillon [Tue, 19 Sep 2017 08:34:37 +0000 (01:34 -0700)]
hammer2 - augment freemap directive

* The hammer2 freemap debugging dump now sums up free blocks and
  displays the results, allowing the actual free bytes to be
  compared against df output.

8 days agoUpdate the pciconf(8) database.
Sascha Wildner [Mon, 18 Sep 2017 07:01:58 +0000 (09:01 +0200)]
Update the pciconf(8) database.

September 17, 2017 snapshot from http://pciids.sourceforge.net/

8 days agohammer2 - push missing file (cmd_destroy.c)
Matthew Dillon [Mon, 18 Sep 2017 02:36:14 +0000 (19:36 -0700)]
hammer2 - push missing file (cmd_destroy.c)

* Push missing file for the 'destroy' directive.

8 days agoshm_open(3): Set the FD_CLOEXEC flag for the new fd, per POSIX.
Sascha Wildner [Sun, 17 Sep 2017 16:07:54 +0000 (18:07 +0200)]
shm_open(3): Set the FD_CLOEXEC flag for the new fd, per POSIX.
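
For reference, a minimal sketch of setting the close-on-exec flag on a
descriptor with fcntl(2).  The helper name set_cloexec() is hypothetical
and this is not the libc change itself.

```c
#include <fcntl.h>

/*
 * Mark a descriptor close-on-exec, preserving other descriptor flags.
 * set_cloexec() is a hypothetical helper.
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
static int
set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD, 0);

    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) ? -1 : 0;
}
```

With the flag set, the descriptor is closed automatically across a
successful exec.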



9 days agoip: Don't double check length.
Sepherosa Ziehau [Sat, 16 Sep 2017 23:47:54 +0000 (07:47 +0800)]
ip: Don't double check length.

9 days agodummynet: ip_input expects ip_off/ip_len in network byte order.
Sepherosa Ziehau [Sat, 16 Sep 2017 22:40:50 +0000 (06:40 +0800)]
dummynet: ip_input expects ip_off/ip_len in network byte order.
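
A small sketch of the byte-order conversion this implies, assuming the
two fields are held in host order before being handed back.  The struct
and function names are hypothetical stand-ins, not the dummynet code.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical stand-in for the two IPv4 header fields involved. */
struct ex_ip_fields {
    uint16_t ip_len;    /* total length */
    uint16_t ip_off;    /* flags and fragment offset */
};

/* Convert both fields from host to network byte order in place. */
static void
ex_fields_to_net(struct ex_ip_fields *f)
{
    f->ip_len = htons(f->ip_len);
    f->ip_off = htons(f->ip_off);
}
```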

9 days agoipfw/ipfw3: Use INTWAIT|NULLOK for mtag allocation.
Sepherosa Ziehau [Sat, 16 Sep 2017 22:27:11 +0000 (06:27 +0800)]
ipfw/ipfw3: Use INTWAIT|NULLOK for mtag allocation.

9 days agodummynet: Don't deliver freed mbuf to callers.
Sepherosa Ziehau [Sat, 16 Sep 2017 22:21:05 +0000 (06:21 +0800)]
dummynet: Don't deliver freed mbuf to callers.

9 days agoip: Move mbuf length assertion into an earlier place.
Sepherosa Ziehau [Sat, 16 Sep 2017 22:02:30 +0000 (06:02 +0800)]
ip: Move mbuf length assertion into an earlier place.

Before the mbuf is cast to ip.

9 days agokernel - Order ipfw3 module before other ipfw3_* modules
Matthew Dillon [Sun, 17 Sep 2017 04:50:11 +0000 (21:50 -0700)]
kernel - Order ipfw3 module before other ipfw3_* modules

* Order ipfw3 first, i.e. before any other ipfw3_* modules.  This avoids
  an assertion in the other modules during their init.

Reported-by: shassard (irc)
9 days agohammer2 - Add directive to destroy bad directory entries
Matthew Dillon [Sun, 17 Sep 2017 01:17:16 +0000 (18:17 -0700)]
hammer2 - Add directive to destroy bad directory entries

* Add a directive and ioctl that is capable of destroying bad hammer2
  directory entries.  If topological corruption occurs due to a crash
  (which theoretically shouldn't be possible with HAMMER2), this directive
  allows you to destroy directory entries which do not have working inodes
  and cannot otherwise be destroyed with 'rm'.

* Sysops should only use this directive when absolutely necessary.

10 days agomtag: Use kmalloc flags, instead of just M_WAITOK or M_NOWAIT.
Sepherosa Ziehau [Sat, 16 Sep 2017 06:45:42 +0000 (14:45 +0800)]
mtag: Use kmalloc flags, instead of just M_WAITOK or M_NOWAIT.

This allows more fine-grained mtag allocation control, e.g.

10 days agonetisr: Make dynamic netisr rollup register/unregister MPSAFE.
Sepherosa Ziehau [Sat, 16 Sep 2017 02:54:49 +0000 (10:54 +0800)]
netisr: Make dynamic netisr rollup register/unregister MPSAFE.

10 days agonetisr: Use kmem_alloc3 for netisr thread and netlastfunc.
Sepherosa Ziehau [Sat, 16 Sep 2017 02:07:33 +0000 (10:07 +0800)]
netisr: Use kmem_alloc3 for netisr thread and netlastfunc.

10 days agohammer2 - Fix inode nlinks / directory-entry desynchronization on crash
Matthew Dillon [Fri, 15 Sep 2017 17:21:14 +0000 (10:21 -0700)]
hammer2 - Fix inode nlinks / directory-entry desynchronization on crash

* Hammer2 must flush dirty inodes, buffers, and chains when doing a sync,
  before writing-out the volume header.

* Inodes are flushed in two stages... we flush inodes via vfsyncscan()
  which runs through dirty vnodes, but inodes disassociated from vnodes
  are recorded separately and must also be flushed.  This is handled by

* Fix an ordering bug where hammer2_inode_run_sideq() was being called
  before vfsyncscan() instead of after.  This could result in some dirty
  inodes slipping through the cracks by getting retired by the system
  after the hammer2_inode_run_sideq() call but before vfsyncscan() can
  get to them.

  Fixed by calling hammer2_inode_run_sideq() after vfsyncscan() instead
  of before.

  Note that vnodes cannot normally be dirtied during the serialized portion
  of the flush because the flush serializes against modifying VOPs.  So we
  should not have a second source of desynchronization from that sort of
  activity.  In fact, strategy calls via shared R/W mmap()'s can execute
  concurrent with a flush, but these will have no effect on inode size
  or nlinks.

11 days agotcp: Use primary hash for TCP ports.
Sepherosa Ziehau [Fri, 15 Sep 2017 05:20:39 +0000 (13:20 +0800)]
tcp: Use primary hash for TCP ports.

This fixes the hash aliasing issue, which is caused by port space
division.  It also improves TCP connection establishment performance a bit.

11 days agotcp/udp: Make sure hash size macro is powerof2
Sepherosa Ziehau [Fri, 15 Sep 2017 04:32:41 +0000 (12:32 +0800)]
tcp/udp: Make sure hash size macro is powerof2
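
One way such a check can be expressed at compile time is sketched
below.  The macro and constant names are hypothetical and the hash size
shown is an arbitrary example, not the value used by the kernel.

```c
/* True when x is a power of 2 (x is assumed to be non-zero). */
#define EX_POWEROF2(x)  ((((x) - 1) & (x)) == 0)

/* Hypothetical hash size; the build fails if it is not a power of 2. */
#define EX_HASH_SIZE    1024

_Static_assert(EX_POWEROF2(EX_HASH_SIZE),
    "hash size must be a power of 2");

/* With a power-of-2 size, the bucket index is a simple mask. */
static unsigned
ex_hash_index(unsigned hash)
{
    return hash & (EX_HASH_SIZE - 1);
}
```

The mask in ex_hash_index() only covers every bucket when the size is a
power of 2, which is why the assertion is worth enforcing.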

11 days agohammer2 - Instrument error path for indirect block maintenance
Matthew Dillon [Thu, 14 Sep 2017 20:31:45 +0000 (13:31 -0700)]
hammer2 - Instrument error path for indirect block maintenance

* Instrument error path, fix a crash case when 'chain' cannot be modified
  (usually due to a filesystem full error).  Just complain instead.

* Add some temporary debugging for another possible issue under test.

Reported-by: arcade@b1t.name
11 days agokernel - Fix memory ordering race
Matthew Dillon [Thu, 14 Sep 2017 17:36:23 +0000 (10:36 -0700)]
kernel - Fix memory ordering race

* Fix a race in the mtx wait/wakeup code for situations where the
  releasing thread hands lock ownership to the waiter.  In this
  situation the waiter can sometimes succeed without having to do
  additional atomic ops.  However, this also allows speculative reads
  by the waiting cpu to precede the lock handover.

* Add an mfence to fix this problem.  Add a few cpu_sfence()s (which
  are basically NOPs on Intel) to clarify other bits of code too.

12 days agohammer2 - Remove dead code, clarify comment
Matthew Dillon [Wed, 13 Sep 2017 23:07:27 +0000 (16:07 -0700)]
hammer2 - Remove dead code, clarify comment

* Remove some dead code.

* Clarify the flags passed in to hammer2_chain_getparent() and

12 days agokernel - Fix shared lock bug in kern_mutex.c
Matthew Dillon [Wed, 13 Sep 2017 23:03:19 +0000 (16:03 -0700)]
kernel - Fix shared lock bug in kern_mutex.c

* When the last exclusive lock is unlocked or when downgrading an exclusive
  lock to a shared lock, pending shared links must be processed.  The
  last 'lock count' is transferred to the first link, thus preventing the
  lock from getting ripped out from under the transfer code.

* However, when multiple shared links are pending, it is possible for the
  first recipient link to wakeup and release its lock before the unlock/drop
  code is able to finish its scan, which places the lock in an unexpected
  state.  The lock count was only being incremented within the link scan
  loop, once at a time.

* Fix the problem by applying a lock count representing ALL pending
  shared lock links after the first one before processing the first link.
  This ensures that the lock remains in a shared-lock state while the loop
  is running.

* This fixes a race that can occur in HAMMER2.

12 days agoinstaller - Avoid endless loop for UEFI installations
Antonio Huete Jimenez [Wed, 13 Sep 2017 20:32:12 +0000 (22:32 +0200)]
installer - Avoid endless loop for UEFI installations

- While doing a UEFI installation, if the dialog to write changes to the
  disk was cancelled after selecting the disk, there was no way to get
  back to the previous screen.
- Fix it by going to the select disk state.

12 days agokernel/hammer2: Rename DEBUG to H2_ZLIB_DEBUG in the zlib code.
Sascha Wildner [Wed, 13 Sep 2017 20:23:43 +0000 (22:23 +0200)]
kernel/hammer2: Rename DEBUG to H2_ZLIB_DEBUG in the zlib code.

This unbreaks LINT64, to which hammer2 was added in cf4ab83ee58092c57
without actually having tested it.

There is a DEBUG kernel option that this conflicts with. Also, most of
this code is userland code, not kernel code.

H2's zlib really needs to be cleaned up better.

12 days agohammer2 - Allow simple 'mount @label <target>' shortcut for snapshots
Matthew Dillon [Wed, 13 Sep 2017 17:33:27 +0000 (10:33 -0700)]
hammer2 - Allow simple 'mount @label <target>' shortcut for snapshots

* If any hammer2 PFS on a device is already mounted, all other PFS's on
  the device can be mounted simply by specifying their label.  There is
  no need to specify the device.  e.g.:

  # hammer2 pfs-list /build
  Type        ClusterId (pfs_clid)                 Label
  MASTER      726d8ab1-9839-11e7-98a7-6145cb9ac050 ROOT
  MASTER      726d8a72-9839-11e7-98a7-6145cb9ac050 LOCAL
  SNAPSHOT    eb19b5fa-98a7-11e7-98a7-6145cb9ac050 ROOT.20170913.102102
  # mount @ROOT.20170913.102102 /mnt

12 days agoipfw: WARNS=6 isn't necessary, it's in the parent Makefile.inc.
Sascha Wildner [Wed, 13 Sep 2017 12:10:57 +0000 (14:10 +0200)]
ipfw: WARNS=6 isn't necessary, it's in the parent Makefile.inc.

13 days agoipfw: Raise WARNS to 6
Sepherosa Ziehau [Wed, 13 Sep 2017 01:41:43 +0000 (09:41 +0800)]
ipfw: Raise WARNS to 6

13 days agoipfw: Raise WARNS to 3
Sepherosa Ziehau [Wed, 13 Sep 2017 01:28:18 +0000 (09:28 +0800)]
ipfw: Raise WARNS to 3

13 days agosshlockout: Add ipfw(8) table support.
Sepherosa Ziehau [Wed, 13 Sep 2017 01:07:45 +0000 (09:07 +0800)]
sshlockout: Add ipfw(8) table support.

13 days agokernel - Fix sys% time reporting
Matthew Dillon [Wed, 13 Sep 2017 02:50:47 +0000 (19:50 -0700)]
kernel - Fix sys% time reporting

* Fix system time reporting in systat -vm 1, systat -pv 1, and process

* Basically the issue is that when coincident systimer interrupts occur,
  such as when the statclock, hardclock, and schedclock all fire at the
  same time, the statclock must execute first in order to properly detect
  the state the current thread is in.  If it does not, it may see a lwkt
  thread schedule by one of the other systimers and improper dock the
  current thread as being in 'system' time.

* The various systimer interrupts could wind up out of phase and
  desynchronized due to the tsc_frequency not being perfectly divisible
  by the requested frequencies.  In addition, various timers could queue
  in an undesirable order due to being different integral frequencies of
  each other.

* Refactor the systimer API a bit, adding new functions which guarantee
  synchronization for nominally requested frequencies and which guarantee
  ordering for coincident systimer events (which statclock uses).  This
  should completely solve the problem.

* Also, if the RQF_INTPEND flag is set, count as interrupt time.  This
  will give us a slightly more accurate understanding of interrupt overhead
  (alternatively we could do this test for just the case where curthread is
  the idlethread, which might be more accurate).

13 days agokernel - Change legacy MBR partition type from 0xA5 to 0x6C
Matthew Dillon [Tue, 12 Sep 2017 23:42:08 +0000 (16:42 -0700)]
kernel - Change legacy MBR partition type from 0xA5 to 0x6C

* Should have done this years ago but finally change the legacy MBR
  partition type DragonFlyBSD uses from 0xA5 (which was shared with
  FreeBSD) to something different, 0x6C.

* Makes it less confusing for Grub.

* Does not change EFI boot, which uses 16-byte UUIDs (we already have
  our own) and does not use 8-bit partition ids.

* Boot code and kernel now recognize both 0xA5 and 0x6C.  Existing users
  do *NOT* need to reinstall their boot code.
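
The "recognize both" behaviour can be sketched as a simple type check.
The constant and function names below are hypothetical; only the two
type values come from the commit message.

```c
#include <stdint.h>

#define EX_MBR_TYPE_LEGACY  0xA5    /* old id, shared with FreeBSD */
#define EX_MBR_TYPE_DFLY    0x6C    /* new DragonFly-specific id */

/*
 * Return non-zero if an MBR partition type byte should be treated as
 * a DragonFly slice.  Both the old and the new id are accepted, so
 * existing installations keep working.
 */
static int
ex_is_dragonfly_slice(uint8_t type)
{
    return type == EX_MBR_TYPE_LEGACY || type == EX_MBR_TYPE_DFLY;
}
```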

13 days agomount_udf.8: Correct typo in arguments.
Sascha Wildner [Tue, 12 Sep 2017 19:09:45 +0000 (21:09 +0200)]
mount_udf.8: Correct typo in arguments.

2 weeks agosshlockout: Style changes; no functional changes.
Sepherosa Ziehau [Tue, 12 Sep 2017 07:22:19 +0000 (15:22 +0800)]
sshlockout: Style changes; no functional changes.

2 weeks agoipfw: Add per-cpu table support.
Sepherosa Ziehau [Thu, 7 Sep 2017 00:56:57 +0000 (08:56 +0800)]
ipfw: Add per-cpu table support.

This is intended to improve performance and reduce latency for
matching discrete addresses.  The table itself is a radix tree.

For example, nginx, 1KB web object, 30K concurrent connections,
1 request/connection.  ipfw is running on the server side.

Comparison between no-match rules and no-match table entries:

                   |  perf-avg | lat-avg | lat-stdev | lat-99%
                   |   (tps)   |  (ms)   |   (ms)    |  (ms)
100 nomatch rules  | 184752.65 |   67.50 |      5.69 |   79.11
100 nomatch tblent | 200754.53 |   61.18 |      5.72 |   73.10

1K nomatch rules   |  90836.43 |  144.72 |     12.28 |  168.97
1K nomatch tblent  | 199750.35 |   61.54 |      5.73 |   72.90

10K nomatch rules  |  14836.69 |  864.46 |    157.49 | 1110.00
10K nomatch tblent | 198412.93 |   62.17 |      5.66 |   73.08

Comparison between number of no-match table entries:

                   |  perf-avg | lat-avg | lat-stdev | lat-99%
                   |   (tps)   |  (ms)   |   (ms)    |  (ms)
no-ipfw            | 210658.80 |   58.01 |      5.20 |   68.73
100 nomatch tblent | 200754.53 |   61.18 |      5.72 |   73.10
1K nomatch tblent  | 199750.35 |   61.54 |      5.73 |   72.90
10K nomatch tblent | 198412.93 |   62.17 |      5.66 |   73.08

It scales pretty well with the number of no-match table entries.
Even when compared w/ the no-ipfw case, the performance and latency
impacts of the ipfw after this commit are pretty small.

2 weeks agohammer2 - Add daily periodic for hammer2 cleanup
Matthew Dillon [Tue, 12 Sep 2017 04:26:06 +0000 (21:26 -0700)]
hammer2 - Add daily periodic for hammer2 cleanup

* Add a daily periodic for hammer2 cleanups

2 weeks agoinstaller - Add hammer2 support to the installer
Matthew Dillon [Tue, 12 Sep 2017 00:57:56 +0000 (17:57 -0700)]
installer - Add hammer2 support to the installer

* hammer2 can now be selected as a filesystem in the installer.

* Note that we still force /boot to use UFS.  The boot loader *CAN*
  access a hammer2 /boot, but the small size of the filesystem makes
  it too easy to fill up when doing installkernel or installworld.

* Also fix a minor bug in the installer.  When issuing a 'dumpon device'
  be sure to first issue a 'dumpon off' to avoid dumpon complaints about
  a dump device already being specified.

2 weeks agohammer2 - Include by default in kernel build
Matthew Dillon [Tue, 12 Sep 2017 00:55:22 +0000 (17:55 -0700)]
hammer2 - Include by default in kernel build

* Include hammer2 by default in the kernel build

2 weeks agohammer2 - Add 'cleanup' command, retool h2 build for conf/files inclusion
Matthew Dillon [Tue, 12 Sep 2017 00:53:41 +0000 (17:53 -0700)]
hammer2 - Add 'cleanup' command, retool h2 build for conf/files inclusion

* Add a preliminary 'hammer2 cleanup' command that works similar to

* Retool xxhash and zlib prefixing to avoid kernel conflicts and to
  allow hammer2 to be included in conf/files.

2 weeks agohammer2 - Limit bulkfree cpu and SSD I/O
Matthew Dillon [Mon, 11 Sep 2017 21:46:31 +0000 (14:46 -0700)]
hammer2 - Limit bulkfree cpu and SSD I/O

* Limit resource utilization when running bulkfree.  The default is 5000
  tps (meta-data blocks per second) and can be changed via the
  vfs.hammer2.bulkfree_tps sysctl.

* Designed primarily to limit cpu utilization when meta-data is cached,
  and to limit SSD utilization otherwise.  This feature generally cannot
  be used to limit HDD utilization because it cannot currently distinguish
  between cached and uncached I/O.  Setting a number low enough to
  accommodate an HDD will cause bulkfree to take way too long to run.

2 weeks agokernel - Fix callout_stop/callout_reset rearm race
Matthew Dillon [Mon, 11 Sep 2017 07:11:31 +0000 (00:11 -0700)]
kernel - Fix callout_stop/callout_reset rearm race

* If a callout_reset() occurs while a callout_stop() is running, the
  callout_stop() can wind up blocking forever.  Change the conditional
  to break out of the processing loop to simply wait for the IPI to finish
  executing, and if the callout is still armed due to a callout_reset()
  the callout_stop() simply loops back to the top and retries the stop.

* Can be reproduced when itimers are used heavily (typically ghc processes
  that run during a bulk synth run).

* Race tested and verified to occur, fix appears to solve the problem.