François Tigeot [Wed, 6 Aug 2014 21:08:15 +0000 (23:08 +0200)]
drm/i915: Fix locking in i915_dma.c
Some i915_dma.c functions didn't include locking directives or used
them in a different order than Linux.
Sascha Wildner [Wed, 6 Aug 2014 19:33:13 +0000 (21:33 +0200)]
rc.subr: Add quietstart and pass through arguments to rc scripts.
None of our scripts handle additional arguments yet but it will be
needed for moused starting/stopping via devd(8) and for bluetooth
too.
Taken-from: FreeBSD
François Tigeot [Wed, 6 Aug 2014 13:31:41 +0000 (15:31 +0200)]
linux/err.h: Add IS_ERR_OR_NULL()
Obtained-from: OpenBSD
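For readers unfamiliar with the Linux error-pointer convention, here is a minimal userland sketch of what a helper like IS_ERR_OR_NULL() checks. It assumes the usual Linux scheme in which the top 4095 address values encode negative errno codes; the sketch_ names are hypothetical and this is not the code added by the commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: as on Linux, the last 4095 address values encode -errno. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value stored in an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the reserved error range. */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* True if the pointer is NULL or lies in the reserved error range. */
static inline bool sketch_is_err_or_null(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}
```

A caller would typically test the returned pointer with the combined check before dereferencing it.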
François Tigeot [Wed, 6 Aug 2014 13:30:47 +0000 (15:30 +0200)]
linux/delay.h: Add mdelay()
Obtained-from: OpenBSD
François Tigeot [Wed, 6 Aug 2014 13:29:37 +0000 (15:29 +0200)]
linux/kernel.h: Add WARN_ONCE()
Michael Neumann [Tue, 5 Aug 2014 13:20:12 +0000 (15:20 +0200)]
Fix logic error. Due to the bug, it incorrectly checked TXQ status
which in turn can leave TXQ active.
Obtained From: FreeBSD (r218038
747546166f1055f5d23ef661fbc30e355f1d6fec)
Michael Neumann [Tue, 5 Aug 2014 13:18:20 +0000 (15:18 +0200)]
Fix typo.
Obtained From: FreeBSD
Michael Neumann [Tue, 5 Aug 2014 13:02:51 +0000 (15:02 +0200)]
Correct wrong definition of PM timer mask and adjust L1/PM timer
value. While I'm here enable all clocks before initializing
controller. This change should fix lockup issue seen on AR8152
v1.1 PCIe Fast Ethernet controller.
This fixes: http://bugs.dragonflybsd.org/issues/2625
Obtained-From: FreeBSD (r217649
d7200f3f19ae70c6fc623456c4e7ceda8e6f631d)
Michael Neumann [Tue, 5 Aug 2014 12:46:44 +0000 (14:46 +0200)]
Don't bother to enable ASPM L1 to save more power. Even though I am
not able to trigger the issue with sample boards, some users seem
to suffer from freeze/lockup when system is booted without UTP cable
plugged in. I'm not sure whether this is BIOS issue or controller
bug. This change fixes AR8132 lockup issue seen on EEE PC.
Obtained-From: FreeBSD (r214542
d5b678ca0dc503e5c0ee4dde9d43cb0138def17c)
Michael Neumann [Tue, 5 Aug 2014 11:59:38 +0000 (13:59 +0200)]
status bits should be &'ed against status to be really functional.
Obtained-From: FreeBSD (r212764
d7bdcc10e4df9e7f755be4e1853272956eb39493)
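The log does not show the driver code, so the following is only a generic illustration of the pattern the message describes: a flag constant tested on its own is always true, whereas masking it against the status word tests the actual bit. All names and bit values here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt status bits, for illustration only. */
#define SKETCH_INTR_RX_DONE 0x00000001u
#define SKETCH_INTR_TX_DONE 0x00000002u

/* Buggy pattern: the constant is nonzero, so this is always true. */
static bool sketch_rx_pending_buggy(uint32_t status)
{
	(void)status;
	return SKETCH_INTR_RX_DONE != 0;
}

/* Intended pattern: the bit is masked against the status word. */
static bool sketch_rx_pending(uint32_t status)
{
	return (status & SKETCH_INTR_RX_DONE) != 0;
}
```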
Michael Neumann [Tue, 5 Aug 2014 11:44:55 +0000 (13:44 +0200)]
Make sure to disable RX MAC in alc_stop_mac(). Previously, a logic error caused it to always enable RX MAC.
Obtained-From: FreeBSD (r211285
d05dbde94114fc2033995347b55de74f9bc8e40e)
Matthew Dillon [Tue, 5 Aug 2014 07:06:31 +0000 (00:06 -0700)]
newfs_hammer2 - Set default compression and check modes
* When creating a new filesystem, make sure the directory inodes contain
proper defaults for the compression and crc check mode.
Matthew Dillon [Tue, 5 Aug 2014 07:04:24 +0000 (00:04 -0700)]
hammer2 - Add directives for setting the check mode
* Add setcheck, setcrc32, and a few other directives for setting the
crc check mode on a directory or file. The inode and any new chains
or modifications will inherit the mode.
* Make adjustments to stat and show to display more information.
Matthew Dillon [Tue, 5 Aug 2014 06:58:14 +0000 (23:58 -0700)]
hammer2 - Make the CRC check code programmable
* Add infrastructure to allow a CRC check code method request to be stored
in an inode. The method will be inherited by anything created under the
inode.
* Refactor the check_algo and comp_algo encoding to make the distinction
between the requests in ipdata fields and the actual specification stored
in bref.methods.
* Make sure that the hidden directory inherits the iroot's algorithm
specifications for consistency.
Matthew Dillon [Mon, 4 Aug 2014 08:11:46 +0000 (01:11 -0700)]
hammer2 - freemap and data check code
* Change the freemap to use 8 sets = ~32 reserved blocks instead of 14 sets.
Use a straight iterator. (TODO - must invalidate newer volumes if an older
volume backup is mounted RW because freemap will become inconsistent).
* Update the freemap documentation. Requires more work.
* MODIFY_OPTDATA can only skip to the end if chain->data is NULL in
the BREF_TYPE_DATA case. chain->data might not be NULL if the
data chain represents a compressed device buffer.
* Add check code generation and testing. Currently just kprintf()s a
warning if the check code does not match. Implement iscsi crc-32,
sha-192 (this is sha-256 with the last 64 bits XOR'd into the first),
and implement check generation and testing for all data, meta-data,
and the freemap. For Everything. 64-bit CRCs are not yet implemented.
* Fix a few cases where the DIO or the chain was not being properly
dirtied prior to making modifications to the media contents via
chain->data (this showed up because CRC tests were failing).
* Cleanup:
- CHAIN_FORCECOW is no longer used, remove it.
- Move FLUSH_DEPTH_LIMIT from hammer2.h to hammer2_flush.c.
- Remove debug structures from the blockref check union.
- Remove some #if 0'd code that is no longer applicable.
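The sha-192 description above (sha-256 with the last 64 bits XOR'd into the first) can be sketched as a simple fold of a 32-byte digest into 24 bytes. This assumes "first" means the first 8 bytes of the digest in byte order; the function name is hypothetical and the actual hammer2 code may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SHA256_LEN 32
#define SKETCH_SHA192_LEN 24

/*
 * Fold a 256-bit digest into 192 bits: keep the first 24 bytes and
 * XOR the trailing 8 bytes into the leading 8 bytes.
 */
static void sketch_fold_sha256_to_192(const uint8_t in[SKETCH_SHA256_LEN],
    uint8_t out[SKETCH_SHA192_LEN])
{
	memcpy(out, in, SKETCH_SHA192_LEN);
	for (size_t i = 0; i < SKETCH_SHA256_LEN - SKETCH_SHA192_LEN; i++)
		out[i] ^= in[SKETCH_SHA192_LEN + i];
}
```

The fold keeps every input bit in play while shrinking the stored check code to 192 bits.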
Matthew Dillon [Mon, 4 Aug 2014 08:10:06 +0000 (01:10 -0700)]
hammer2 - add crc checking skeleton
* Check meta-data crcs. If verbose (-v) is specified, also check
data crcs.
Sascha Wildner [Sun, 3 Aug 2014 02:27:51 +0000 (04:27 +0200)]
kernel/acpi_timer: Lower the bar for ACPI-fast on real and virtual machines.
This brings in FreeBSD's revisions 220331-220369:
r220369 | jkim | 2011-04-05 20:40:19 +0200 (Tue, 05 Apr 2011) | 6 lines
Lower the bar for ACPI-fast on real machines slightly. Empirical evidence
shows that there are perfectly working PM timers with occasional "hiccups",
probably because of an SMI. Now we ignore the maximum if it happens once in
the test loop and the width is small enough. Also, relax normal width a bit
to count in a boundary case.
------------------------------------------------------------------------
r220336 | jkim | 2011-04-04 19:44:26 +0200 (Mon, 04 Apr 2011) | 3 lines
Always check the current minimum value to make the test more predictable.
Use INT32_MAX instead of an arbitrary big number for the initial minimum.
------------------------------------------------------------------------
r220333 | jkim | 2011-04-04 19:00:50 +0200 (Mon, 04 Apr 2011) | 5 lines
Lower the bar for ACPI-fast on virtual machines. The current logic depends
on the fact that real hardware has almost fixed cost to read the ACPI timer.
It is virtually always false for hardware emulation and it makes no sense to
read it multiple times, which is already quite expensive for full emulation.
------------------------------------------------------------------------
r220331 | jkim | 2011-04-04 18:47:42 +0200 (Mon, 04 Apr 2011) | 2 lines
Add inline to acpi_timer_read() to reduce unnecessary jumps and calls.
Sascha Wildner [Sun, 3 Aug 2014 01:08:37 +0000 (03:08 +0200)]
kernel: Don't pass the size of the var as arg2 to sysctl_handle_int().
arg1 (second parameter) is for passing a variable and arg2 (third
parameter) is for passing a constant (in which case arg1 is NULL).
Sascha Wildner [Sat, 2 Aug 2014 12:42:46 +0000 (14:42 +0200)]
isp(4): Remove wrong D_DISK and add D_MPSAFE to ops.
Sascha Wildner [Sat, 2 Aug 2014 12:42:25 +0000 (14:42 +0200)]
mpt(4): Add D_MPSAFE to ops.
Sascha Wildner [Sat, 2 Aug 2014 12:27:31 +0000 (14:27 +0200)]
kernel/usched: Make the bootverbose messages a bit more informative.
Talk about which scheduler this is about.
While here, change Sibs -> siblings.
Sascha Wildner [Sat, 2 Aug 2014 11:00:58 +0000 (13:00 +0200)]
kernel: Add D_MPSAFE to the ops of mfi(4), mrsas(4) and twa(4).
I overlooked it when I ported them.
Sascha Wildner [Sat, 2 Aug 2014 10:57:36 +0000 (12:57 +0200)]
mps(4): Add forgotten D_MPSAFE to dev_ops and use callout_init_mp().
Sascha Wildner [Sat, 2 Aug 2014 10:50:07 +0000 (12:50 +0200)]
ciss(4): Add forgotten D_MPSAFE to dev_ops and use callout_init_mp().
Sascha Wildner [Sat, 2 Aug 2014 10:14:31 +0000 (12:14 +0200)]
mps(4): Remove unnecessary assignment (cam_calc_geometry() sets it).
Matthew Dillon [Fri, 1 Aug 2014 06:03:51 +0000 (23:03 -0700)]
hammer2 - Get snapshots working again
* Clean up null-pointer dereference panics and sequencing issues when
creating a snapshot.
* Fix panic on mount if the requested label is not found or is not
mountable.
* Automatically flush the snapshot before taking and automatically flush
the super-root entry before returning.
Matthew Dillon [Fri, 1 Aug 2014 06:01:16 +0000 (23:01 -0700)]
hammer2 - Make snapshot directive more convenient
* Make the snapshot directive more intuitive. The optional arguments
are now (1) the <path> to snapshot and (2) the PFS label to use.
If not specified, the PFS label is named after the PFS the snapshot
is taken from, the last component of the path being snapshotted,
and the date and time.
* pfs-list now takes an optional argument pointing at a mounted hammer2
filesystem. -s <path> still works, it just isn't as intuitive.
Matthew Dillon [Fri, 1 Aug 2014 03:08:30 +0000 (20:08 -0700)]
hammer2 - major simplification 1/many (stabilization C)
* Remove lock-count test from write path where async reads can be
queued. Fixes false assertion.
* Deleted-but-still-open files are moved to a hidden directory, and on
mount a scan is done to remove them. The scan was improperly passing
a NODATA flag when inode data is needed to do a proper stats rollup
during the deletion.
Matthew Dillon [Fri, 1 Aug 2014 00:22:04 +0000 (17:22 -0700)]
hammer2 - Implement meta-data statistics rollup
* HAMMER2 keeps total recursive data and inode count statistics in each
inode. This means that one can determine how much storage is being
used for an entire subdirectory tree simply by doing a 'hammer2 stat <dir>'.
* Implement this by storing temporary rollup adjustments in the hammer2_chain
structure, then synchronizing those adjustments on insertions, deletions,
and flushes.
Generally speaking, the chain structure has a data_count, inode_count,
data_count_up, and inode_count_up for temporary tracking. The main count
fields are applied to the current chain AND the parent, while the *_up
fields are only applied to the parent.
For example, when an inode is inserted its stored statistics must be
applied to the parent (recursively), but not to itself.
* Preliminary implementation.
Matthew Dillon [Fri, 1 Aug 2014 00:14:20 +0000 (17:14 -0700)]
hammer2 - hammer2 stat adjustments
* Report inodes as a count rather than as 'bytes'.
Antonio Huete Jimenez [Thu, 31 Jul 2014 18:27:49 +0000 (20:27 +0200)]
kernel - Rule out vkernels from config hook delays.
Sascha Wildner [Thu, 31 Jul 2014 17:40:43 +0000 (19:40 +0200)]
Remove duplicates in usbdevs, urtwn(4) and devd(8)'s usb.conf.
Matthew Dillon [Thu, 31 Jul 2014 17:35:59 +0000 (10:35 -0700)]
drm/i915 - Fix double lock deadlock
* Fix an incorrect use of wq_lock which was accidentally double-locking instead
of unlocking around a sleep.
* Fixes X lockups overnight in the presence of xscreensaver. 'gears'
seems to trigger it.
Matthew Dillon [Thu, 31 Jul 2014 05:50:19 +0000 (22:50 -0700)]
hammer2 - major simplification 1/many (stabilization B)
* Change hammer2_cluster_bytes() to hammer2_cluster_need_resize()
to check for cluster size mismatches against desired. Used for
data block resizing.
* Fix panic - allow data blocks to have a chain->dio. This will be
the case when compression or other data filters are used.
* Fix null pointer panic - chain->dio can be NULL for data blocks.
* Fix null pointer panic - hlinkp is allowed to be NULL in
hammer2_unlink_file().
* Do not assert if a hardlink target cannot be found. There is a known
bug case when a directory is moved to another part of the topology
where underlying hardlinks can get lost. kprintf() instead.
* Fix inode deadlock, add missing inode unlock in hammer2_hardlink_find().
* Remove OBJTYPE_HARDLINK tests from hammer2_inode_lock_*(). It is no
longer possible for an inode's chain to point to a hardlink pointer,
it will always point to the hardlink target.
* Add some lock count tracking to the VOPs to catch left over locks on
return. (Note that read-ahead operations mess up the lock count because
the shared lock is inherited by the async op, so lock count tracking
is not done in code which handles logical file data).
* Hammer2 survives cpdup, blogbench, fsx, fsstress
Sascha Wildner [Wed, 30 Jul 2014 21:24:51 +0000 (23:24 +0200)]
kernel/usb4bsd: Sync urtwn(4) with current FreeBSD.
This adds support for several (RTL8188EU based) adapters, among other
changes.
It should also fix the hangs we were seeing (using usb_pause_ls() now).
Thanks to Max Herrgaard <herrgaard@gmail.com> for testing it on a
RTL8188EU based adapter and to Christian Koch <cfkoch@sdf.lonestar.org>
for testing it with a RTL8188CU based one (Adafruit USB WiFi).
Matthew Dillon [Wed, 30 Jul 2014 21:16:18 +0000 (14:16 -0700)]
hammer2 - major simplification 1/many (stabilization)
* Remove the extra drop from hammer2_hardlink_consolidate(). It was dropping
cdip in one path but not another. The previous fix for the
hammer2_inode_common_parent() use-case flipped the problem around, but
was otherwise correct (and more sane).
Matthew Dillon [Wed, 30 Jul 2014 20:35:55 +0000 (13:35 -0700)]
hammer2 - major simplification 1/many (stabilization)
* Fix a dirty chain leak due to detached inodes and the delayed vnode
deactivation that DragonFly does. A cache_unlink() call was missing
to properly cycle the vnode in the nrename path and a
hammer2_cluster_delete() needed the DELETE_PERMANENT flag to handle the
case where the vnode was already detached.
* Fix an inode reference count leak, callers of hammer2_inode_common_parent()
were not properly dropping the returned inode.
* Fix a deadlock due to front-end vs write-thread interactions. nvtruncbuf()
calls must not be made with an inode lock held.
* Cleanup some debugging, add some debugging.
Antonio Huete Jimenez [Wed, 30 Jul 2014 19:01:43 +0000 (21:01 +0200)]
hammer - Fix max volumes check on mount time
Antonio Huete Jimenez [Tue, 8 Jul 2014 17:27:56 +0000 (19:27 +0200)]
share/examples - Fix cdev warnings
Matthew Dillon [Wed, 30 Jul 2014 07:17:29 +0000 (00:17 -0700)]
hammer2 - major simplification of algorithms part 1/many
* Huge simplification of in-memory data structures and algorithms.
Remove delete-duplicate, ownerq (shadow copies), dbq, dbtree, and most of
the xid lo/hi sequencing. Remove all the complexities related to
managing the above elements. Net removal of ~1500 lines of code or so.
* Blockmap deletions are now handled by the frontend, so the backend doesn't
need to deal with shadowed deletions. This is still fairly optimal since
insertions are still handled by the backend during flushes. So for quick
create/delete operations the blockmap is never even initialized which means
that deletions don't have to remove anything.
* Cleanup buffer cache on file removal / last-close, but allow file delete
to simply wipe out the inode. Don't bother iterating its indirect blocks
or data blocks on-media but use the flush code to get rid of any chains
still cached.
* Buffer invalidation on permanent chain deletions for modified chains.
* Major items still TODO: flush interlocks and meta-data updates.
Sascha Wildner [Tue, 29 Jul 2014 20:45:27 +0000 (22:45 +0200)]
"Normalize" some types, s/long unsigned/unsigned long/ etc.
Just like the rest of our tree is doing it.
Sascha Wildner [Tue, 29 Jul 2014 19:55:02 +0000 (21:55 +0200)]
kernel: Completely remove the obsolete DEVICE_POLLING and SMP options.
DEVICE_POLLING is IFPOLL_ENABLE and SMP has been the default for some time
now.
Sascha Wildner [Tue, 29 Jul 2014 19:45:17 +0000 (21:45 +0200)]
kernel: Remove unused and unbuilt code from the userland sysvipc GSoC.
In-discussion-with: profmakx
Sascha Wildner [Tue, 29 Jul 2014 19:36:23 +0000 (21:36 +0200)]
kernel: Make sysvipc syscalls non-optional.
Before this commit, we had three related kernel options, SYSVMSG,
SYSVSEM and SYSVSHM, to enable the syscalls. They were in all our
configs, but in theory the user could disable the functionality.
Having to deal with scenarios where they are not available is
unnecessarily complicated and there seems to be no real reason to
want to disable them.
For convenience, leave the three options as no-ops for now, so
adjusting the kernel config is not necessarily needed. We'll
change them to being unknown at some later point.
This commit also removes some parts which assumed that we had
sysvmsg.ko, sysvsem.ko and sysvshm.ko modules, like FreeBSD, but
this assumption was never true on DragonFly.
Markus Pfeiffer [Tue, 29 Jul 2014 18:52:40 +0000 (18:52 +0000)]
usb4bsd: set D_MPSAFE for usb devices (static)
Markus Pfeiffer [Tue, 29 Jul 2014 18:12:13 +0000 (18:12 +0000)]
usb4bsd: set D_MPSAFE for usb devices
Nuno Antunes [Sun, 27 Jul 2014 06:39:07 +0000 (07:39 +0100)]
kernel/netisr: Use __func__ in kprintfs.
Sascha Wildner [Sat, 26 Jul 2014 09:37:51 +0000 (11:37 +0200)]
Sync ACPICA with Intel's version 20140724.
* ACPI 5.1 is fully supported in ACPICA as of this release.
* Better handling of GPEs with no associated handler or control message.
* Timer() support in the AML Debug object.
* New -u option in acpihelp(8).
* Bug fixes & other enhancements.
For a more detailed list, please see sys/contrib/dev/acpica/changes.txt.
Sascha Wildner [Sat, 26 Jul 2014 07:53:01 +0000 (09:53 +0200)]
acpica: Exclude nsdumpdv.c, it's obsolete & its code is #ifdef'd out.
François Tigeot [Fri, 25 Jul 2014 06:18:59 +0000 (08:18 +0200)]
drm/i915: Sync intel_ringbuffer.c with Linux 3.8.13
* Preallocate next seqno before touching the ring
* Rearrange code to only have a single method for waiting upon the ring
* Don't allow ring tail to reach the same cacheline as head
* Implement workaround for broken CS tlb on i830/845
François Tigeot [Fri, 25 Jul 2014 06:15:23 +0000 (08:15 +0200)]
drm/i915: Reduce differences with Linux 3.8.13
Mostly in GEM code
Alex Hornung [Fri, 25 Jul 2014 06:05:05 +0000 (07:05 +0100)]
csprng - Add copyright & comment around sleep code
Alex Hornung [Thu, 24 Jul 2014 20:57:39 +0000 (21:57 +0100)]
csprng - fix unused variable
Alex Hornung [Thu, 24 Jul 2014 20:53:33 +0000 (21:53 +0100)]
csprng - don't wait for entropy for the ratectl'ed reseed
Imre Vadasz [Thu, 24 Jul 2014 18:19:39 +0000 (20:19 +0200)]
kernel/rum: Fix TX rate control. Use usb_pause_ls instead of zsleep.
* Fix TX rate control by interpreting the TX statistic counters correctly.
Taken-From: OpenBSD
* Using usb_pause_ls instead of zsleep seems to avoid deadlocks.
Matthew Dillon [Thu, 24 Jul 2014 18:56:16 +0000 (11:56 -0700)]
kernel - Fix jumbo cluster buffer deadlock
* mbufjcluster_cache and mbufphdr_jcluster_cache did not have
a nominal maintenance number set, which causes objcache to
default to (cluster_limit / 2). Both of these caches are fed
from mjclmeta_cache. The default maintenance value combined
for these two allows mjclmeta_cache to become completely exhausted.
The exhaustion results in an edge case when combined with the per-cpu
caches which can deadlock the mjclmeta_cache. The other mbuf caches
do not have this problem because they specify maintenance divisors
of at least 4.
* Implement kern.ipc.mjclph_cachefrac and kern.ipc.mjcl_cachefrac to
force the two jcluster caches to return more buffers to mjclmeta_cache.
Default to 4 and 16.
* Force all cachefrac values for all mbuf caches to not be less than 3
to prevent sysop foot-shooting.
* Also set a fixed cachefrac of 4 for mbuf_cache, mclmeta_cache, and
mjclmeta_cache. The default in objcache of 2 (aka 1/2) is overkill.
(this change is subject to review from Sephe).
Reported-by: joris
Alex Hornung [Thu, 24 Jul 2014 19:41:03 +0000 (20:41 +0100)]
csprng - If not enough entropy is available, sleep
* If no reseed has happened yet, or if we were unsuccessful in
reseeding the prng, sleep and try again whenever a reseed
occurred or entropy has been added to the pools.
Reported-by: YONETANI
Sascha Wildner [Wed, 23 Jul 2014 20:48:43 +0000 (22:48 +0200)]
kernel/csprng: Compile in the SHA256_*() functions by default.
The (non-optional) CSPRNG needs them so make sha2.c "standard" too.
Reported-by: Studbolt
Matthew Dillon [Wed, 23 Jul 2014 01:52:47 +0000 (18:52 -0700)]
kernel - Redo struct vmspace allocator and ref-count handling.
* Get rid of the sysref-based allocator and ref-count handler and
replace with objcache. Replace all sysref API calls in other kernel
modules with vmspace_*() API calls (adding new API calls as needed).
* Roll-our-own hopefully safer ref-count handling. We get rid of exitingcnt
and instead just leave holdcnt bumped during the exit/reap sequence. We
add vm_refcnt and redo vm_holdcnt.
Now a formal reference (vm_refcnt) is ALSO covered by a holdcnt. Stage-1
termination occurs when vm_refcnt transitions from 1->0. Stage-2 termination
occurs when vm_holdcnt transitions from 1->0.
* Should fix rare reported panic under heavy load.
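The two-stage termination described above can be sketched in userland with C11 atomics. This is only an illustration under the stated rule that every formal reference is also covered by a hold; the struct, field, and function names are hypothetical and the kernel's vmspace_*() API is not reproduced here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vmspace. */
struct sketch_vmspace {
	atomic_int refcnt;	/* formal references */
	atomic_int holdcnt;	/* holds, including one per formal reference */
	bool stage1_done;	/* set when refcnt goes 1->0 */
	bool stage2_done;	/* set when holdcnt goes 1->0 */
};

static void sketch_vm_init(struct sketch_vmspace *vm)
{
	atomic_init(&vm->refcnt, 0);
	atomic_init(&vm->holdcnt, 0);
	vm->stage1_done = false;
	vm->stage2_done = false;
}

/* Take a hold only: keeps the structure from being freed. */
static void sketch_vm_hold(struct sketch_vmspace *vm)
{
	atomic_fetch_add(&vm->holdcnt, 1);
}

/* Drop a hold; the 1->0 transition triggers stage-2 termination. */
static void sketch_vm_drop(struct sketch_vmspace *vm)
{
	if (atomic_fetch_sub(&vm->holdcnt, 1) == 1)
		vm->stage2_done = true;
}

/* Take a formal reference, which is also covered by a hold. */
static void sketch_vm_ref(struct sketch_vmspace *vm)
{
	sketch_vm_hold(vm);
	atomic_fetch_add(&vm->refcnt, 1);
}

/* Release a formal reference; the 1->0 transition triggers stage 1. */
static void sketch_vm_rel(struct sketch_vmspace *vm)
{
	if (atomic_fetch_sub(&vm->refcnt, 1) == 1)
		vm->stage1_done = true;
	sketch_vm_drop(vm);
}
```

Because the hold backing a reference is dropped only after the reference itself, stage 2 can never run before stage 1.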
Michael Neumann [Wed, 23 Jul 2014 00:12:28 +0000 (02:12 +0200)]
Document that tcb_segstack should not be reordered.
Michael Neumann [Tue, 22 Jul 2014 23:27:48 +0000 (01:27 +0200)]
Add field to tls_tcb to support segmented stacks in LLVM
When segmented stack support is enabled, LLVM adds code in front of
every function to check if the stack is already exhausted, in which
case it calls __morestack. For this reason LLVM needs to know the lower
boundary of the stack to check against the stack pointer.
The stack boundary can be stored in this per-thread field (tcb_segstack)
and accessed via %fs:32 (x86_64) or %fs:16 (i386) from the code generated
by LLVM.
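The offsets quoted above (%fs:32 on x86_64, %fs:16 on i386) both correspond to four pointer-sized slots preceding the field, which is why the field must not be reordered. A minimal sketch of that layout constraint, using a hypothetical struct whose leading members are placeholders rather than the real tls_tcb fields:

```c
#include <stddef.h>

/* Hypothetical layout: four pointer-sized slots, then the stack limit. */
struct sketch_tls_tcb {
	void *tcb_before[4];	/* placeholder for the preceding fields */
	void *tcb_segstack;	/* lower stack boundary read by generated code */
};

/* 4 * 8 = 32 on x86_64 and 4 * 4 = 16 on i386. */
_Static_assert(offsetof(struct sketch_tls_tcb, tcb_segstack) ==
    4 * sizeof(void *), "tcb_segstack must stay at a fixed offset");

/* Offset that compiler-generated code would hardcode for this layout. */
static size_t sketch_segstack_offset(void)
{
	return offsetof(struct sketch_tls_tcb, tcb_segstack);
}
```

A compile-time assertion of this kind is one way to catch an accidental reordering early.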
Sascha Wildner [Tue, 22 Jul 2014 16:35:34 +0000 (18:35 +0200)]
kernel: Use NELEM() in a number of places.
Sascha Wildner [Tue, 22 Jul 2014 08:07:45 +0000 (10:07 +0200)]
kernel/virtio: Remove a useless #ifndef (CSUM_TSO is defined there).
François Tigeot [Mon, 21 Jul 2014 13:00:54 +0000 (15:00 +0200)]
drm/i915: Use a common fence writing routine
François Tigeot [Mon, 21 Jul 2014 09:11:52 +0000 (11:11 +0200)]
i915_gem.c: Simplify fence code
* Remove fence pipelining, it caused many spurious GPU hangs and could
never be made to work reliably
* Simplify fence finding
* Remove a useless optimisation from flush_fence()
* Remove a few now useless struct members and associated code
François Tigeot [Sun, 20 Jul 2014 18:56:44 +0000 (20:56 +0200)]
drm: Add Linux wake_up() and wait_event()
François Tigeot [Sun, 20 Jul 2014 18:21:25 +0000 (20:21 +0200)]
drm: Remove a no longer used kmalloc type
Matthew Dillon [Sun, 20 Jul 2014 18:04:18 +0000 (11:04 -0700)]
kernel - Fix error handling in NFS async bio callbacks
* The NFS request may already have an error code set as-of when the
callback occurs. Check the code before trying to decode the possibly
non-existent reply rpc.
John Marino [Sun, 20 Jul 2014 11:31:35 +0000 (13:31 +0200)]
unbreak kernel (netgraph) by adding missing header inclusion
Nuno Antunes [Fri, 18 Jul 2014 14:12:52 +0000 (15:12 +0100)]
Use system's RT_ROUNDUP and RT_ADVANCE macros instead of local copies.
Reviewed-by: dillon
Nuno Antunes [Fri, 18 Jul 2014 10:16:10 +0000 (11:16 +0100)]
net/route.h: Expose the ROUNDUP and ADVANCE macros.
* These macros are replicated in multiple places on the tree. Give
them an RT_ prefix and centralize them in net/route.h in an effort
to reduce code duplication.
* Kernel and userland changes to use these macros will come in a
subsequent commit.
Taken-from: NetBSD
Reviewed-by: dillon
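For context, the local ROUNDUP and ADVANCE copies traditionally found in BSD routing code look roughly like the sketch below: round a sockaddr length up to a multiple of sizeof(long), and step a cursor past one sockaddr. The SKETCH_ names and the small sockaddr stand-in are hypothetical, and the exact definitions in net/route.h may differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct sockaddr with a length byte. */
struct sketch_sockaddr {
	unsigned char sa_len;
	unsigned char sa_family;
	char sa_data[14];
};

/* Round a up to a multiple of sizeof(long); 0 rounds to sizeof(long). */
#define SKETCH_RT_ROUNDUP(a) \
	((a) > 0 ? (1 + (((a) - 1) | (sizeof(long) - 1))) : sizeof(long))

/* Advance cursor x past the (padded) sockaddr pointed to by n. */
#define SKETCH_RT_ADVANCE(x, n) \
	((x) += SKETCH_RT_ROUNDUP((n)->sa_len))
```

Routing messages pack several sockaddrs back to back with this padding, so a parser applies the advance step once per address.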
Nuno Antunes [Thu, 17 Jul 2014 06:51:24 +0000 (07:51 +0100)]
netgraph7: Assert the refcount is zero when freeing the item.
Matthew Dillon [Sat, 19 Jul 2014 17:23:41 +0000 (10:23 -0700)]
kernel - Revert "Fix buildworld."
* Fix PF in a different way, by conditionalizing the inclusion of
struct pf_state instead of conditionalizing all the use cases for
pfvar.h.
* This reverts commit
56e2aaa4d1de560d06f713866ab834747982f839.
* Reorders pfvar.h a bit and conditionalizes struct pf_state { }.
François Tigeot [Sat, 19 Jul 2014 10:02:00 +0000 (12:02 +0200)]
re(4): Use MPSAFE callout
The callout function was already protected by a serializer.
Imre Vadasz [Sat, 19 Jul 2014 09:54:27 +0000 (11:54 +0200)]
mii: Add RealTek RTL8251 phy found on an ASUS A88XM-Plus mainboard.
Taken-From: OpenBSD
François Tigeot [Sat, 19 Jul 2014 07:54:07 +0000 (09:54 +0200)]
drm(4): This device is MPSAFE
* And has always been since the initial import from FreeBSD 11 years ago.
* Tested with Radeon and i915 hardware for good measure.
Sascha Wildner [Fri, 18 Jul 2014 18:49:02 +0000 (20:49 +0200)]
kernel/sym: Remove an extra semicolon in a #define.
Sascha Wildner [Fri, 18 Jul 2014 17:32:57 +0000 (19:32 +0200)]
kernel: Switch to mrsas(4) as the default for 'Thunderbird' series cards.
Matthew Dillon [Fri, 18 Jul 2014 16:32:46 +0000 (09:32 -0700)]
kernel - Adjust ssb_space_prealloc() use cases
* Add two flags to the signalsockbuf ssb_flags field.
SSB_PREALLOC - Indicates that data preallocation tracking is being used
SSB_STOPSUPP - Indicates that SSB_STOP flow control is being used
* unix domain sockets set SSB_STOPSUPP, tcp and sctp sockets
set SSB_PREALLOC.
* sendfile() requires that either SSB_PREALLOC or SSB_STOPSUPP be specified.
* Code now conditionalizes the use of ssb_space() vs ssb_space_prealloc()
based on the presence of the SSB_PREALLOC flag.
Reported-by: sephe
Sepherosa Ziehau [Fri, 18 Jul 2014 12:00:24 +0000 (20:00 +0800)]
tcp: Set upper limit for the DupThresh generated by the NCR
The DupThresh could be pretty large due to large amount of outstanding
segments on the fast local area network. If the reception side really
lost some segments, the fast recovery would be delayed for a long time.
It would become even worse, if the reception side aggregated ACKs, i.e.
widely used LRO; it could even cause timeout retransmission, which is
highly unappreciated on the fast local area network. Put an upper
limit for the DupThresh, currently 16, so that fast recovery could take
over segment retransmission in a timely fashion. The upper limit of
DupThresh could be controlled by sysctl net.inet.tcp.ncr_rxtthresh_max.
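The upper limit amounts to a simple clamp on whatever threshold NCR computes. A minimal sketch, with a hypothetical function name and the default of 16 taken from the message above; how the unclamped value is derived is not shown here:

```c
/* Default upper limit mentioned in the commit message. */
#define SKETCH_NCR_RXTTHRESH_MAX 16

/* Clamp a computed duplicate-ACK threshold to the configured maximum. */
static int sketch_clamp_dupthresh(int computed, int max_thresh)
{
	return (computed > max_thresh) ? max_thresh : computed;
}
```

With many segments outstanding the computed value can be large, and the clamp keeps fast recovery from waiting for that many duplicate ACKs.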
Matthew Dillon [Fri, 18 Jul 2014 06:52:54 +0000 (23:52 -0700)]
kernel - network adjustments (netisr, tcp, and socket buffer changes)
* Change sowakeup() to use an atomic fetch when testing WAIT/WAKEUP for
a quick return. It is now coded properly. Previous coding is not known
to have created any bugs.
* Change sowakeup() to use ssb_space_prealloc() instead of ssb_space()
when testing against the transmit low-water mark. This is a bug fix
which primarily affects very tiny write()'s. The prior code is not
known to have created any problems.
* Make the netisr packet count before doing a rollup programmable and
change the default from 512 to 32 for the moment. This may be changed
back to 512 (or some number in between) after further testing.
The issue here is that interrupt/netisr pipelining can cause ack aggregation
to be delayed for too many packets.
* For TCP, when timestamps are not being used, pass the correct delta
to tcp_xmit_timer() in our fallback. The function expects N+1. This
should improve/fix incorrect rtt calculations when tcp timestamps are
not in use.
* Fix an edge case in tcp_xmit_bandwidth_limit() where the 'ticks' global
could change values out from under the code. Load the global into a local
variable.
* Change the inflight code to use (t_srtt + t_rttvar) instead of
(t_srtt + t_rttbest) / 2.
This needs fine-tuning, the buffer is still too big. Expect more commits
later.
* Call sowwakeup() when appending a mbuf to a stream. The append can call
sbcompress() and make a stream buffer that has hit its mbuf limit writable
again.
* Remove the ssb_notify() macro and collapse the sorwakeup() and sowwakeup()
macros. They now just call sowakeup() on the appropriate sockbuf. The
notify test is now done in sowakeup().
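The 'ticks' fix in the list above follows a common pattern: read a global that can change asynchronously exactly once, then use the local copy for both the test and the update. A generic sketch, with hypothetical names and not the actual tcp_xmit_bandwidth_limit() code:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's global tick counter. */
static volatile int sketch_ticks;

/*
 * Return true and update *last_stamp if at least `interval` ticks have
 * passed. The global is loaded once so the value tested and the value
 * stored cannot differ.
 */
static bool sketch_interval_elapsed(int *last_stamp, int interval)
{
	int now = sketch_ticks;	/* single read of the global */

	if (now - *last_stamp < interval)
		return false;
	*last_stamp = now;	/* same value that was tested */
	return true;
}
```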
Matthew Dillon [Fri, 18 Jul 2014 04:33:32 +0000 (21:33 -0700)]
kernel - turn off auto-socket sizing
* Turn off automatic socket sizing for NFS sockets. Otherwise the socket
buffer might be reduced to the point where the mbuf interface refuses
to queue w/EMSGSIZE.
TODO: We need a better fix.
Matthew Dillon [Fri, 18 Jul 2014 03:54:42 +0000 (20:54 -0700)]
kernel - Fix two NFS crashes
* Fix a bug during unmount when sillyrenames are being terminated.
When doing a forced unmount, the sillyrename vnode(s) may be VBAD.
Do not attempt to flush the sillyrename in this case.
* Fix a bug for 'soft' mounts. Soft failures do not properly set the
error code which can lead to a NULL pointer dereference in the rpc
processing code.
Set the error code to EINTR for soft mounts whose retries have been
exceeded.
Matthew Dillon [Thu, 17 Jul 2014 23:03:13 +0000 (16:03 -0700)]
kernel - Move wakeup*() to outside a spin lock
* Move the wakeup*() calls in the linux completion interface from inside
to outside the spinlock. It can't be safely called from inside the
spinlock.
Reported-by: me_
Zach Crownover [Thu, 17 Jul 2014 11:26:11 +0000 (04:26 -0700)]
Added support for rcreload
Updated the man page date and links to account for the new symlink to
rcrun as well as add it in to the rcrun.sh based on the restart entry.
Matthew Dillon [Thu, 17 Jul 2014 05:17:19 +0000 (22:17 -0700)]
kernel - minor cpu idle statistics adjustment
* Change the idlethread test from RQF_AST_LWKT_RESCHED to
RQF_IDLECHECK_WK_MASK (which includes the first flag and adds a few more)
to determine if the idle thread is actually idle or not.
* Should not materially change reported idle% as the original test handled
the most common idle-thread-skips-halt case.
Nuno Antunes [Thu, 17 Jul 2014 03:10:06 +0000 (04:10 +0100)]
msgport.9: lwkt_initport_spin now takes a fixed_cpuid argument.
François Tigeot [Wed, 16 Jul 2014 19:52:17 +0000 (21:52 +0200)]
drm/i915: Sync intel_sprite.c with Linux 3.8.13
Matthew Dillon [Wed, 16 Jul 2014 07:07:58 +0000 (00:07 -0700)]
kernel - Add feature to allow sendbuf_auto to decrease the buffer size
* sysctl net.inet.tcp.sendbuf_auto (defaults to 1) is now able to
decrease the tcp buffer size as well as increase it.
* Inflight bwnd data is used to determine how much to decrease the
buffer. Inflight is enabled by default. If you disable it
with (net.inet.tcp.inflight_enable=0), sendbuf_auto will not
be able to adjust buffer sizes down.
* Set net.inet.tcp.sendbuf_min (default 32768) to set the floor for
any downward adjustment.
* Set net.inet.tcp.sendbuf_auto=2 to disable the decrease feature.
Nuno Antunes [Tue, 15 Jul 2014 02:16:18 +0000 (03:16 +0100)]
netgraph7: Factor out and inline item reference counting code.
* Netgraph7 assumes that nodes synchronously consume the items passed to them,
i.e. either 1) immediately drop the item or 2) immediately pass the item to the
next node.
The previous assumption is not true for nodes that have their own internal
item queues and defer the processing of the item. Such nodes can use these
routines to prevent the items from being freed too early.
* Move the apply callback check into the item reference release code.
Matthew Dillon [Wed, 16 Jul 2014 03:27:51 +0000 (20:27 -0700)]
kernel - Improve TCP socket handling at high speeds
* Add M_SOLOCKED to mbuf->m_flags. This flag prevents sbcompress()
from collapsing more data into a mbuf.
* Rewrite sorecvtcp() (NOTE: soreceive() could use similar treatment).
Use M_SOLOCKED to freeze mbufs in the sockbuf with the rcvtok held,
then do the uiomove() loop WITHOUT the rcvtok held, then finalize
the disposal of the mbufs with rcvtok held.
This greatly reduces contention on rcvtok against the netisr threads
when reading large amounts of data at once and reduces cpu overhead
for netisr and user network threads.
* Change the default transmit ssb_lowat from ssb_hiwat / 2 to ssb_hiwat / 4.
The (previous) default maximum socket buffer size was 256KB. The default
lowat reduced the effective TCP transmit window to ~100KB. This can cause
severe buffering issues on GigE links when multiple TCP streams are being
routed to the same cpu.
With this change the default max send buffer is ~180KB or so.
* Change the default kern.ipc.maxsockbuf from 256KB to 512KB. This
primarily affects auto-sizing of tcp buffers which in turn affects
most TCP connections.
This coupled with the hiwat fix greatly improves transmit throughput.
* Add more debugging info to the tcp inflight code.
François Tigeot [Tue, 15 Jul 2014 20:08:02 +0000 (22:08 +0200)]
drm/i915: Sync ringbuffer code with Linux 3.8.13
* Split hardware initialization and irq management to model-specific
functions
* Various little fixes and workarounds to compensate for hardware
bugs and irregular behavior
* Enable parity error interrupts
* Simplify flushing and request tracking
François Tigeot [Tue, 15 Jul 2014 20:02:10 +0000 (22:02 +0200)]
drm: Fix locking issues in drm_irq.c
* Some functions expected the drm lock to be used differently than what
gpu drivers really did, leading to crashes
* Sync them with Linux 3.8.13
Reported-by: Johannes Hofmann
Matthew Dillon [Tue, 15 Jul 2014 19:31:50 +0000 (12:31 -0700)]
kernel - Add safety for Intel SYSRET issue
* First, insofar as we can tell DragonFly was *NOT* vulnerable to the
Intel SYSRET issue. We have a RQF_QUICKRET flag that determines if SYSRET
can be used. Any heavy weight process switch, signal delivery, signal
return, or set_regs() call clears this flag and forces the system call to
return via IRET.
* However, the ptrace() path is a bit convoluted. Insofar as I can tell
it just won't allow %rip to be changed unless the target process is in
a SSTOPped state, meaning that a heavy weight context switch must occur
before the new %rip is used which means we should be safe.
Still, we are adding a safety to ptrace_set_pc() to canonicalize the
%rip anyway, to ensure that this bug cannot bite us indirectly in the
future.
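Canonicalizing an x86_64 address means making bits 63..48 copies of bit 47. A minimal sketch of that operation, assuming 48-bit virtual addresses; the function name is hypothetical and this is not the code added to ptrace_set_pc():

```c
#include <stdint.h>

/*
 * Force an address into canonical form for 48-bit virtual addressing:
 * the upper 16 bits are set to match bit 47.
 */
static uint64_t sketch_canonicalize(uint64_t addr)
{
	const uint64_t high_mask = 0xFFFF000000000000ULL;

	if (addr & (1ULL << 47))
		return addr | high_mask;	/* upper half: sign-extend ones */
	return addr & ~high_mask;		/* lower half: clear high bits */
}
```

An already canonical address passes through unchanged, so applying the step unconditionally is harmless.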
François Tigeot [Tue, 15 Jul 2014 16:20:56 +0000 (18:20 +0200)]
drm: Reorder functions in drm_irq.c
* Reducing differences with Linux 3.8.13
* No functional change
Sascha Wildner [Tue, 15 Jul 2014 09:15:47 +0000 (11:15 +0200)]
Update the pciconf(8) database.
July 14, 2014 snapshot from http://pciids.sourceforge.net/
Sascha Wildner [Tue, 15 Jul 2014 09:03:03 +0000 (11:03 +0200)]
<sys/protosw.h>: Use netmsg_t.
Sascha Wildner [Tue, 15 Jul 2014 08:42:17 +0000 (10:42 +0200)]
kernel/netgraph7: Use kprintf etc. directly instead of defining printf.
While here, remove some commented out code from dragonfly.h
In-discussion-with: nant
Sascha Wildner [Tue, 15 Jul 2014 08:09:00 +0000 (10:09 +0200)]
kernel/netgraph: Don't grab the tty_token around ldisc_{,de}register().
The functions already grab it themselves.
Pointed-out-by: nant
Sascha Wildner [Tue, 15 Jul 2014 07:03:41 +0000 (09:03 +0200)]
kernel/netgraph7: Remove unneeded CFLAGS.