Matthew Dillon [Mon, 21 Oct 2013 17:59:40 +0000 (10:59 -0700)]
kernel - Rewrite lockmgr / struct lock
* Rewrite lockmgr() to remove the exclusive spinlock used internally
to guard operations.
* Retain existing API and operational semantics. This is primarily:
- Acquiring a LK_SHARED lock on a lock the caller already owns
exclusively simply bumps the count and retains the exclusive
nature of the lock.
- Exclusive requests and upgrade requests have priority over shared
locks even if the lock is currently held shared, unless the thread
is flagged for deadlock treatment.
- Upgrade requests are capable of guaranteeing the upgrade (as before).
This could be further enhanced because we now have the last release
transfer the exclusive lock to the upgrade requestor, but the original
API didn't have a function for this so neither do we. The more
primitive detection method is used (aka LK_SLEEPFAIL and/or
LK_EXCLUPGRADE).
* Reduce multiple tracking fields into one field so we can use
atomic_cmpset_int().
* Hot-path common operations. A single atomic_cmpset_int() gets us
through.
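The single-field idea can be sketched in user space as follows. This is only an illustration, assuming C11 atomics in place of the kernel's atomic_cmpset_int(); the bit layout (SK_EXCL, SK_COUNT_MASK) and the sk_* names are hypothetical and are not the actual struct lock layout.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: high bit = held exclusively, low bits = hold count. */
#define SK_EXCL       0x80000000u
#define SK_COUNT_MASK 0x7fffffffu

struct sk_lock {
    _Atomic unsigned int state;
};

/* Hot path: one successful compare-and-swap takes a shared hold. */
static bool
sk_try_shared(struct sk_lock *lk)
{
    unsigned int old = atomic_load(&lk->state);

    while ((old & SK_EXCL) == 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&lk->state, &old, old + 1))
            return true;
    }
    return false;   /* exclusively held; a real lock would block here */
}

/* Exclusive acquisition succeeds only from the fully unlocked state. */
static bool
sk_try_exclusive(struct sk_lock *lk)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong(&lk->state, &expected,
        SK_EXCL | 1);
}

/* Drop one hold; the last hold also clears the exclusive bit. */
static void
sk_release(struct sk_lock *lk)
{
    unsigned int old = atomic_load(&lk->state);
    unsigned int next;

    do {
        next = old - 1;
        if ((next & SK_COUNT_MASK) == 0)
            next = 0;
    } while (!atomic_compare_exchange_weak(&lk->state, &old, next));
}
```

Because the count and the flag live in one word, the uncontended acquire and release never need a separate guard lock. Owner tracking, sleeping, and upgrade handling are left out of this sketch.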
Matthew Dillon [Mon, 21 Oct 2013 17:17:12 +0000 (10:17 -0700)]
kernel - Fix a SMP race between pageout and exec_new_vmspace()
* Panics on token mismatch due to p->p_vmspace being replaced out
from under a process utilizing p->p_vmspace->vm_map.map_token.
* Fix a SMP race between pageout and exec_new_vmspace(). The pageout
code properly PHOLD()s the process and related process token but
fails to hold p->p_vmspace during a potentially blocking call.
Thus it is still possible to race termination of the vmspace and/or
for the process to replace its vmspace while the pageout activity is
in progress.
* Use vmspace_hold()/vmspace_drop() and reference the vmspace directly
after loading it from p->p_vmspace. The race is allowed, but the vmspace
will no longer be destroyed out from under the pageout and the code
will no longer attempt to release the wrong token.
Sascha Wildner [Sun, 20 Oct 2013 08:04:36 +0000 (10:04 +0200)]
kernel - Rewrite vnode ref-counting code to improve performance
* Rewrite the vnode ref-counting code and modify operation to not
immediately VOP_INACTIVE a vnode when its ref count drops to 0. By
doing so we avoid cycling vnodes through exclusive locks when
temporarily accessing them (such as in a path lookup). Shared
locks can be used throughout.
* Track active/inactive vnodes a bit differently, keep track of
the number of vnodes that are still active but have zero refs,
and rewrite the vnode freeing code to use the new statistics
to deactivate cached vnodes.
Sascha Wildner [Mon, 21 Oct 2013 16:13:26 +0000 (18:13 +0200)]
make.1: We use bmake.1, not make.1.
Sascha Wildner [Mon, 21 Oct 2013 07:47:16 +0000 (09:47 +0200)]
kernel/hda: Add headphone switch support for the Acer Aspire One Happy 2.
Submitted-by: shamaz
Dragonfly-bug: <http://bugs.dragonflybsd.org/issues/2596>
Sascha Wildner [Sun, 20 Oct 2013 22:33:24 +0000 (00:33 +0200)]
iswalnum_l.3: Fix prototypes.
Sascha Wildner [Sun, 20 Oct 2013 22:08:01 +0000 (00:08 +0200)]
wmemchr.3: Add missing const to prototypes.
Sascha Wildner [Sun, 20 Oct 2013 22:00:59 +0000 (00:00 +0200)]
newlocale.3: Fix header file name.
Sascha Wildner [Sun, 20 Oct 2013 08:04:36 +0000 (10:04 +0200)]
Fix two section references in hammer and jail manpages.
Sascha Wildner [Sat, 19 Oct 2013 19:49:57 +0000 (21:49 +0200)]
Some manual page fixes here and there.
It also reverts most of the changes done to tzfile.5 with the
"locale megapatch". Ours was in sync with upstream's tzcode2012c, while
FreeBSD uses tzcode2009e. There's probably more work to bring back in
libc/stdtime.
John Marino [Fri, 18 Oct 2013 21:22:45 +0000 (23:22 +0200)]
libldns, drill(1): Update to version 1.6.16
Many dports that require libldns were not building because the library
was detected, but not the ldns.h header (along with its 30+ friends).
The headers were available in contrib, but not installed. However, the
version 1.6.11 is too old for at least some of the ports, so it became
necessary to update LDNS to the latest version.
John Marino [Fri, 18 Oct 2013 22:04:02 +0000 (00:04 +0200)]
Merge branch 'vendor/LDNS'
John Marino [Fri, 18 Oct 2013 20:40:13 +0000 (22:40 +0200)]
ldns: Update vendor branch from 1.6.11 to 1.6.16
Sascha Wildner [Fri, 18 Oct 2013 14:24:37 +0000 (16:24 +0200)]
Update the pciconf(8) database.
October 11, 2013 snapshot from http://pciids.sourceforge.net/
Sepherosa Ziehau [Fri, 18 Oct 2013 02:59:57 +0000 (10:59 +0800)]
vga_pci: Fix cached resources cleanup and setup driver's softc size
- The cached resource was not cleaned up even if the underlying resource was
freed. Now, if the cached resource reference count drops to zero, the
underlying resource is freed and the cached resource is cleared.
Submitted-by: dillon@
- The driver uses softc, so the softc size in the driver_t needs to be
set up properly.
Sepherosa Ziehau [Fri, 18 Oct 2013 01:53:17 +0000 (09:53 +0800)]
vga_pci: Allocate resource method requires resource owner CPUID
Matthew Dillon [Thu, 17 Oct 2013 22:15:58 +0000 (15:15 -0700)]
libc - Fix bugs in arc4random, improve arc4random() and srandomdev()
* Fix a bug where arc4random() was not using the entire 128 bytes of random
data it had read from /dev/random.
* Increase the amount of state incorporated by arc4random_stir()
from 128 to 256 bytes.
* Fix an important issue with both srandomdev() and arc4random(). When
/dev/random is not available these functions now fall back to the new
kern.random sysctl to obtain random data.
This allows these functions to work in chroot's that might not have
/dev mounted.
* Do not try to incorporate random stack data. This is stupid, the stack
data is not random.
Matthew Dillon [Thu, 17 Oct 2013 22:15:08 +0000 (15:15 -0700)]
kernel - Add sysctl kern.random
* Add a sysctl kern.random which returns random data. This can be used
when /dev/random is not available (e.g. a chroot and no /dev mounted in
the chroot).
Matthew Dillon [Thu, 17 Oct 2013 07:26:48 +0000 (00:26 -0700)]
kernel - Use shared spinlock for namecache hash
* Use a shared spinlock when doing lookups in the namecache hash table.
This wasn't showing up in the contention statistics but it's an issue.
* Use a shared spinlock for the vnode v_spin when scanning v_namecache,
when possible.
Matthew Dillon [Thu, 17 Oct 2013 03:31:21 +0000 (20:31 -0700)]
kernel - namecache clock performance improvement
* Fix a bug in _cache_lock_shared_special() which could cause a shared
lock to improperly fall back to an exclusive lock. This could result
in a cascade which regressed all namecache locks on the ncp in question
to also fall back.
Matthew Dillon [Wed, 16 Oct 2013 22:10:50 +0000 (15:10 -0700)]
kernel - Fix panic in sysctl_kern_proc()
* sysctl_kern_proc() loops through ncpus and moves the thread to each cpu
in turn in order to access its local thread list.
* Fix a panic where the function does not return on the same cpu it was
called on. The userland scheduler expects threads to return to usermode
on the same cpu they left usermode on and is responsible for moving the
thread to another cpu (for userland scheduling purposes) itself.
Sascha Wildner [Wed, 16 Oct 2013 18:24:34 +0000 (20:24 +0200)]
make upgrade: Clean up after the "locale megapatch".
Sascha Wildner [Wed, 16 Oct 2013 17:42:42 +0000 (19:42 +0200)]
make upgrade: No longer remove multibyte.3
Sascha Wildner [Wed, 16 Oct 2013 17:25:29 +0000 (19:25 +0200)]
kernel/hammer2: Remove two unused malloc types, W_BIOQUEUE and W_MTX.
Sascha Wildner [Wed, 16 Oct 2013 16:45:45 +0000 (18:45 +0200)]
libc/nftw: Bring in some fixes from FreeBSD.
* Silently skip directories causing loops (instead of erroring with ELOOP).
* Refresh OpenBSD CVS IDs.
* Don't check maxfds against OPEN_MAX.
Matthew Dillon [Wed, 16 Oct 2013 06:03:18 +0000 (23:03 -0700)]
ps - Allow a pid specification in combination with -R
* e.g. 'ps R23434' will output the specified process plus also output
all children (recursively) of that process.
Sascha Wildner [Wed, 16 Oct 2013 04:17:09 +0000 (06:17 +0200)]
Remove formatted manual pages too via 'make upgrade'.
Sascha Wildner [Wed, 16 Oct 2013 04:15:50 +0000 (06:15 +0200)]
ppp(8): Fix a logic error.
MPPE only accepts protocol numbers 0x21 through 0xfa.
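The commit message does not show the corrected condition. As a minimal sketch of a range check matching the stated bounds (the names mppe_proto_ok, MPPE_PROTO_MIN and MPPE_PROTO_MAX are hypothetical and are not taken from ppp(8)):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the bounds stated in the commit message. */
#define MPPE_PROTO_MIN 0x0021
#define MPPE_PROTO_MAX 0x00fa

/* True when 'proto' lies in the range MPPE is said to accept. */
static bool
mppe_proto_ok(uint16_t proto)
{
    /* Both bounds must hold; joining them with || would accept everything. */
    return (proto >= MPPE_PROTO_MIN && proto <= MPPE_PROTO_MAX);
}
```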
Confirmed-by: FreeBSD
Matthew Dillon [Tue, 15 Oct 2013 19:25:30 +0000 (12:25 -0700)]
kernel - improve pv_hold()
* pv_hold() can just use an atomic_add_int() here.
Matthew Dillon [Tue, 15 Oct 2013 19:13:49 +0000 (12:13 -0700)]
kernel - Fix spin_lock_shared() race
* Fix a serious bug in the shared spinlock code. There are numerous races.
The main problem is that the spin_lock_shared*() inlines set the
SPINLOCK_SHARED bit based on a non-atomic test, assuming that the previous
atomic operation guarded the test:
atomic_add_int(&spin->counta, 1);
if (spin->counta == 1)
atomic_set_int(&spin->counta, SPINLOCK_SHARED);
However, this can race an exclusive spin lock in another thread in between
the conditional and the atomic_set_int(). The exclusive spinlock code
would have seen contention, but then does this:
atomic_clear_int(&spin->counta, SPINLOCK_SHARED);
atomic_add_int(&spin->counta, SPINLOCK_EXCLWAIT - 1);
...
The exclusive spinlock code then assumes that the shared spinlock code
can no longer set SPINLOCK_SHARED. However, if this occurs just after
the shared spinlock code's if (spin->counta == 1) but before it atomically
sets the SHARED bit, we wind up in a situation where the exclusive spinlock
code completes its operation and leaves the SHARED bit set.
In other words, the code which believes it has successfully obtained an
exclusive spinlock actually winds up getting a shared spinlock. Oops!
* Fixed by guarding the shared lock conditional and atomic op and changing
the way shared lock contention is handled.
* Case was only reproducible on monster, probably due to massive shared
spinlock use in the pmap code on 48 cpu cores all fork/exec'ing /bin/sh
at the same time.
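One generic way to close a check-then-set window like the one described above is to fold the test and the update into a single compare-and-swap. The sketch below is not the fix that was committed (the message only says the conditional and atomic op are now guarded); it assumes C11 atomics, and the SPIN_SHARED bit value and spin_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_SHARED 0x80000000u   /* hypothetical flag bit */

/*
 * Become a shared holder only if the word is 0 (unlocked) or already
 * marked shared. The test and the update are one atomic step, so an
 * exclusive locker cannot slip in between them.
 */
static bool
spin_try_shared(_Atomic unsigned int *counta)
{
    unsigned int old = atomic_load(counta);

    for (;;) {
        if (old != 0 && (old & SPIN_SHARED) == 0)
            return false;           /* exclusively held */
        if (atomic_compare_exchange_weak(counta, &old,
            (old | SPIN_SHARED) + 1))
            return true;
        /* 'old' was refreshed by the failed exchange; re-test it. */
    }
}

/* Exclusive acquisition succeeds only from the unlocked state. */
static bool
spin_try_exclusive(_Atomic unsigned int *counta)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong(counta, &expected, 1);
}

/* Drop one shared hold; the last holder also clears SPIN_SHARED. */
static void
spin_unlock_shared(_Atomic unsigned int *counta)
{
    unsigned int old = atomic_load(counta);
    unsigned int next;

    do {
        next = old - 1;
        if (next == SPIN_SHARED)
            next = 0;
    } while (!atomic_compare_exchange_weak(counta, &old, next));
}
```

With this shape there is no point at which the SHARED bit can be set on a word that an exclusive holder already owns, which is the failure mode the commit describes.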
Sascha Wildner [Tue, 15 Oct 2013 19:22:51 +0000 (21:22 +0200)]
bsd-family-tree: Sync with FreeBSD.
Sascha Wildner [Tue, 15 Oct 2013 19:00:10 +0000 (21:00 +0200)]
btx: Add FreeBSD's r256293 (fixes boot on Jetway NF81 mobo with RAID enabled).
FreeBSD's commit msg:
Sanitize the %eflags returned by BIOS routines. Some BIOS routines enter
protected mode and may leave protected-mode-specific flags like PSL_NT set
when they return to real mode. This can cause a fault when BTX re-enters
protected mode after the BIOS mode returns.
Reported-by: Julian Pidancet <julian.pidancet@gmail.com>
Taken-from: FreeBSD
Sascha Wildner [Mon, 14 Oct 2013 16:47:44 +0000 (18:47 +0200)]
Fix up some include guards (and checks) in our header files.
Matthew Dillon [Tue, 15 Oct 2013 01:27:17 +0000 (18:27 -0700)]
kernel - work around ipmi serial port bug
* On our supermicro blade server the ipmi can get confused when the
host initializes the 16550A and may fail to clear the RXRDY interrupt
status, resulting in an endless loop.
This appears to only occur when interrupts are enabled early to support
kern.alt_break_to_debugger on a serial console.
* Issuing a dummy read of the RXDATA register appears to unstick the ipmi.
Go figure.
Matthew Dillon [Tue, 15 Oct 2013 00:11:38 +0000 (17:11 -0700)]
kernel - more vfs syncer stuff
* Make sure we stop the thread when a mount attempt fails.
Matthew Dillon [Mon, 14 Oct 2013 23:54:37 +0000 (16:54 -0700)]
kernel - Fix bug last commit (2)
* Oops. and don't try to get the syncer thread's context if there is no
syncer thread for a mount point. I'm sure I'll get this right.
Matthew Dillon [Mon, 14 Oct 2013 23:53:13 +0000 (16:53 -0700)]
kernel - Fix bug last commit
* Don't add the syncer vnode to the syncer list if the mount point
has no syncer thread (optimization for nullfs).
Matthew Dillon [Mon, 14 Oct 2013 23:41:03 +0000 (16:41 -0700)]
kernel - Fix hammer recovery crash (due to recent syncer work)
* Unconditionally create a syncer thread for each mount. This way we can
create the thread prior to calling VFS_MOUNT.
* hammer(1) needs to acquire vnodes and potentially issue vn_rdwr()'s during
mount for recovery purposes. This syncer thread is expected to already
exist. (and it does now).
* Remove the default syncer thread.
* rewrite speedup_syncer().
Matthew Dillon [Mon, 14 Oct 2013 23:34:32 +0000 (16:34 -0700)]
kernel - Use per-cpu token for deadlwps list
* There is a deadlwps reaper thread per cpu, use a per-cpu token instead
of a global token to control it.
Matthew Dillon [Mon, 14 Oct 2013 16:46:30 +0000 (09:46 -0700)]
kernel - Concurrent fork/exec (3) - Fix 32-bit builds & vkernels
* 32-bit kernels and both 32 and 64-bit vkernels were setting
kernel_pmap.pm_pteobj to kernel_object. This creates a shared/excl
race that locks them up.
* Replace with kptobj, a vm_object dedicated to the kernel_pmap.
* Not applicable on normal 64-bit kernels as they use a more modern
pmap implementation that does not require a pm_pteobj.
Reported-by: tuxillo
Franco Fichtner [Sun, 13 Oct 2013 23:20:55 +0000 (01:20 +0200)]
pkill/pwait: tweak manuals
Franco Fichtner [Sun, 13 Oct 2013 20:38:42 +0000 (22:38 +0200)]
mdocml: upstream sync of lib.in
Heh, Ingo fetched our lib.in/st.in changes already. Cheers!
Here are some more library definitions from NetBSD and FreeBSD.
Taken-from: OpenBSD
François Tigeot [Sun, 13 Oct 2013 17:16:56 +0000 (19:16 +0200)]
drm: Sync drm_hashtab files with Linux 3.8
Replacing BSD LIST_xxx macros by Linux hlist functions and data types
François Tigeot [Sun, 13 Oct 2013 16:36:42 +0000 (18:36 +0200)]
drm: Add hlist RCU macros
François Tigeot [Sun, 13 Oct 2013 16:44:20 +0000 (18:44 +0200)]
drm: Replace BSD and legacy DRM macros by Linux mechanisms
Reducing differences with Linux
François Tigeot [Sun, 13 Oct 2013 17:35:51 +0000 (19:35 +0200)]
drm: Rename DRM_LIST_HEAD to LINUX_LIST_HEAD
It is supposed to be the well-known Linux LIST_HEAD() macro after all
Matthew Dillon [Sun, 13 Oct 2013 17:48:57 +0000 (10:48 -0700)]
kernel - Concurrent fork/exec (2)
* Fix bug in vm_fault() path that can cause a token live lock.
When taking a write fault first_shared must be set to 0
because the fault might involve calling swap_pager_unswapped(),
which currently requires an exclusive VM object lock.
Matthew Dillon [Sun, 13 Oct 2013 17:08:40 +0000 (10:08 -0700)]
drm - Fix kernel compile
* Linux's LIST_HEAD interferes with ours. Rename as it was previously
renamed from LIST_HEAD to DRM_LIST_HEAD. Also, other source files
already assumed DRM_LIST_HEAD.
* Fixes kernel compile
François Tigeot [Sun, 13 Oct 2013 16:34:29 +0000 (18:34 +0200)]
drm: Add linux/compiler.h from the FreeBSD OFED stack
François Tigeot [Sun, 13 Oct 2013 16:31:44 +0000 (18:31 +0200)]
drm: Replace drm_linux_list.h by linux/list.h ...
... from FreeBSD's OFED stack
* Keep a few missing functions and macros
François Tigeot [Sun, 13 Oct 2013 14:59:38 +0000 (16:59 +0200)]
drm: Add a local implementation of linux/export.h
* The Linux drm files often contain EXPORT_SYMBOL() macros
* We don't /need/ them but keeping them unchanged in the drm
files helps reduce differences with Linux
François Tigeot [Sun, 13 Oct 2013 14:47:04 +0000 (16:47 +0200)]
drm: Add a local implementation of linux/hash.h
* Implement hash_long() as a wrapper on top of the existing
hash32_buf() function
Jean-Sébastien Pédron [Sat, 7 Sep 2013 09:43:36 +0000 (11:43 +0200)]
drm: Define BITS_PER_LONG
At the same time, rename a macro of the same name in drm_atomic.h. They
both have the same meaning but the one in drm_atomic.h uses sizeof(),
which prevents using it inside an #if preprocessor condition.
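The preprocessor cannot evaluate sizeof(), so a definition meant for #if has to come from constants it can see. A minimal sketch, assuming the width is derived from <limits.h> (the commit does not say how the macro is actually defined; word64_t is a hypothetical use):

```c
#include <limits.h>

/* ULONG_MAX is an integer constant expression usable in #if. */
#if ULONG_MAX == 0xffffffffUL
#define BITS_PER_LONG 32
#elif ULONG_MAX == 0xffffffffffffffffUL
#define BITS_PER_LONG 64
#else
#error "unsupported long width"
#endif

/* The macro can now select code at preprocessing time. */
#if BITS_PER_LONG == 64
typedef unsigned long word64_t;        /* hypothetical example type */
#else
typedef unsigned long long word64_t;
#endif
```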
Matthew Dillon [Sat, 12 Oct 2013 23:10:36 +0000 (16:10 -0700)]
kernel - Greatly improve concurrent fork's and concurrent exec's
* Rewrite all the vm_fault*() API functions to use a two-stage methodology
which keeps track of whether a shared or exclusive lock is being used
on fs.first_object and fs.object. For most VM faults a shared lock is
sufficient, particularly under fork and exec circumstances.
If the shared lock is not sufficient the functions will back down to an
exclusive lock on either or both elements.
* Implement shared chain locks for use by the above.
* kern_exec - exec_map_page() now attempts to access the page with a
shared lock first, and backs down to an exclusive lock if the page
is not conveniently available.
* vm_object ref-counting now uses atomic ops across the board. The
acquisition call can operate with a shared object lock. The deallocate
call will optimize decrementation of ref_count for values above 3 using
an atomic op without needing any lock at all.
* vm_map_split() and vm_object_collapse() and associated functions are now
smart about handling terminal (e.g. OBJT_VNODE) VM objects and will use
a shared lock when possible.
* When creating new shadow chains in front of an OBJT_VNODE object, we no
longer enter those objects onto the OBJT_VNODE object's shadow_head.
That is, only DEFAULT and SWAP objects need to track who might be shadowing
them. TODO: This code needs to be cleaned up a bit though.
This removes another exclusive object lock from the critical path.
* vm_page_grab() will use a shared object lock when possible.
François Tigeot [Sat, 12 Oct 2013 13:14:49 +0000 (15:14 +0200)]
netinet/in.h: Add missing IPPORT_MAX definition
Obtained-from: FreeBSD
John Marino [Sat, 12 Oct 2013 12:35:31 +0000 (14:35 +0200)]
Fix world build bootstrapping issue
The new version of find(1) uses a new locale function called rpmatch.
The use of this function breaks the world build on systems that don't
have the new locale functionality yet, e.g. Release 3.4.
To fix this, I added a BOOTSTRAPPING macro that will check the
first letter of the response character array rather than use the
locale function. This modification appears only in the bootstrap tool.
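The shape of such a bootstrap fallback might be as follows. This is a sketch, not the find(1) change itself: query_is_yes() is a hypothetical helper, accepting both 'y' and 'Y' is an assumption, and BOOTSTRAPPING is defined in the file here only so the example stands alone (a build would normally pass it on the compiler command line).

```c
#include <stdlib.h>

/* Defined here for the standalone sketch; normally set by the build. */
#define BOOTSTRAPPING 1

/* Returns nonzero if the user's response should be treated as "yes". */
static int
query_is_yes(const char *resp)
{
#ifdef BOOTSTRAPPING
    /* Host libc may lack rpmatch(); look at the first letter only. */
    return (resp[0] == 'y' || resp[0] == 'Y');
#else
    /* Locale-aware match on systems that provide rpmatch(). */
    return (rpmatch(resp) == 1);
#endif
}
```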
John Marino [Fri, 11 Oct 2013 23:51:11 +0000 (01:51 +0200)]
pwait(1): Import from FreeBSD, built without modification!
John Marino [Fri, 11 Oct 2013 23:04:54 +0000 (01:04 +0200)]
pgrep(1), pkill(1): Sync with FreeBSD to get the -F options
These functions haven't been touched since DragonFly 1.1 (2004), other
than build tweaks. It was claimed that it would be "trivial" to add
the -F options (pid file) but I wouldn't classify it as such. There
is a pretty big diff between the FreeBSD and DragonFly 1.1 version. I
had to make some modifications, but the functions appear to work in the
very short tests that I performed.
John Marino [Fri, 11 Oct 2013 22:13:17 +0000 (00:13 +0200)]
find(1): Sync with FreeBSD
Originally I did this to gain the -quit feature. It turns out that we
already had it, but FreeBSD didn't document it until 4 years after they
implemented it, so it wasn't on our man page either.
With this sync, we get time comparisons down to the nanosecond. We get
the -sparse option. The behavior of -delete was changed to delete the
files given as arguments.
The only thing that was omitted was all the birthtime options.
The new find was used successfully in the locate.update job, something
that failed in the last sync and code in functions.c had to be
partially reverted.
John Marino [Fri, 11 Oct 2013 20:18:38 +0000 (22:18 +0200)]
pwd(1): Sync with FreeBSD (very minor, mainly editorial)
John Marino [Fri, 11 Oct 2013 20:12:25 +0000 (22:12 +0200)]
realpath(1): Sync with FreeBSD to add -q option
If -q is specified, warnings will not be printed out when realpath(3) fails.
John Marino [Fri, 11 Oct 2013 20:03:46 +0000 (22:03 +0200)]
pwd(1), realpath(1): Split shared source file into separate ones
The realpath(1) and pwd(1) programs share the same source file using the
program name as a condition. Let's create a dedicated directory for
realpath, duplicate the source file, then tailor both. Not only is this
more logical, it will make maintenance easier in the (near) future.
Matthew Dillon [Fri, 11 Oct 2013 19:44:03 +0000 (12:44 -0700)]
kernel - Fix bug when running swapon on a gpt slice
* diskpsize() and the related API functions that obtain the number of blocks
in a disk specification were punting if there was no dragonfly disklabel.
This path was being specifically used by swapon.
* Do not require a dragonfly disklabel when the whole-slice partition is
specified (i.e. /dev/daXsY with no a...z suffix).
* Swapon now works on gpt swap slices.
Reported-by: julianp
Matthew Dillon [Fri, 11 Oct 2013 17:48:43 +0000 (10:48 -0700)]
kernel - Performance optimization pass
* Numerous pid and priority related syscalls, such as getpid(), were
improperly acquiring proc_token to protect fields that are now protected
with per-process or per-pgrp tokens.
Do a pass on kern_prot.c and kern_resource.c fixing these issues. This
removes the use of proc_token from several common system call paths but
it should be noted that none of these system calls are in critical paths.
The benefit is probably minor but will improve performance in the face
of allproc-scanning operations (such as when you do a 'ps' or 'top').
* vmntvnodescan() is not in the critical path except for vflush()'s which
occur on umount. vflush()'s pass a NULL fast function. The
vmntvnodescan() only needs to hold the vmobj_token when the fastfunc is
non-NULL. Do not hold the vmobj_token when fastfunc is NULL.
This primarily improves performance when tmpfs's are being mounted and
unmounted at a high rate (poudriere bulk builds).
dumbbell [Sun, 15 Sep 2013 07:48:42 +0000 (07:48 +0000)]
drm/radeon: Add missing "return false" after unmapping invalid BIOS
Without that, we would try to copy the unmapped BIOS.
Submitted by: Christoph Mallon <christoph.mallon@gmx.de>
Approved by: re (blanket)
Jean-Sébastien Pédron [Sat, 7 Sep 2013 09:54:26 +0000 (11:54 +0200)]
drm: Rename struct drm_driver_info to struct drm_driver
This reduces the diff between FreeBSD and Linux 3.8.
Jean-Sébastien Pédron [Sat, 7 Sep 2013 09:50:46 +0000 (11:50 +0200)]
drm: Rename members of struct drm_agp_head to match Linux
This reduces the diff between FreeBSD and Linux 3.8.
François Tigeot [Fri, 11 Oct 2013 13:38:12 +0000 (15:38 +0200)]
vga_pci.c: Sync with FreeBSD
Most important change:
* Use vga_pci_alloc_resource() to map PCI Expansion ROMs
* This is cleaner and fixes Video BIOS mapping when the given device isn't
the boot display.
* Original author: dumbbell@
François Tigeot [Fri, 11 Oct 2013 13:30:41 +0000 (15:30 +0200)]
kernel: Add a method to get the bus's bus_dma_tag_t
This is required by some machine architectures where there are separate
IOMMU's for each PCI bus.
Obtained-from: FreeBSD
Antonio Huete Jimenez [Fri, 11 Oct 2013 12:34:41 +0000 (05:34 -0700)]
stat.1 - Remove unsupported %B
* Our struct stat does not have st_birthtim so don't mention it.
Matthew Dillon [Fri, 11 Oct 2013 07:00:39 +0000 (00:00 -0700)]
kernel - Optimize sync and msync for tmpfs and nfs
* Flesh-out the vfs_sync API and implement vhold/vdrop callbacks
(used by NFS).
* Use MNTK_THR_SYNC in tmpfs and finish implementing it in nfs. This
will optimize sync and msync for these filesystems.
* In both cases inode attributes are either synchronous or don't involve
any VFS work to flush, so we don't have to use VISDIRTY.
Matthew Dillon [Fri, 11 Oct 2013 06:13:22 +0000 (23:13 -0700)]
kernel - Optimize vfs_msync() when MNTK_THR_SYNC is used
* vfs_msync() will now use vsyncscan() when MNTK_THR_SYNC is set.
Lazy synchronization scans will still properly ignore MADV_NOSYNC
areas, but will not be able to optimize away the scan overhead for
those vnodes (they remain on the syncer list).
This change allows both lazy synchronization and explicit 'sync' commands
to avoid having to scan all cached vnodes on the system, resulting in O(1)
operation in many cases where it might have taken a few seconds before
(on large systems with hundreds of thousands to millions of vnodes cached).
With this change both the vnode sync and the memory sync will be optimal.
Currently implemented for hammer1 and hammer2.
* Add VOBJDIRTY to the set of flags that will place the vnode on the
syncer list. This occurs from the vm_page_dirty() and other bits of
code only if MNTK_THR_SYNC is set.
Theoretically it should be safe for us to do this even though neither
the vm_object nor the related vnode is likely locked or guarded, because
neither can go away while an associated vm_page is busied. The syncer
list code itself is protected with a token.
Matthew Dillon [Fri, 11 Oct 2013 05:21:49 +0000 (22:21 -0700)]
hammer - Use new vsyncscan() mechanic (3)
* The vsyncscan() feature requires using MNTK_THR_SYNC, otherwise the
callback has to deal with vnodes unrelated to the mount point.
Assert this in vsyncscan().
* Enable MNTK_THR_SYNC in hammer
* Cleanup edge cases in the scan2 callback.
Matthew Dillon [Fri, 11 Oct 2013 02:33:08 +0000 (19:33 -0700)]
hammer - Use new vsyncscan() mechanic (2)
* Fix crash, VISDIRTY must be cleared in reclaim.
* Implement convenient API functions to set and clear VISDIRTY and
properly synchronize the syncer list.
Matthew Dillon [Fri, 11 Oct 2013 02:02:24 +0000 (19:02 -0700)]
hammer - Use new vsyncscan() mechanic.
* Use the new vsyncscan() mechanic, greatly reducing the work involved
in finding dirty vnodes in hammer_vfs_sync() and during unmounting.
Call vsyncscan() instead of vmntvnodescan().
* We set VISDIRTY in the vnode and call vn_syncer_add() when an inode
becomes dirty. This ensures that dirty vnodes are placed in the syncer
list even if the vnode has no related dirty file data buffers.
Previously we had to do a full scan of the mount's vnode list to
find dirty inodes.
Matthew Dillon [Fri, 11 Oct 2013 01:56:45 +0000 (18:56 -0700)]
kernel - Add vsyncscan() infrastructure
* For VFS's which support it, allows vnodes with dirty inodes to be placed
on the syncer list rather than just vnodes with dirty buffers. The VFS
can then implement its VFS_SYNC ops by calling vsyncscan() instead of
vmntvnodescan().
* On large systems with potentially hundreds of thousands to millions of
cached vnodes, this reduces sync scan overhead by several orders of
magnitude.
* Add the VISDIRTY flag to vnode->v_flag to indicate a dirty inode, adjust
syncer add/delete code to use the flag.
* Cleanup vfs_sync.c. Always initialize mp->mnt_syncer_ctx to something.
Change the kern.syncdelay sysctl to use SYSCTL_PROC which properly
range-checks syncdelay.
* Implement vsyncscan() which only scans the syncer lists for a mount point.
Franco Fichtner [Thu, 10 Oct 2013 21:08:10 +0000 (23:08 +0200)]
man: no need for libutil
Franco Fichtner [Thu, 10 Oct 2013 19:27:48 +0000 (21:27 +0200)]
man: fix suffix parsing for good
Instead of guessing the suffix in the code, use the suffix list previously
loaded via man.conf(5). While there, zap unused iteration code.
Franco Fichtner [Thu, 10 Oct 2013 16:59:21 +0000 (18:59 +0200)]
netgraph.4: manlint and style nitpicking
Franco Fichtner [Thu, 10 Oct 2013 16:39:25 +0000 (18:39 +0200)]
man: mop up a couple of manlint issues
Antonio Huete Jimenez [Thu, 10 Oct 2013 11:24:31 +0000 (04:24 -0700)]
dirfs - Rework how host file permissions are checked.
* Retrieve uid/gid of the user running the vkernel on mount time
instead of on every open(2).
Matthew Dillon [Thu, 10 Oct 2013 06:34:13 +0000 (23:34 -0700)]
kernel - Attempt to fix tty race
* Opening /dev/tty is special cased to open the session ttyvp. The
VCTTYISOPEN flag is used on the session ttyvp to indicate this.
* There is a bug where the VCTTYISOPEN flag is set prior to calling
VOP_OPEN() on ttyvp. Because devfs's devfs_spec_open() (and
also devfs_spec_close()) temporarily release the vnode lock
on the vp (ttyvp in this case), setting the flag prior to
the VOP_OPEN() can lead to a race where another process opens
AND closes /dev/tty before our VOP_OPEN() executes.
The racing open will see that the VCTTYISOPEN flag is already
set and not issue a VOP_OPEN(). Its close will then VOP_CLOSE()
ttyvp (which so far has not been opened by either process),
which can kill the last open on ttyvp and cause the tty to
disconnect.
This race is very difficult to reproduce. We were only able to
reproduce it on monster (48-core opteron) which happened to
access "/dev/tty" during a poudriere bulk build in a manner
that was able to trigger the race.
* Fix this particular bug by not setting the VCTTYISOPEN flag
until after VOP_OPEN() returns, then re-checking the flag to
detect the race and clean-up/retry if a race is detected.
* TODO - This is not the only bug. Unfortunately it is also quite possible
for multiple threads/processes to open("/dev/tty", ...) simultaneously.
There is only one VCTTYISOPEN flag so when this occurs and one process
then close()s its descriptor, the VCTTYISOPEN flag is cleared.
The other process or processes may then proceed to access ttyvp without
an opencount guard. When they close() the count is handled properly
because the close() code detects that the VCTTYISOPEN flag was cleared.
The problem is the unguarded read, write, and ioctl calls that might
occur in the mean time.
Matthew Dillon [Wed, 9 Oct 2013 17:15:56 +0000 (10:15 -0700)]
dmesg - Add -f option for continuous monitoring
* Add the -f option to dmesg. After the initial message buffer dump
dmesg monitors the kernel for additional data and displays it as it
arrives. dmesg will not terminate until killed in this mode.
* The sysctl() is deficient so libkvm is forced when this option is
specified.
Sascha Wildner [Wed, 9 Oct 2013 17:09:00 +0000 (19:09 +0200)]
bsd-family-tree: Sync with FreeBSD.
Sascha Wildner [Wed, 9 Oct 2013 17:07:47 +0000 (19:07 +0200)]
kernel: Fix the LINT kernels.
Matthew Dillon [Wed, 9 Oct 2013 15:58:35 +0000 (08:58 -0700)]
kernel - Fix pgrp and session ref-count races
* Fix some tight timing windows where the ref count on these structures
could race.
* Protect the pgrp hash table with a spinlock instead of using proc_token.
* Improve pgfind() performance by using the spinlock in shared mode.
* Do not transition p_pgrp through NULL when changing a process's pgrp.
Atomically transition the process (protected by p->p_token and
pg->pg_token).
François Tigeot [Wed, 9 Oct 2013 13:52:53 +0000 (15:52 +0200)]
drm/radeon: Import firmwares
These files come from FreeBSD but were originally obtained from
https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/
François Tigeot [Wed, 9 Oct 2013 12:15:04 +0000 (14:15 +0200)]
drm/radeon: Remove useless .PATH directive
François Tigeot [Wed, 9 Oct 2013 11:26:57 +0000 (13:26 +0200)]
drm/radeon: Fix locking issues
François Tigeot [Wed, 9 Oct 2013 09:46:16 +0000 (11:46 +0200)]
drm: Some drm_addmap() fixes
* Fix two warnings, moving some of the code to make the function more
similar to drm_addmap_core() in Linux 3.8
* Remove some weirdly placed locking directives not present in Linux
Matthew Dillon [Wed, 9 Oct 2013 05:49:46 +0000 (22:49 -0700)]
Makefile.usr - Remove pkgsrc targets
* DragonFly will be pure dports as of the next release so remove the
pkgsrc helper targets from /usr/Makefile.
François Tigeot [Tue, 8 Oct 2013 19:15:54 +0000 (21:15 +0200)]
drm/radeon: Import the Radeon KMS driver from FreeBSD
* Credits for porting an updated version of this driver from Linux
mainly go to Jean-Sébastien Pédron <jean-sebastien.pedron@dumbbell.fr>
* Compatibility layer for running 32-bit applications on 64-bit systems
left out
Additional credits from the FreeBSD import message:
This driver is based on Linux 3.8 and a previous effort by kan@.
More information about this project can be found on the FreeBSD wiki:
https://wiki.freebsd.org/AMD_GPU
Help from: kib@, kan@
Tested by: avg@, kwm@, ray@,
Alexander Yerenkow <yerenkow@gmail.com>,
Anders Bolt-Evensen <andersbo87@me.com>,
Denis Djubajlo <stdedjub@googlemail.com>,
J.R. Oldroyd <fbsd@opal.com>,
Mikaël Urankar <mikael.urankar@gmail.com>,
Pierre-Emmanuel Pédron <pepcitron@gmail.com>,
Sam Fourman Jr. <sfourman@gmail.com>,
Wade <wade-is-great@live.com>,
(probably others I forgot...)
HW donations: kyzh, Yakaz
Sascha Wildner [Tue, 8 Oct 2013 20:47:51 +0000 (22:47 +0200)]
kernel/drm: Fix the LINT build.
Franco Fichtner [Tue, 8 Oct 2013 20:14:58 +0000 (22:14 +0200)]
a couple more Mt macros for author emails
Franco Fichtner [Tue, 8 Oct 2013 19:30:40 +0000 (21:30 +0200)]
libc: proper FreeBSD version and some Mt macros for mails
Franco Fichtner [Tue, 8 Oct 2013 19:19:24 +0000 (21:19 +0200)]
world: remove spurious Pp macros
Franco Fichtner [Tue, 8 Oct 2013 19:17:47 +0000 (21:17 +0200)]
mdocml: sync std strings with groff
Franco Fichtner [Tue, 8 Oct 2013 19:16:47 +0000 (21:16 +0200)]
groff: add recent FreeBSD releases
(reduces `manlint' noise)
Antonio Huete Jimenez [Tue, 8 Oct 2013 09:09:48 +0000 (02:09 -0700)]
hammer - Fix exit path for newly added ioctl
* Release cursor/inode on exit.
* Take into account the case where hammer_get_inode() returns a NULL ip.
Franco Fichtner [Mon, 7 Oct 2013 21:29:16 +0000 (23:29 +0200)]
mdocml: sync a few upstream commits
Most of our local changes have found their way upstream.
Let's return the favour and bring in fixes that we have
requested. Hunk count for mandiff output reduced from
3321 to 3311 (not much, but it's a start).
# make mandiff | grep "^@@" | wc -l