Venkatesh Srinivas [Sun, 27 Nov 2011 17:57:36 +0000 (09:57 -0800)]
libc -- Remove assembler i386 strlen() routine.
On a number of processors, it is slower than the obvious C version.
(400,000,000 loops, times in seconds)
On a 2.66 GHz Core 2:
                    asm       C
10-byte string     22.9     9.5
68-byte string     77.2    19.8
175-byte string   173.7    40.6
On a 2.0 GHz Athlon64 3000+:
                    asm       C
10-byte string     11.3     9.9
68-byte string     34.7    34.7
175-byte string    78.7    77.6
On a 2.2 GHz Core i7 (nehalem):
                    asm       C
10-byte string     13.4     5.2
68-byte string     33.8    29.5
175-byte string    71.6    67.4
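For reference, "the obvious C version" is presumably a plain byte-at-a-time loop along the lines of the sketch below. This is only an illustration, not the code that lives in libc; the name simple_strlen is made up for this example.

```c
#include <stddef.h>

/*
 * Hypothetical byte-at-a-time strlen(); not the actual libc source.
 * Walks the string until the terminating NUL and returns the distance
 * from the start.
 */
size_t
simple_strlen(const char *str)
{
	const char *s;

	for (s = str; *s != '\0'; ++s)
		continue;
	return ((size_t)(s - str));
}
```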
Venkatesh Srinivas [Sun, 27 Nov 2011 17:16:45 +0000 (09:16 -0800)]
kernel -- ktrace: Fix possible one-word stack leak to userspace.
From OpenBSD kern_ktrace.c 1.55, via Loganaden Velvindron.
Sepherosa Ziehau [Sun, 27 Nov 2011 13:49:16 +0000 (21:49 +0800)]
x86_64/ioapic_abi: Implement MachIntrABI.rman_setup
Sepherosa Ziehau [Sun, 27 Nov 2011 11:16:41 +0000 (19:16 +0800)]
test: from lancer
Sepherosa Ziehau [Sun, 27 Nov 2011 11:13:03 +0000 (19:13 +0800)]
test: from xanadu64
Sepherosa Ziehau [Sun, 27 Nov 2011 11:09:54 +0000 (19:09 +0800)]
test: from enigma
Thomas Nikolajsen [Sun, 27 Nov 2011 08:20:38 +0000 (09:20 +0100)]
Unbreak buildworld
John Marino [Sat, 26 Nov 2011 09:23:14 +0000 (10:23 +0100)]
binutils 2.20: remove source files
Binutils 2.20 has not been building for a month, but the source files
were still present in case we wanted to switch it back on.
Per IRC discussion, the general consensus is that DragonFly wants to
continue to maintain two sets of binutils in addition to two compilers.
A new vendor branch has been created, vendor/BINUTILS-ALL, and
binutils 2.20 is the first version to go into this branch. All future
versions of binutils will also go into this branch rather than create
a new vendor branch each time. Everything else including objformat
and the naming scheme will remain as it was. The idea is that the
newest version of binutils is always "prime" and the older version is
the backup which does not require updating. When a new version of
binutils is brought in, the previous backup version will be deleted.
That's what is happening with binutils version 2.20 today.
John Marino [Fri, 25 Nov 2011 23:14:58 +0000 (00:14 +0100)]
binutils 2.22: Promote to primary binutils
John Marino [Fri, 25 Nov 2011 23:05:34 +0000 (00:05 +0100)]
binutils 2.22: Activate building in world
The next commit will change the default to make binutils 2.22 prime
and demote binutils 2.21 to be the backup.
John Marino [Fri, 25 Nov 2011 22:42:29 +0000 (23:42 +0100)]
binutils 2.22: Add makefiles, new incremental-dump binary
The makefiles and headers for binutils 2.22 are similar to those of
binutils 2.21 with the exception of the restructuring of the gold
build. A dedicated include file was created, and several files were
moved to libgold. This was done to avoid redundant compiling of
object files in common between ld.gold and incremental-dump. The
latter was never built before although the source was previously
available.
John Marino [Fri, 25 Nov 2011 22:30:41 +0000 (23:30 +0100)]
binutils 2.22: Add READMEs and local modifications
John Marino [Sat, 26 Nov 2011 11:44:16 +0000 (12:44 +0100)]
Merge branch 'vendor/BINUTILS_ALL'
John Marino [Sat, 26 Nov 2011 11:42:21 +0000 (12:42 +0100)]
Initial import of binutils 2.22 on the new vendor branch
Future versions of binutils will also reside on this branch rather
than continuing to create new binutils branches for each version.
John Marino [Sat, 26 Nov 2011 11:05:52 +0000 (12:05 +0100)]
Revert "Merge branch 'vendor/BINUTILS-ALL'"
This reverts commit
92a1e2d9549ce76d785444d000d358126cf7762f, reversing
changes made to
2b195d6a566cb8441f5d6d66363235683bbd92af.
John Marino [Sat, 26 Nov 2011 11:04:00 +0000 (12:04 +0100)]
Revert "binutils 2.22: Add READMEs and local modifications"
This reverts commit
ca679fdaaa1df38750514ba8786c627ce15864d0.
John Marino [Sat, 26 Nov 2011 11:03:59 +0000 (12:03 +0100)]
Revert "binutils 2.22: Add makefiles, new incremental-dump binary"
This reverts commit
d83cfc8d85b905d12358d755882d57f95b2a0f48.
John Marino [Sat, 26 Nov 2011 11:03:57 +0000 (12:03 +0100)]
Revert "binutils 2.22: Activate building in world"
This reverts commit
e8402471895f5f3b9bc77fde77d732db0feea249.
John Marino [Sat, 26 Nov 2011 11:03:54 +0000 (12:03 +0100)]
Revert "binutils 2.22: Promote to primary binutils"
This reverts commit
c2e570e14a87f984d27b71ec02365064555dee87.
John Marino [Fri, 25 Nov 2011 23:14:58 +0000 (00:14 +0100)]
binutils 2.22: Promote to primary binutils
John Marino [Fri, 25 Nov 2011 23:05:34 +0000 (00:05 +0100)]
binutils 2.22: Activate building in world
The next commit will change the default to make binutils 2.22 prime
and demote binutils 2.21 to be the backup.
John Marino [Fri, 25 Nov 2011 22:42:29 +0000 (23:42 +0100)]
binutils 2.22: Add makefiles, new incremental-dump binary
The makefiles and headers for binutils 2.22 are similar to those of
binutils 2.21 with the exception of the restructuring of the gold
build. A dedicated include file was created, and several files were
moved to libgold. This was done to avoid redundant compiling of
object files in common between ld.gold and incremental-dump. The
latter was never built before although the source was previously
available.
John Marino [Fri, 25 Nov 2011 22:30:41 +0000 (23:30 +0100)]
binutils 2.22: Add READMEs and local modifications
John Marino [Sat, 26 Nov 2011 08:32:25 +0000 (09:32 +0100)]
Merge branch 'vendor/BINUTILS-ALL'
John Marino [Sat, 26 Nov 2011 08:27:44 +0000 (09:27 +0100)]
Initial import of binutils 2.22 on the new vendor branch
Future versions of binutils will also reside on this branch rather
than continuing to create new binutils branches for each new version.
Sepherosa Ziehau [Fri, 25 Nov 2011 06:17:43 +0000 (14:17 +0800)]
x86_64/ioapic_abi: Disable interrupt load balance by default
Add hw.ioapic.gsi.balance tunable to enable/disable interrupt
load balance. It is disabled by default.
Sepherosa Ziehau [Thu, 24 Nov 2011 05:53:54 +0000 (13:53 +0800)]
accept: Implement fast soaccept predication
Fast soaccept predication tries to run soaccept_predicate before the
domsg to the proto-thread, which puts the current thread to sleep.
We can do this because the listen socket's completion list is always
protected by the listen socket's pool-token. For a non-blocking listen
socket, a domsg to the proto-thread to extract a socket from the
completion list does not make any sense. Even for a blocking listen
socket, if there are sockets on the completion list, a domsg to the
proto-thread to extract a socket from the completion list wastes time.
The result:
192.168.249.42 (Xeon E3-1230 HT enabled, 16G) runs httperf
192.168.249.29 (i7-2600 HT enabled, 16G) runs nginx (web server)
The server runs nginx-1.0.4 (from pkgsrc-2011Q2), using the default
configuration with the following changes:
events {
worker_connections 10240;
use kqueue;
}
The client runs httperf-0.9.0, manually compiled with FD_SETSIZE set to 16424.
The client machine runs the following commands before starting the benchmark:
net.inet.ip.portrange.last=60000
route change -net 192.168.249.0/24 -msl 500
16 parallel httperf --server=192.168.249.29 --wsess=5000,1,1 --max-conn=4
4 runs (Request rate, unit: req/s)
old 23554.0 23542.0 23557.0 23526.2
new 24793.7 24809.9 24792.7 24794.4
This gives a 5.3% performance improvement.
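The 5.3% figure can be reproduced from the four runs by comparing the averages. The helper names below (mean_rate, improvement_pct) are made up for this sketch; only the request rates come from the commit message.

```c
#include <stddef.h>

/* The four request rates quoted above (req/s). */
const double old_runs[4] = { 23554.0, 23542.0, 23557.0, 23526.2 };
const double new_runs[4] = { 24793.7, 24809.9, 24792.7, 24794.4 };

/* Arithmetic mean of n samples. */
double
mean_rate(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; ++i)
		sum += v[i];
	return (sum / (double)n);
}

/* Relative gain of new_avg over old_avg, in percent. */
double
improvement_pct(double old_avg, double new_avg)
{
	return ((new_avg - old_avg) / old_avg * 100.0);
}
```

Calling improvement_pct(mean_rate(old_runs, 4), mean_rate(new_runs, 4)) gives a value a little above 5.3.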
Sepherosa Ziehau [Mon, 21 Nov 2011 05:46:13 +0000 (13:46 +0800)]
bce: Use MPSAFE callout
Sascha Wildner [Wed, 23 Nov 2011 20:36:28 +0000 (21:36 +0100)]
<ucontext.h>: For now, mark *context() as i386 only.
Note that this is not intended as an argument against implementing the
missing functions on x86_64. It just isn't nice to have prototypes
for missing functions.
Suggested-by: pavalos
Sascha Wildner [Wed, 23 Nov 2011 18:10:36 +0000 (19:10 +0100)]
Remove /usr/X11R6/... paths from various config and default files.
Reported-by: ftigeot
Sascha Wildner [Wed, 23 Nov 2011 17:58:15 +0000 (18:58 +0100)]
Sync 'make distribution' with an upgraded system, file-wise.
Two symlinks in /usr/include/machine are created via 'make upgrade' for
historical reasons detailed in
a8f70ff21e2ea644b109f4ecb198b6cfa6fef8dc.
Add the creation of those to 'make distribution' too, so that a fresh
system off the CD has them. At some later point, we can remove it from
upgrade.
Reported-by: marino
Sascha Wildner [Tue, 22 Nov 2011 03:34:57 +0000 (04:34 +0100)]
i386/cpufunc.h: Adjust opcodes which are specified as ".byte 0xNN, 0xMM".
This was from the days when older assemblers wouldn't recognize the
new Pentium instructions.
Taken-from: FreeBSD
Antonio Huete Jimenez [Tue, 22 Nov 2011 00:41:03 +0000 (01:41 +0100)]
nwfs - Use global ncpus
No need to use a sysctl call to get hw.ncpu when we
have the ncpus global from systm.h
John Marino [Mon, 21 Nov 2011 22:32:58 +0000 (23:32 +0100)]
Bump __DragonFly_version after new functions added to libc
John Marino [Mon, 21 Nov 2011 22:30:09 +0000 (23:30 +0100)]
libc: Add wcsncasecmp function
This function performs a case-insensitive string comparison test of
not more than a specified number of wide characters. It is a GNU
extension, not POSIX. Some packages in pkgsrc may require it.
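The semantics described above can be sketched as follows. This is a hypothetical illustration, not the code added to libc; the name sketch_wcsncasecmp is made up, and case folding is assumed to go through towlower() in the current locale.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical sketch of wcsncasecmp() semantics; not the libc source.
 * Compares at most n wide characters, folding case with towlower().
 * Returns a negative value, 0, or a positive value like wcsncmp().
 */
int
sketch_wcsncasecmp(const wchar_t *s1, const wchar_t *s2, size_t n)
{
	wint_t c1, c2;

	for (; n > 0; --n, ++s1, ++s2) {
		c1 = towlower((wint_t)*s1);
		c2 = towlower((wint_t)*s2);
		if (c1 != c2)
			return (c1 < c2 ? -1 : 1);
		if (c1 == L'\0')
			break;		/* both strings ended */
	}
	return (0);
}
```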
John Marino [Mon, 21 Nov 2011 22:13:43 +0000 (23:13 +0100)]
libc: Add wcscasecmp function
This function performs a case-insensitive string comparison test on
wide characters. It is a GNU extension, not POSIX. Some packages
in pkgsrc require it.
Venkatesh Srinivas [Mon, 21 Nov 2011 23:09:38 +0000 (15:09 -0800)]
kernel -- nata: Raise ATA timeout for FLUSHCACHE requests.
(S)ATA devices may take longer than the default ata timeout to respond to
FLUSHCACHE requests, particularly when they are spinning up. Seen with
Western Digital Caviar Green SATA disks.
From: FreeBSD PR 136182 (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/136182)
Alex Hornung [Mon, 21 Nov 2011 20:55:31 +0000 (20:55 +0000)]
dfr2text, tbridge.9 - fix typos/etc
Sascha Wildner [Mon, 21 Nov 2011 11:09:39 +0000 (12:09 +0100)]
Disable aps(4) in the GENERIC kernels.
Its probe routine doesn't play nice with the Intel S5520SC motherboard.
Until that is properly fixed, disable it as it is not strictly needed.
Reported-by: Michael Kosarev <russiane39@gmail.com>
Sascha Wildner [Mon, 21 Nov 2011 09:50:48 +0000 (10:50 +0100)]
test commit
Sascha Wildner [Mon, 21 Nov 2011 05:34:51 +0000 (06:34 +0100)]
kernel/scsi: Use __unused instead of assigning to itself.
Sascha Wildner [Mon, 21 Nov 2011 04:31:52 +0000 (05:31 +0100)]
kernel/nfs: Fix two wrong sizeofs.
NFSKERBKEY_T (key's type) is (in <vfs/nfs/rpcv2.h>):
typedef u_char NFSKERBKEY_T[2];
and key is one of the function's args, so we need to use the type for
the sizeof, else we'll get the size of a pointer.
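The pitfall can be shown in isolation. In the sketch below kerbkey_t stands in for NFSKERBKEY_T and the function names are made up; the parameter is written as a pointer because that is what an array parameter is adjusted to by the compiler.

```c
#include <stddef.h>

/* Stand-in for NFSKERBKEY_T from the commit message. */
typedef unsigned char kerbkey_t[2];

/*
 * A parameter declared as "kerbkey_t key" is adjusted to
 * "unsigned char *key"; it is spelled out that way here so the
 * effect is explicit.
 */
size_t
size_via_argument(unsigned char *key)
{
	return (sizeof(key));		/* size of a pointer, not of the key */
}

size_t
size_via_type(unsigned char *key)
{
	(void)key;
	return (sizeof(kerbkey_t));	/* size of the array type: 2 */
}
```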
Alex Hornung [Sun, 20 Nov 2011 22:21:48 +0000 (22:21 +0000)]
dfregress - fix copy&paste mistake, add newline in output
Alex Hornung [Sun, 20 Nov 2011 21:55:40 +0000 (21:55 +0000)]
dfregress - Add default values for args, add direct arg
* Use a direct argument instead of -r for the runlist file.
* Add defaults for the -t and -o options based on the path to the
runlist file.
Suggested-by: Sascha Wildner (swildner@)
Matthew Dillon [Sun, 20 Nov 2011 20:21:19 +0000 (12:21 -0800)]
kernel - Correct wire count statistics
* When wiring pages for the dma pool we also have to increment
vmstats.v_wire_count to match the later return of the pages
to the free pool which decrements it.
Reported-by: lentferj
Matthew Dillon [Sun, 20 Nov 2011 19:19:32 +0000 (11:19 -0800)]
kernel - Fix vm_object token deadlock (3)
* Fix bug in this commit sequence, m->object is NULL'd out after the
free so we have to save a copy to drop.
Reported-by: marino
Matthew Dillon [Sun, 20 Nov 2011 18:17:12 +0000 (10:17 -0800)]
kernel - Fix vm_object token deadlock (2)
* Files missed in original commit.
Matthew Dillon [Sun, 20 Nov 2011 18:00:37 +0000 (10:00 -0800)]
kernel - Fix broken assertion
* The assertion in _lwkt_trytokeref() was printing out the contents of
an uninitialized variable. The assertion condition itself was ok.
Matthew Dillon [Sun, 20 Nov 2011 17:58:50 +0000 (09:58 -0800)]
kernel - Fix incorrect VA on interlock
* The pmap_clearbit() code was interlocking the wrong VA due to an
uninitialized variable. This could lead to stale tlbs.
Reported-by: swildner
Matthew Dillon [Sun, 20 Nov 2011 17:47:47 +0000 (09:47 -0800)]
kernel - Fix vm_object token deadlock
* vm_page_alloc() needs an exclusive vm_object token when recycling
random cache pages into the free queue. Because these are effectively
random pages it is possible for this exclusive token to interfere
with a shared token already held by the thread.
* Make sure we can actually get the token. If we cannot we deactivate
the page instead.
Matthew Dillon [Sun, 20 Nov 2011 17:45:36 +0000 (09:45 -0800)]
kernel - Fix DRM_DEBUG() macro
* It was dereferencing td->td_proc without checking whether a process
even exists first.
Reported-by: juanfra_
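The kind of guard this fix calls for can be sketched as below. The structures are heavily reduced, hypothetical stand-ins for the kernel's thread and proc structures, and debug_pid is a made-up helper, not part of the DRM code.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct proc_sketch {
	int p_pid;
};

struct thread_sketch {
	struct proc_sketch *td_proc;	/* NULL for a pure kernel thread */
};

/*
 * Return the pid to print in a debug message, or -1 when the thread
 * has no process attached.
 */
int
debug_pid(const struct thread_sketch *td)
{
	return (td->td_proc != NULL ? td->td_proc->p_pid : -1);
}
```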
Sascha Wildner [Sun, 20 Nov 2011 15:39:50 +0000 (16:39 +0100)]
kernel: Fix sizeof()s that were taking a pointer.
Sepherosa Ziehau [Sun, 20 Nov 2011 11:25:05 +0000 (19:25 +0800)]
socket: Speed up soclose by avoiding putting the user thread into sleep
- Embed a netmsg_base into socket, it will be used if fast soclose
is possible
- Factor out sodiscard(), which aborts the connections on the listen
socket and sets the SS_NOFDREF bit. This function is shared across
fast soclose and synchronized soclose
- Rename the original soclose() to soclose_sync(), which uses domsg
to perform proto-specific operation
- If kern.ipc.soclose_fast is 1 (it is 1 by default) and SO_LINGER
socket option is not set and the socket does not use synchronized
msgport (e.g. UNIX domain socket), fast soclose will be used
- The fast soclose is implemented to avoid putting the caller thread
(usually a user thread) into sleep. It uses the socket's embedded
"close message" with different dispatch functions based on the
current socket state and sends the "close message" asynchronously
to the proto thread to carry out various tasks.
The result:
On Phenom 9550 (4 core, 2.2GHz):
route change -host 127.0.0.1 -msl 50 (set MSL to 50ms)
8 parallel netperf -H 127.0.0.1 -t TCP_CC -P0 (4 runs, unit: tps)
old 33181.18 33005.66 33130.48 33010.50
new 39109.07 39032.48 39022.75 38993.72
This gives an 18% performance improvement.
Sascha Wildner [Sun, 20 Nov 2011 09:29:17 +0000 (10:29 +0100)]
dfregress.8: Some little cleanup.
Sascha Wildner [Sun, 20 Nov 2011 09:27:59 +0000 (10:27 +0100)]
dfr2text(8): Remove custom DEBUG_FLAGS.
Sascha Wildner [Sun, 20 Nov 2011 09:10:04 +0000 (10:10 +0100)]
rc.conf.5: Add some words about the recently added change_routes variable.
Sepherosa Ziehau [Sun, 20 Nov 2011 06:53:09 +0000 (14:53 +0800)]
rt_metrics: Change msl unit to millisecond
Matthew Dillon [Sat, 19 Nov 2011 22:40:46 +0000 (14:40 -0800)]
fastbulk - Minor corrections & docs
* Minor correction to the setup script
* Document more targets
Matthew Dillon [Sat, 19 Nov 2011 21:54:49 +0000 (13:54 -0800)]
fastbulk - Commit to /usr/src/test/fastbulk
* Commit the fastbulk (fast pkgsrc bulk building system) that I was
working on late last year so others can mess around with it.
* This is a set of scripts that attempt to figure out pkgsrc tree
dependencies and then run as many package builds in parallel as
possible, keeping track of completions which affect other dependencies
in order to keep as many concurrent (up to NPARALLEL) builds going as
possible.
* Once the source archives get synchronized concurrency is actually limited
more by the sludgepile that is the pkgsrc/bmake system which we have to
use to figure out the dependencies in the first place. It takes a bit
for enough of the dependency tree to build for concurrency to ramp up
but it does pretty well once the core packages that everyone else depends
on have been built.
* Easy tracking of the state of the build via per-package log files and
status information in /build/fastbulk/root/tmp/logs/{good,bad,run}.
Log files for currently running builds are placed in run and then
moved to good or bad when the build completes.
* Remaining issues include multi-dependencies (e.g. when multiple versions
of the same package are available for install), because other packages in
the tree might depend on different versions of the same package,
missing dependencies, and other conflicts.
Sascha Wildner [Sat, 19 Nov 2011 20:01:29 +0000 (21:01 +0100)]
LINT/LINT64: Remove page breaks.
Matthew Dillon [Sat, 19 Nov 2011 17:41:48 +0000 (09:41 -0800)]
kernel - Add ts check to dotimeout_only()
* We have to add a null-check before calling dotimeout_only(). When
poll()/select() are called with a NULL timeout that means wait forever
and does not mean a fixed delay.
Reported-by: YONETANI Tomokazu <y0n3t4n1@gmail.com>
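The distinction the fix relies on can be sketched in userland terms. WAIT_FOREVER and timeout_ms are hypothetical names for this illustration and do not appear in the kernel.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical return value meaning "block until an event arrives". */
#define WAIT_FOREVER	(-1L)

/*
 * Translate a poll()/select() style timeout into milliseconds.
 * A NULL pointer means "wait forever"; only a non-NULL timespec
 * describes a fixed delay.
 */
long
timeout_ms(const struct timespec *ts)
{
	if (ts == NULL)
		return (WAIT_FOREVER);
	return ((long)ts->tv_sec * 1000L + ts->tv_nsec / 1000000L);
}
```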
Matthew Dillon [Sat, 19 Nov 2011 10:32:31 +0000 (02:32 -0800)]
kernel - Fix swapcache related crash
* VM object must be held while vmobj_token serializes list, before
the lwkt_yield() not after.
* Fixes crash when swapcache fills up and starts to remove entries.
Alex Hornung [Sat, 19 Nov 2011 05:24:55 +0000 (05:24 +0000)]
dfregress.8 - spelling: synopsys => synopsis
Matthew Dillon [Sat, 19 Nov 2011 08:06:53 +0000 (00:06 -0800)]
kernel - Fix crash in pmap_enter()
* When taking a concurrent fault in KVM on a pipe buffer, the pte
replacement path (taken when a pte is found to already exist) was not
checking whether pt_pv was NULL before trying to wire its
page.
Reported-by: n00b183
Matthew Dillon [Sat, 19 Nov 2011 07:12:06 +0000 (23:12 -0800)]
kernel - Correct unaligned results in alist_free_info()
* alist_free_info() needs to return a power-of-2-sized and power-of-2
aligned result in order for the caller to be able to use the information
to allocate the resulting space.
* Fixes an issue where the kernel is unable to return a big chunk of the
reserved DMA space back to the kernel free pool, resulting in a lot of
wasted memory.
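The power-of-2 size and alignment constraint can be illustrated with a small helper. This is a sketch under assumptions, not the alist_free_info() implementation: it simply finds the largest naturally aligned power-of-2 chunk inside a free run, and assumes base + count does not overflow 32 bits. The struct and function names are made up.

```c
#include <stdint.h>

struct chunk {
	uint32_t start;	/* first block of the chunk */
	uint32_t size;	/* number of blocks, a power of 2 (0 if none) */
};

/*
 * Hypothetical helper: given a free run of blocks [base, base + count),
 * return the largest chunk inside it whose size is a power of 2 and
 * whose start is a multiple of that size.
 */
struct chunk
largest_aligned_chunk(uint32_t base, uint32_t count)
{
	struct chunk c = { 0, 0 };
	uint32_t size, start;

	if (count == 0)
		return (c);

	/* Largest power of 2 that is <= count. */
	for (size = 1; size <= count / 2; size <<= 1)
		continue;

	/* Try ever smaller sizes until an aligned chunk fits. */
	for (; size != 0; size >>= 1) {
		/* Round base up to the next multiple of size. */
		start = (base + size - 1) & ~(size - 1);
		if (start - base <= count - size) {
			c.start = start;
			c.size = size;
			break;
		}
	}
	return (c);
}
```

For a run starting at block 3 with 10 blocks, an 8-block chunk cannot be placed at an 8-aligned offset inside the run, so the helper settles on 4 blocks starting at block 4.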
Matthew Dillon [Sat, 19 Nov 2011 05:04:00 +0000 (21:04 -0800)]
kernel - Implement a contiguous memory reserve for contigmalloc()
* We initially reserve the lower 1/4 of memory or 256MB, whichever is
smaller. The ALIST API is used to manage the memory.
* Once device initialization is complete, and before init is executed,
we reduce the reserve and return pages to the normal VM paging queues.
The reserve is reduced to ~16MB or 1/16 total memory, whichever is
smaller.
* This can be adjusted with a tunable 'vm.dma_reserved'.
* contigmalloc() now tries the DMA reserve first. If it fails it falls
back to the original contigmalloc() code. contigfree() determines whether
the pages belong to the DMA reserve or not and will either return them
to the reserve or free them to the normal paging queues as appropriate.
VM pages in the reserve are left wired and not busy, and they are returned
to the reserve in the same state. This greatly simplifies operations that
act on the reserve.
* Fix various bits of code that contigmalloc()'d but then kfree()'d instead
of contigfree()'d.
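The two reserve sizes described above work out as in the sketch below. The function names are hypothetical and the code ignores page rounding and the vm.dma_reserved tunable; it only restates the "whichever is smaller" arithmetic.

```c
#include <stdint.h>

#define MB	(1024ULL * 1024ULL)

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return (a < b ? a : b);
}

/* Initial reserve: lower 1/4 of memory or 256MB, whichever is smaller. */
uint64_t
initial_dma_reserve(uint64_t total_bytes)
{
	return (min_u64(total_bytes / 4, 256 * MB));
}

/* Reduced reserve: 16MB or 1/16 of memory, whichever is smaller. */
uint64_t
reduced_dma_reserve(uint64_t total_bytes)
{
	return (min_u64(total_bytes / 16, 16 * MB));
}
```

On a 4GB machine this gives a 256MB initial reserve that is later cut down to 16MB; on a 128MB machine the figures are 32MB and 8MB.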
Matthew Dillon [Sat, 19 Nov 2011 04:57:16 +0000 (20:57 -0800)]
kernel - Revamp subr_alist and get it ready for use
* Fix numerous bugs in the bighint code.
* Add API functions to allow static initialization.
* When shortcutting chunks we still should flesh out the parent's whole
array. This makes alist_free_info() easier to implement.
* Implement alist_free_info() which provides information on the largest
trailing chunk available (with some restrictions). This is used to
chop down a large preinitialization.
* Implement an allocate-after-block feature to alist_alloc()
* Implement natural alignment and boundary handling. Allocations can only
be in powers of 2 internally with odd-sized allocations allocating the
larger size and then piecemeal-freeing the trailing portion. This also
has the effect of ensuring that the boundary and alignment will always
be the nearest greater or equal power of 2 to the allocation request size.
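The rounding described in the last item can be sketched as follows. The names are made up and this is not the subr_alist code; it only shows how an odd-sized request maps to a power-of-2 allocation plus a trailing portion to free.

```c
#include <stdint.h>

/*
 * Smallest power of 2 that is >= n.  Only valid for 1 <= n <= 2^31;
 * larger values would overflow the 32-bit shift.
 */
uint32_t
round_up_pow2(uint32_t n)
{
	uint32_t p = 1;

	while (p < n)
		p <<= 1;
	return (p);
}

/*
 * Number of trailing blocks that would be freed again after the
 * larger power-of-2 chunk has been allocated for a request of n blocks.
 */
uint32_t
trailing_blocks(uint32_t n)
{
	return (round_up_pow2(n) - n);
}
```

A request for 5 blocks would thus allocate 8 and hand back the last 3, and the effective alignment is 8.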
Matthew Dillon [Fri, 18 Nov 2011 20:03:09 +0000 (12:03 -0800)]
kernel - Fix incorrect assertion in lwkt_token_swap()
* The bounds check for the two tokens was off by one, resulting in a crash
under certain circumstances.
Matthew Dillon [Fri, 18 Nov 2011 19:51:18 +0000 (11:51 -0800)]
kernel - Fix swapcached problems when max-swap use reached (2)
* Fix bug in last commit
Matthew Dillon [Fri, 18 Nov 2011 18:48:09 +0000 (10:48 -0800)]
kernel - Fix swapcached problems when max-swap use reached
* A calculation could reverse-index the limit counter and cause
swapcached to eat an excessive amount of cpu, causing other
processes to stall.
* Fixes network problems between avalon and the dragonfly core network.
Matthew Dillon [Fri, 18 Nov 2011 16:14:16 +0000 (08:14 -0800)]
kernel - Document vm_map_lookup_entry() better in vm/vm_map.c
* Add some additional code documentation.
* Issue a required cpu_ccfence() after a variable load when checking the
vm_map_entry hint. With the map locked shared the hint can still be
updated concurrently even though the value, once loaded, will point to
a stable structure.
Matthew Dillon [Fri, 18 Nov 2011 16:10:41 +0000 (08:10 -0800)]
kernel - Adjust tlb invalidation in the x86-64 pmap code
* Use a locked bus cycle instruction to clear pte's in all cases.
* Remove unnecessary vm_page_hold() when removing a page table page pv.
The page is still wired so a hold is not needed.
* Do not issue invalidation interlocks when populating a user pte, the
invalidations issued when the user pte is removed are sufficient.
Kernel pte's still appear to need an interlock. It is unclear why
(possibly early PG_PS replacement issues).
* Revamp pmap_enter() to fix a race case which could allow PG_M to get
lost. Any protection or wiring change fully removes the pte before
loading a revised pte.
Matthew Dillon [Fri, 18 Nov 2011 16:09:11 +0000 (08:09 -0800)]
kernel - Fix marker in sysctl_kern_proc()
* The marker wasn't being marked as a marker, resulting in a
kernel panic when two or more 'ps' commands are running concurrently
and one blocks.
Matthew Dillon [Fri, 18 Nov 2011 16:08:24 +0000 (08:08 -0800)]
kernel - Cleanup and document
* Cleanup and document various bits of code.
Sepherosa Ziehau [Fri, 18 Nov 2011 09:38:45 +0000 (17:38 +0800)]
rc.d/routing: Add change_routes support
Alex Hornung [Fri, 18 Nov 2011 08:55:56 +0000 (08:55 +0000)]
tbridge(9) - add man page
Alex Hornung [Fri, 18 Nov 2011 08:26:08 +0000 (08:26 +0000)]
dfregress.8 - Add info on writing testcases
Sepherosa Ziehau [Fri, 18 Nov 2011 08:42:11 +0000 (16:42 +0800)]
netisr: Expose netmsg_sync_handler to avoid code duplication
Sascha Wildner [Fri, 18 Nov 2011 04:41:57 +0000 (05:41 +0100)]
netstat(1): Renumber the nlist[] array indices.
I overlooked this during the atalk removal.
Reported-by: ftigeot
Sascha Wildner [Thu, 17 Nov 2011 21:41:05 +0000 (22:41 +0100)]
netstat(1): Remove another unused prototype.
Sascha Wildner [Thu, 17 Nov 2011 21:28:09 +0000 (22:28 +0100)]
netgraph: Add module dependencies.
Alex Hornung [Fri, 18 Nov 2011 01:13:01 +0000 (01:13 +0000)]
dfregress,dfr2text - add man pages
Alex Hornung [Wed, 16 Nov 2011 17:07:58 +0000 (17:07 +0000)]
dfregress - misc minor fixes/ make more verbose
* man page is coming soon :)
Alex Hornung [Wed, 16 Nov 2011 10:46:08 +0000 (10:46 +0000)]
dfregress,tbridge - Move into usr.bin and sys/dev
* cleanup of the testcases, remove duplicates, consolidate in
test/testcases.
Sascha Wildner [Thu, 17 Nov 2011 20:08:36 +0000 (21:08 +0100)]
netstat(1): Remove some unused prototypes.
Matthew Dillon [Thu, 17 Nov 2011 18:44:16 +0000 (10:44 -0800)]
kernel - Fix additional races in lwp_signotify()
* lwp_signotify() was improperly scheduling threads whose td_gd is on the
local cpu without checking the SINTR flags. This can catch a thread in
the middle of being transitioned to another cpu and cause havoc.
* Only schedule the thread if the SINTR flags are set.
* We can't call setrunnable() from an IPI so adjustments have to be made
in the remote cpu to set the lp's lwp_stat state before issuing the IPI
and only do the scheduling of its thread from the IPI function.
Reported-by: ftigeot
Matthew Dillon [Thu, 17 Nov 2011 17:17:51 +0000 (09:17 -0800)]
kernel - more procfs work
* uiomove_frombuf() takes care of indexing uio_offset and checking its
range for us so we don't have to do it ourselves, clean up use cases
in procfs.
* Generate somewhat more consistent text output for /proc/<pid>/map by
formatting the map entry range with static widths.
* ps_nargvstr is a signed number, do a better range check on it.
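The point about the signed count can be shown with a small check. MAX_ARGV_STRINGS is a hypothetical bound chosen for this sketch; the actual limit used by procfs is not stated in the commit message.

```c
#include <stdbool.h>

/* Hypothetical upper bound, for illustration only. */
#define MAX_ARGV_STRINGS	4096

/*
 * A signed count can be negative, so checking only the upper bound is
 * not enough; both ends of the range have to be tested.
 */
bool
nargv_in_range(int nargvstr)
{
	return (nargvstr >= 0 && nargvstr <= MAX_ARGV_STRINGS);
}
```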
Matthew Dillon [Thu, 17 Nov 2011 17:04:53 +0000 (09:04 -0800)]
kernel - Fix ps/thread-exit and other related ps races
* Adjust sysctl_kern_proc()'s kernel thread scanning code to use a marker
instead of depending on td remaining on its proper list. Otherwise
blocking conditions can rip td out from under us or move it to another
cpu, potentially resulting in a crash or livelock. Index the scan
backwards to avoid live-locking continuous adds to the list.
* Fix a potential race in the zombie removal code vs a ps, p->p_token was
being released too early.
* Adjust lwkt_exit() to wait for the thread's hold count to drop to zero
so lwkt_hold() works as advertised.
Sepherosa Ziehau [Thu, 17 Nov 2011 13:42:29 +0000 (21:42 +0800)]
sendfile: Use asynchronized pru_send whenever possible
On Phenom 9550 (4 core, 2.2GHz):
8 parallel netperf -H 127.0.0.1 -t TCP_SENDFILE -P0 (4 runs, unit: Mbps)
old 10509.48 12364.60 11930.55 11104.94
new 21031.34 20165.39 19888.42 19896.47
This gives a 70% ~ 90% performance improvement.
Sepherosa Ziehau [Thu, 17 Nov 2011 12:01:17 +0000 (20:01 +0800)]
protosw: Add PR_ASYNC_SEND, mainly to make sure async pru_send is supported
Currently only IP/TCP and IPv6/TCP set this flag.
Venkatesh Srinivas [Thu, 17 Nov 2011 02:30:11 +0000 (18:30 -0800)]
kernel -- vkernel64's trap_pfault should use VM_FAULT_BURST for usermode faults.
From x86-64 trap_pfault.
Venkatesh Srinivas [Thu, 17 Nov 2011 02:10:41 +0000 (18:10 -0800)]
Merge branch 'master' of /repository/git/dragonfly
Venkatesh Srinivas [Thu, 17 Nov 2011 02:07:47 +0000 (18:07 -0800)]
kernel -- token: Two shared token DEBUG_LOCKS tests.
* New warning when a pool token is taken in shared mode.
* KASSERT when trying to take an exclusive token with that token
already held shared.
Both tests are only active under DEBUG_LOCKS.
Matthew Dillon [Thu, 17 Nov 2011 01:56:39 +0000 (17:56 -0800)]
systat - unsigned expansion to properly display >= 2G values on 32 bit boxes
* Rename putlong() to put64(), and have it take an intmax_t argument
instead of a long.
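Why the wider type matters can be seen with a small formatter. format64 is a made-up helper in the spirit of put64(), not systat's code; it relies on the standard "%jd" conversion for intmax_t.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter; not the systat implementation.
 * Writes val right-justified in a field of at least 'width' characters
 * and returns the number of characters snprintf() would have written.
 * With a 32-bit long, a value such as 3000000000 would not fit; an
 * intmax_t argument carries it without truncation.
 */
int
format64(char *buf, size_t len, intmax_t val, int width)
{
	return (snprintf(buf, len, "%*jd", width, val));
}
```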
Matthew Dillon [Wed, 16 Nov 2011 22:59:34 +0000 (14:59 -0800)]
kernel - Move VM objects from pool tokens to per-vm-object tokens
* Move VM objects from pool tokens to per-vm-object tokens.
* This fixes booting issues on i386 with vm.shared_fault=1 (pool
tokens would sometimes coincide with the token used for kernel_object
which causes problems on i386 due to the pmap code's use of
kernel_map/kernel_object).
Matthew Dillon [Wed, 16 Nov 2011 20:29:20 +0000 (12:29 -0800)]
kernel - Try to fix procfs readdir race
* procfs_allocvp() may have a pfs/vnode race which the vget() may not
completely address. For now make sure we can't race a vnode teardown
when attempting to acquire a vnode with vget().
Matthew Dillon [Wed, 16 Nov 2011 19:51:29 +0000 (11:51 -0800)]
kernel - Do not use shared tokens for kernel_map
* This primarily handles a case where i386 systems can deadlock on a
shared token -> exclusive token sequence during a page fault, because
the i386 pmap code uses kernel_object to manage page table pages.
x86-64 page fault code does not but for now just make the change globally.
* Should not affect performance
* Change the default for vm.shared_fault back to 1.
Reported-by: ejc
Submitted-by: vsrinivas
Matthew Dillon [Wed, 16 Nov 2011 18:58:32 +0000 (10:58 -0800)]
kernel - Fix bug in procfs_ioctl()
* needed pfs_pfind() instead of pfind().
Reported-by: ftigeot
Sascha Wildner [Wed, 16 Nov 2011 18:03:23 +0000 (19:03 +0100)]
<sys/socket.h>: Bring back PF_APPLETALK too, to unbreak building lang/ruby18.
Reported-by: Eric J. Christeson <eric.j.christeson@gmail.com>
Matthew Dillon [Wed, 16 Nov 2011 17:12:58 +0000 (09:12 -0800)]
kernel - Do not call pmap_enter() in vm_fault_page*()
* Do not call pmap_enter() from vm_fault_page*(). This function can be
called from foreign pmap contexts and thus the current cpu's bit may
not be set in the target pmap cpumask. Any pmap_enter() operation will
thus not properly synchronize with other users of the pmap (particularly
other foreign users).
* In addition, for callers of the umtx*() function calling pmap_enter()
is inefficient as the correct page might already be faulted in. Now
because we are no longer updating the page in the pmap an older page
may still exist in the pmap (mapped read-only as it was originally COW).
This page may no longer be correct because the umtx*() functions
modify the contents of the page returned by vm_fault_page() without
necessarily mapping it. So to keep the user visibility into the memory
correct we unmap the old page when vm_fault_page() has to do a COW.
This is slightly more burdensome for fork() but far less burdensome
for the umtx system calls and also allows procfs_memrw to work properly.
* procfs uses vm_fault_page*() to access command line arguments for
any process and umtx*() uses it to access the memory page the umtx
is operating in. Relative to procfs the user process pmap is foreign
(i.e. the current cpu's bit is not set in its pm_active) and cannot
be properly updated via a vm_fault_page*() from procfs anyway, so the
above new behavior for vm_fault_page*() is even more correct for
procfs use cases.