Hasso Tepper [Sat, 20 Dec 2008 01:53:58 +0000 (03:53 +0200)]
Merge commit 'crater/vendor/OPENPAM'
YONETANI Tomokazu [Fri, 19 Dec 2008 14:21:12 +0000 (23:21 +0900)]
Adjust VFS_SET() to deal with the change to struct vfsconf
Matthew Dillon [Fri, 19 Dec 2008 04:20:15 +0000 (20:20 -0800)]
Close a possible bug where the p_lock for a new process inherits a
non-zero value from its parent on fork(), preventing the process
from being able to exit later on.
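A minimal sketch of the kind of fix described above, assuming a simplified process record. The struct, field and function names are hypothetical and are not the kernel's own; hold_count merely stands in for p_lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified process record; not the kernel's struct proc. */
struct proc_sketch {
    int pid;
    int hold_count;     /* stands in for p_lock */
};

/*
 * Copy the parent into the child, then reset the field that must not be
 * inherited. A non-zero hold count in the child would look like an
 * outstanding hold that nobody will ever release.
 */
static void
fork_sketch(const struct proc_sketch *parent, struct proc_sketch *child,
    int child_pid)
{
    assert(parent != NULL && child != NULL);
    *child = *parent;
    child->pid = child_pid;
    child->hold_count = 0;
}
```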
Matthew Dillon [Thu, 18 Dec 2008 21:45:27 +0000 (13:45 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Thu, 18 Dec 2008 21:37:37 +0000 (13:37 -0800)]
This is a MAJOR rewrite of usched_bsd4 and related support logic, plus
additional improvements to the LWKT scheduler.
* The LWKT scheduler used to run a user thread not needing the MP lock
if it was unable to run a kernel thread that did need it, due to some
other cpu holding the lock. This created a massive priority inversion.
LWKT no longer does this. It will happily run other MPSAFE kernel
threads but as long as kernel threads exist which need the MP lock
LWKT will no longer switch to a user mode thread.
Add a new sysctl lwkt.chain_mplock which defaults to 0 (off). If set
to 1 LWKT will attempt to use IPIs to notify conflicting cpus when the
MP lock is available and will also allow user mode threads to run if
kernel threads are present needing the MP lock (but unable to get it).
NOTE: Currently, turning on this feature results in reduced performance,
though not as bad as pre-patch.
* The main control logic USCHED_BSD4 was almost completely rewritten,
greatly improving interactivity in the face of cpu bound programs
such as compiles.
USCHED_BSD4 no longer needs to use the scheduler helper when the
system is under load. The scheduler helper is only used to allow
one cpu to kick another idle cpu when additional processes are
present.
USCHED_BSD4 now takes great advantage of the scheduler's cpu-local
design and uses a bidding algorithm for processes trying to return
to user mode to determine which one is the best. Winners simply
deschedule losers, and since the loser is clearly not running when
the winner does this the descheduling operation is ultra simple to
accomplish.
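The bidding idea can be illustrated roughly as follows. This is only a sketch under assumptions: the types and the place_bid() helper are hypothetical, a lower prio value is assumed to mean a better priority, and the real scheduler's data structures are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread record that bids for the right to run in user mode. */
struct bidder {
    int id;
    int prio;               /* assumption: lower value = better priority */
};

/* Hypothetical per-cpu slot holding the current winner, if any. */
struct cpu_slot {
    struct bidder *winner;  /* NULL when no user thread owns the cpu */
};

/*
 * Place a bid on a cpu-local slot. Returns the loser that has to be
 * descheduled (the previous winner, or the bidder itself if it lost),
 * or NULL if the slot was free.
 */
static struct bidder *
place_bid(struct cpu_slot *slot, struct bidder *b)
{
    struct bidder *old;

    assert(slot != NULL && b != NULL);
    old = slot->winner;
    if (old == NULL) {
        slot->winner = b;
        return NULL;
    }
    if (b->prio < old->prio) {
        slot->winner = b;   /* new winner displaces the old one */
        return old;
    }
    return b;               /* the bidder lost; the winner is unchanged */
}
```

Because the slot is cpu-local in this sketch, no locking is shown; that mirrors the commit's remark that the loser is not running when the winner deschedules it.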
Matthew Dillon [Thu, 18 Dec 2008 21:27:20 +0000 (13:27 -0800)]
This is a major revamping of the pageout and low-memory handling code.
The pageout daemon now detects out-of-memory conditions and properly
kills the largest process(es). This condition occurs when swap is
full (or you have no swap) and most of the remaining VM pages in memory
have become dirty. With no swap to page to, the dirty pages squeeze out
the clean ones. The pageout daemon detects the case and starts killing
processes.
The pageout daemon now detects stress in the form of excess cpu use
and tries to reduce its cpu footprint when that occurs. Excess cpu use
can occur when the only pages left in-core are dirty and there is nowhere
to swap them to. Previously if this case occurred the system would basically
just stop working.
These changes make the system truly have VM = RAM+SWAP. If you have 1G of ram
and 1G of swap the system can run up to 2G worth of processes.
Matthew Dillon [Thu, 18 Dec 2008 21:18:29 +0000 (13:18 -0800)]
vnode_pager_haspage() could return TRUE but leave *before and *after
uninitialized, causing vm_fault's burst pagein feature to panic the system.
vm_fault was almost never using its burst pagein feature due to incorrect
test logic. The burst pagein code itself was also seriously buggy, so it
is fortunate the test logic was broken.
Rewrite the broken test logic and fix the bugs in the burst pagein code.
Add a new sysctl vm.burst_fault, defaulting to 0 (disabled). The default
will be changed to 1 in a week or two.
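The uninitialized out-parameter problem can be sketched like this. The haspage_sketch() function below is a hypothetical stand-in over a flat page range, not the real vnode_pager_haspage(); it only shows the pattern of setting *before and *after on every return path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a haspage-style query: reports whether page
 * pindex is backed, and how many contiguous pages exist before and
 * after it within an object of npages pages.
 */
static bool
haspage_sketch(int pindex, int npages, int *before, int *after)
{
    assert(npages >= 0);

    /* Give the out-parameters a defined value on every return path. */
    if (before != NULL)
        *before = 0;
    if (after != NULL)
        *after = 0;

    if (pindex < 0 || pindex >= npages)
        return false;

    if (before != NULL)
        *before = pindex;
    if (after != NULL)
        *after = npages - pindex - 1;
    return true;
}
```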
Sascha Wildner [Thu, 18 Dec 2008 11:20:34 +0000 (12:20 +0100)]
Really fix indent.
Matthias Schmidt [Thu, 18 Dec 2008 11:09:09 +0000 (12:09 +0100)]
Apply GNU style indent
Matthias Schmidt [Thu, 18 Dec 2008 10:56:46 +0000 (11:56 +0100)]
Fix annoying bug with grep and HAMMER
grep foo * on an UFS partition was silent if grep hit a subdirectory. If
executed on HAMMER, grep complains about "Invalid argument" because directories
in HAMMER are not treated as files.
Before:
cd /usr/src
grep test *
Makefile: test \
grep: cat: Invalid argument
[...]
After:
grep test *
Makefile: test \
Hasso Tepper [Thu, 18 Dec 2008 08:47:32 +0000 (10:47 +0200)]
Import OpenPAM Hydrangea.
Hasso Tepper [Thu, 18 Dec 2008 03:03:47 +0000 (05:03 +0200)]
Call ata_legacy() only once on attach and save its result.
Scanning PCI configuration registers (which are not going to change) on
every interrupt looks expensive, especially when interrupt is shared.
Profiling (in FreeBSD) shows 3% of time spent by atapci0 on pure network
load due to IRQ sharing with em0.
Obtained-from: FreeBSD
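A rough sketch of the caching pattern, assuming a simplified controller structure. All names here are hypothetical; probe_legacy_sketch() stands in for an expensive scan of configuration registers, and the counter exists only to make the number of probes visible.

```c
#include <assert.h>
#include <stddef.h>

static int probe_calls;     /* counts how often the expensive probe runs */

/* Hypothetical stand-in for an expensive configuration register scan. */
static int
probe_legacy_sketch(void)
{
    probe_calls++;
    return 1;
}

/* Hypothetical controller state with the probe result cached in it. */
struct ctlr_sketch {
    int legacy;
};

/* Probe once at attach time and remember the answer. */
static void
ctlr_attach_sketch(struct ctlr_sketch *c)
{
    assert(c != NULL);
    c->legacy = probe_legacy_sketch();
}

/* The interrupt path only reads the cached value. */
static int
ctlr_intr_sketch(const struct ctlr_sketch *c)
{
    return c->legacy;
}
```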
Joe Talbott [Thu, 18 Dec 2008 02:18:48 +0000 (21:18 -0500)]
Merge branch 'master' of /usr/git/dragonfly
Joe Talbott [Thu, 18 Dec 2008 02:13:40 +0000 (21:13 -0500)]
Page align boundaries for kvm_access_check().
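Page aligning a range usually means rounding the start down and the end up to a page boundary. The sketch below assumes a 4096-byte page and uses hypothetical names; it is not the code from kvm_access_check() and it ignores overflow at the very top of the address space.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; must be a power of two. */
#define PAGE_SIZE_SKETCH 4096u
#define PAGE_MASK_SKETCH ((uintptr_t)PAGE_SIZE_SKETCH - 1)

/*
 * Widen [*start, *end) so that both boundaries fall on page boundaries:
 * the start is rounded down and the end is rounded up.
 */
static void
page_align_range(uintptr_t *start, uintptr_t *end)
{
    assert(*start <= *end);
    *start = *start & ~PAGE_MASK_SKETCH;
    *end = (*end + PAGE_MASK_SKETCH) & ~PAGE_MASK_SKETCH;
}
```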
Matthew Dillon [Thu, 18 Dec 2008 01:07:33 +0000 (17:07 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Thu, 18 Dec 2008 01:02:38 +0000 (17:02 -0800)]
Fix bugs in dealing with low-memory situations when the system has run out
of swap or has no swap.
* Fix an error where the system started killing processes before it needed
to.
* Continue propagating pages from the active queue to the inactive queue
when the system has run out of swap or has no swap, even though the
inactive queue has become bloated. This occurs because the inactive
queue may be unable to drain due to an excess of dirty pages which
cannot be swapped out.
* Use the active queue to detect excessive stress which combined with
an out-of-swap or no-swap situation means the system has run out of
memory. THEN start killing processes.
* This also allows the system to recycle nearly all the clean pages
available when it has no swap space left, to try to keep things going,
leaving only dirty pages in the VM page queues.
Michael Neumann [Wed, 17 Dec 2008 18:40:12 +0000 (18:40 +0000)]
Unbreak buildworld
Michael Neumann [Wed, 17 Dec 2008 17:21:05 +0000 (17:21 +0000)]
Merge branch 'vfsconf'
Michael Neumann [Wed, 17 Dec 2008 17:10:58 +0000 (17:10 +0000)]
Clean up a bit
Michael Neumann [Wed, 17 Dec 2008 15:50:04 +0000 (15:50 +0000)]
Merge branch 'vfsconf'
Michael Neumann [Wed, 17 Dec 2008 15:44:42 +0000 (15:44 +0000)]
Refactor filesystem types list and fix bug.
Refactor the management of the filesystem types list (vfsconf) by
introducing some management functions. Reduce inter-module coupling.
This actually fixes a potential "bug" in vfs_register() which does not
compare the new VFS to register with the last entry from the list, i.e.
two (or more) sequential vfs_register() calls with the same argument
would succeed.
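The class of bug described above, a duplicate check that never looks at the last list entry, can be avoided by walking the list with a pointer-to-pointer so that every entry is compared before appending. This is a sketch with hypothetical names, not the actual vfs_register() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filesystem type entry in a singly linked list. */
struct fstype_sketch {
    const char *name;
    struct fstype_sketch *next;
};

/*
 * Append ent to the list unless an entry with the same name exists.
 * The loop only stops at the terminating NULL link, so the last entry
 * is compared as well. Returns 0 on success, -1 on a duplicate.
 */
static int
fstype_register_sketch(struct fstype_sketch **head, struct fstype_sketch *ent)
{
    struct fstype_sketch **pp;

    assert(head != NULL && ent != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->name, ent->name) == 0)
            return -1;
    }
    ent->next = NULL;
    *pp = ent;
    return 0;
}
```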
Sepherosa Ziehau [Wed, 17 Dec 2008 13:23:16 +0000 (21:23 +0800)]
Symbol TX desc is only used by 64bits TX chain format.
Sepherosa Ziehau [Wed, 17 Dec 2008 12:30:46 +0000 (20:30 +0800)]
Always free the passed in mbuf if jme_encap() failed.
Non-EFBIG error probably will not recover.
Sepherosa Ziehau [Wed, 17 Dec 2008 12:12:34 +0000 (20:12 +0800)]
Don't use magic number
Sepherosa Ziehau [Wed, 17 Dec 2008 12:10:05 +0000 (20:10 +0800)]
Rename ring_cnt to desc_cnt.
ring_cnt may be used when RSS (multi rx ring) support is experimented with.
Sepherosa Ziehau [Wed, 17 Dec 2008 11:52:28 +0000 (19:52 +0800)]
Tunable number of RX/TX descs
Sepherosa Ziehau [Wed, 17 Dec 2008 10:38:10 +0000 (18:38 +0800)]
Remove unused macros
Matthew Dillon [Tue, 16 Dec 2008 17:54:38 +0000 (09:54 -0800)]
LIST_FOREACH_MUTABLE() was tracking processes not held with PHOLD().
Use a normal LIST_FOREACH() instead because the main iterator is being
protected by PHOLD().
Matthew Dillon [Tue, 16 Dec 2008 17:53:32 +0000 (09:53 -0800)]
Assert that nobody holds the process referenced with PHOLD() in
exit.
Matthew Dillon [Tue, 16 Dec 2008 17:49:35 +0000 (09:49 -0800)]
Add missing range checks to sopt_valsize for the linux emulated
setsockopt().
Matthew Dillon [Tue, 16 Dec 2008 17:46:57 +0000 (09:46 -0800)]
Due to races clean blocks can remain cached, so remove a conditional from
the ffs_truncate3 panic.
Reported-by: sorry, I forgot.
Matthew Dillon [Tue, 16 Dec 2008 17:46:10 +0000 (09:46 -0800)]
Cleanup and enhance the vnodeinfo output.
Matthew Dillon [Tue, 16 Dec 2008 17:44:57 +0000 (09:44 -0800)]
Add a socketpair performance tester.
Matthew Dillon [Tue, 16 Dec 2008 17:40:20 +0000 (09:40 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Tue, 16 Dec 2008 17:31:42 +0000 (09:31 -0800)]
Make __fpending() take a const argument.
Sepherosa Ziehau [Tue, 16 Dec 2008 14:26:58 +0000 (22:26 +0800)]
Replace libpcap's pcap-bpf.h with system's net/bpf.h
Sepherosa Ziehau [Tue, 16 Dec 2008 13:40:42 +0000 (21:40 +0800)]
Remove tcpcb.tt_msg == NULL tests in tcp_callout_*().
tcpcb.tt_msg == NULL could only happen for TCP listen sockets, while
for this kind of sockets, tcp timers should never be used.
Suggested-by: dillon@
Sepherosa Ziehau [Mon, 15 Dec 2008 14:29:35 +0000 (22:29 +0800)]
Restore the semantic of callout_active() testing on tcp timers.
Originally there is no time gap between the running of the tcp timer
handler and the deactivation of the tcp timer callout, but the message
based tcp timer has a time gap in between these two actions. This
time gap affects the code path which depends on the current state of
the tcp timer, i.e. return value of callout_active(tcp_timer). To
close this time gap, we take the pending and running tcp timer tasks
into consideration when testing the current state of the tcp timer.
Reviewed-by: dillon@
Michael Neumann [Mon, 15 Dec 2008 23:11:24 +0000 (23:11 +0000)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Michael Neumann [Mon, 15 Dec 2008 23:09:33 +0000 (23:09 +0000)]
Fix typo
Peter Avalos [Mon, 15 Dec 2008 18:39:51 +0000 (10:39 -0800)]
Rename a local function that conflicts with one in librpcsvc.
Michael Neumann [Mon, 15 Dec 2008 17:45:26 +0000 (17:45 +0000)]
Remove superfluous shadow variable declaration
Michael Neumann [Mon, 15 Dec 2008 17:42:35 +0000 (17:42 +0000)]
Cosmetic changes (move assignment out of "if")
Michael Neumann [Mon, 15 Dec 2008 15:41:00 +0000 (15:41 +0000)]
Fix typo (currenet -> current)
Michael Neumann [Mon, 15 Dec 2008 00:46:58 +0000 (00:46 +0000)]
Remove unnecessary optimization.
The optimization (>> 9) will be automatically performed by the compiler.
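The commit does not show the code involved. Assuming the removed shift replaced an unsigned division by 512 (2^9, a common sector size), the two forms are equivalent and the plain division states the intent more clearly. The names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed divisor: 512 == 1 << 9. */
#define SECTOR_SIZE_SKETCH 512u

/* Written as a plain division; compilers typically emit a shift here. */
static uint64_t
bytes_to_sectors(uint64_t nbytes)
{
    return nbytes / SECTOR_SIZE_SKETCH;
}

/* For unsigned values the division and the shift give the same result. */
static void
selfcheck_sketch(uint64_t nbytes)
{
    assert(bytes_to_sectors(nbytes) == (nbytes >> 9));
}
```

The equivalence shown here is only claimed for unsigned operands; for negative signed values a division and an arithmetic shift round differently.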
Sascha Wildner [Sun, 14 Dec 2008 17:56:01 +0000 (18:56 +0100)]
Remove some unnecessary casts.
Sascha Wildner [Sun, 14 Dec 2008 17:53:02 +0000 (18:53 +0100)]
Adjust the bt_gethostbyaddr(3) prototype to match gethostbyaddr(3).
Sascha Wildner [Sun, 14 Dec 2008 17:50:21 +0000 (18:50 +0100)]
Bring the prototype of gethostbyaddr(3) in line with the standard.
http://www.opengroup.org/onlinepubs/009695399/functions/gethostbyname.html
Sepherosa Ziehau [Sun, 14 Dec 2008 14:27:33 +0000 (22:27 +0800)]
Use priority message for TCP timers
Sepherosa Ziehau [Sun, 14 Dec 2008 10:47:31 +0000 (18:47 +0800)]
White space cleanup
Sepherosa Ziehau [Sun, 14 Dec 2008 10:14:37 +0000 (18:14 +0800)]
Use 32bit TX chain format on the platform or NIC which only supports
32bit DMA buffer operation.
Sepherosa Ziehau [Sun, 14 Dec 2008 06:30:08 +0000 (14:30 +0800)]
Split net/dlt.h out of net/bpf.h
Idea-from: NetBSD
Sepherosa Ziehau [Sun, 14 Dec 2008 05:37:44 +0000 (13:37 +0800)]
Pull in libpcap's new DLTs
This commit targets using system net/bpf.h in libpcap instead of
libpcap's own pcap-bpf.h; mainly to make following code compile:
#include <pcap.h>
#include <net/bpf.h>
Reported-by: hasso@
Note:
As of this commit DLT_PFSYNC is changed to 18. The original define
conflicts with libpcap's define, and it looks like libpcap, NetBSD and
OpenBSD all use 18 as DLT_PFSYNC. This may introduce a compat issue
with dump files generated using DLT_PFSYNC before this change.
Sascha Wildner [Sat, 13 Dec 2008 08:49:26 +0000 (09:49 +0100)]
Sync ciss(4) with FreeBSD's RELENG_4 branch.
This adds support for many Smart Array controllers, fixes a couple of
bugs and cleans up some whitespace issues.
Note that hot-plugging does not seem to work yet.
Many thanks to Archimedes Gaviola <archimedes.gaviola@gmail.com> who
tested the patch on a HP ProLiant DL380 (Smart Array P400).
Taken-from: FreeBSD
Sascha Wildner [Fri, 12 Dec 2008 23:21:50 +0000 (00:21 +0100)]
Add ftw(), nftw(), associated header files and documentation.
It seems that security/prelude-manager in pkgsrc actually needs it.
Taken-from: FreeBSD
Tested-by: Rumko <rumcic@gmail.com>
Sascha Wildner [Fri, 12 Dec 2008 09:20:06 +0000 (10:20 +0100)]
Merge branch 'misc'
Sascha Wildner [Fri, 12 Dec 2008 09:16:38 +0000 (10:16 +0100)]
Silence warnings and rearrange the includes a bit.
Matthew Dillon [Wed, 10 Dec 2008 23:42:40 +0000 (15:42 -0800)]
Use per-mount kmalloc pools for bulk data structures, particularly inodes
and records.
Matthew Dillon [Wed, 10 Dec 2008 19:30:12 +0000 (11:30 -0800)]
Add kmalloc_create() and kmalloc_destroy(), an API to dynamically create and
destroy kmalloc pools.
Matthew Dillon [Wed, 10 Dec 2008 18:30:26 +0000 (10:30 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Wed, 10 Dec 2008 18:27:32 +0000 (10:27 -0800)]
Fix a buffer cache deadlock which can occur when simulated disk devices
(VN) are backed by a HAMMER file. Do not call bwillwrite() via the
VN->VOP_WRITE->HAMMER path.
Add a new IO_RECURSE flag to identify paths for which bwillwrite() should
never be called.
Reported-by: Rumko <rumcic@gmail.com>
Sepherosa Ziehau [Tue, 9 Dec 2008 12:05:19 +0000 (20:05 +0800)]
Don't enable hardware timer simulated interrupt moderation on 8139C+.
This at least unbreaks the qemu support.
Reported-by: walt <wa1ter@myrealbox.com>
Matthew Dillon [Mon, 8 Dec 2008 04:01:17 +0000 (20:01 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Mon, 8 Dec 2008 03:55:28 +0000 (19:55 -0800)]
Fix seg-fault in recent 'hammer cleanup' utility work.
Reported-by: Rumko <rumcic@gmail.com>
Sascha Wildner [Sun, 7 Dec 2008 15:22:55 +0000 (16:22 +0100)]
Add missing section number.
Sascha Wildner [Sun, 7 Dec 2008 01:34:16 +0000 (02:34 +0100)]
Improve markup.
Sepherosa Ziehau [Sat, 6 Dec 2008 09:10:24 +0000 (17:10 +0800)]
Constify the tip/sip parameters of arprequest/arprequest_async
Sascha Wildner [Fri, 5 Dec 2008 21:53:51 +0000 (22:53 +0100)]
The CVS ID isn't needed anymore.
Sepherosa Ziehau [Fri, 5 Dec 2008 14:09:51 +0000 (22:09 +0800)]
Deprecate MALLOC/FREE macros
Sepherosa Ziehau [Fri, 5 Dec 2008 13:59:44 +0000 (21:59 +0800)]
Correct the reversed comparison logic
Sepherosa Ziehau [Sun, 30 Nov 2008 08:56:16 +0000 (16:56 +0800)]
Defer various TCP timer functions from callout threads to TCP threads; mainly
to avoid possible threading races, e.g. when output processing blocks the
current thread. To save space, only one netmsg is used and is embedded in
tcpcb. The timer functions needed to be carried out are indicated by the
tasks field in the netmsg.
Reviewed-by: dillon@
Discussed-with: dillon@
With-input-from: hsu@
Tested-by: hasso@
Sepherosa Ziehau [Thu, 4 Dec 2008 04:11:25 +0000 (12:11 +0800)]
Don't set max read request size on 10/100 only PCIe NICs.
Setting max read request size to a non-default value seems to cause the
DMA engine on these kinds of NICs to malfunction.
Reported-by: josepht@
Tested-by: josepht@
Based-on-patch-from: josepht@
Hasso Tepper [Thu, 4 Dec 2008 23:21:51 +0000 (01:21 +0200)]
Make CIDR in "-net net gw mask" form work again.
Hasso Tepper [Wed, 3 Dec 2008 09:14:54 +0000 (11:14 +0200)]
Sync termcap with FreeBSD preserving local changes.
Fixes many bugs with TERM=xterm and screen.
Obtained-from: FreeBSD
Matthias Schmidt [Thu, 4 Dec 2008 14:10:23 +0000 (15:10 +0100)]
Update units(1) to allow conversion between degC and degF
This patch (in modified form from FreeBSD) rested for some time on
my disk, so I'll throw it in. It allows conversion between Celsius
and Fahrenheit:
You have: 20 degC
You want: degF
68
Reminded-by: corecode@
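Unlike most unit conversions, degC to degF needs an offset as well as a scale factor. A small sketch of the arithmetic behind the example above (the function names are hypothetical and this is not the units(1) implementation):

```c
#include <assert.h>

/* Celsius to Fahrenheit: scale by 9/5, then add the 32 degree offset. */
static double
degc_to_degf(double c)
{
    assert(c >= -273.15);   /* reject values below absolute zero */
    return c * 9.0 / 5.0 + 32.0;
}

/* Fahrenheit to Celsius: remove the offset first, then scale by 5/9. */
static double
degf_to_degc(double f)
{
    assert(f >= -459.67);   /* reject values below absolute zero */
    return (f - 32.0) * 5.0 / 9.0;
}
```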
Sepherosa Ziehau [Thu, 4 Dec 2008 10:35:13 +0000 (18:35 +0800)]
Add CARP_IS_RUNNING() to test carp(4) iface's IFF_UP and IFF_RUNNING
Sepherosa Ziehau [Thu, 4 Dec 2008 10:14:29 +0000 (18:14 +0800)]
Remove unused macro
Sepherosa Ziehau [Thu, 4 Dec 2008 10:11:05 +0000 (18:11 +0800)]
Merge clearing IFF_UP and IFF_RUNNING
Sepherosa Ziehau [Thu, 4 Dec 2008 10:05:51 +0000 (18:05 +0800)]
Use in_cksum_{range,skip}()
This is mainly used to avoid following code sequence:
m->m_data += skip;
in_cksum(m, len);
m->m_data -= skip;
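The benefit of a skip-style interface is that the caller never has to modify and then restore the buffer pointer. The sketch below shows the idea on a flat byte buffer with a hypothetical cksum_skip_sketch() helper; it is not the kernel's mbuf-based in_cksum_skip(), and here len is assumed to be the total buffer length including the skipped bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Internet-style (ones' complement) checksum over buf[skip..len).
 * The buffer is taken as const, so nothing needs to be adjusted and
 * restored around the call.
 */
static uint16_t
cksum_skip_sketch(const uint8_t *buf, size_t len, size_t skip)
{
    uint32_t sum = 0;
    size_t i;

    assert(buf != NULL && skip <= len);

    /* Sum 16-bit big-endian words. */
    for (i = skip; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

    /* Pad a trailing odd byte with zero. */
    if (i < len)
        sum += (uint32_t)buf[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```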
Simon Schubert [Wed, 3 Dec 2008 23:20:05 +0000 (00:20 +0100)]
Don't drag the host CCVER into the release build
nrelease was defaulting WORLD_CCVER, which is passed as CCVER to buildworld
and KERNEL_CCVER, which is passed as CCVER to buildkernel, to CCVER. However
the system makefiles set CCVER themselves, thus dragging their idea of the
default CCVER into the release build.
This commit should fix snapshots being built with gcc34 on chlamydia running 1.8-REL.
Sepherosa Ziehau [Wed, 3 Dec 2008 13:18:59 +0000 (21:18 +0800)]
Rework carp_input()
- Use ip header length passed in
- Calculate minimal CARP packet size only once
- Nuke redundant mbuf length check and m_pullup()
- Use in_cksum_skip()
- Add comment
- Keep log message consistent with OpenBSD
Sepherosa Ziehau [Wed, 3 Dec 2008 11:48:01 +0000 (19:48 +0800)]
Use suser_cred()
Sepherosa Ziehau [Wed, 3 Dec 2008 11:41:36 +0000 (19:41 +0800)]
Embed ifnet in carp_softc; ifnet allocation is never adopted.
This commit fixes the memory leak when destroying a carp(4) iface.
Sepherosa Ziehau [Tue, 2 Dec 2008 15:08:19 +0000 (23:08 +0800)]
Clean up and style changes
- Break long lines
- Strip/Add blank lines
- White space
Sepherosa Ziehau [Tue, 2 Dec 2008 14:29:43 +0000 (22:29 +0800)]
Nuke lock remainders and related comment
Sepherosa Ziehau [Tue, 2 Dec 2008 14:18:06 +0000 (22:18 +0800)]
Staticize
Sepherosa Ziehau [Tue, 2 Dec 2008 14:10:00 +0000 (22:10 +0800)]
Drop locking in carp(4). carp(4) is under explicit BGL currently; we could
find other MP approach for it, but not locking.
Sepherosa Ziehau [Tue, 2 Dec 2008 12:44:08 +0000 (20:44 +0800)]
- Regroup type declaration
- White space cleanup
- DragonFly has ifnet_detach_event not ifnet_departure_event
- Staticize carp_cloner
Sepherosa Ziehau [Tue, 2 Dec 2008 12:21:42 +0000 (20:21 +0800)]
Cleanup header inclusion
Sepherosa Ziehau [Tue, 2 Dec 2008 12:08:30 +0000 (20:08 +0800)]
White space
Sepherosa Ziehau [Tue, 2 Dec 2008 12:02:55 +0000 (20:02 +0800)]
u_int{8,16,32,64}_t -> uint{8,16,32,64}_t
Sascha Wildner [Wed, 3 Dec 2008 09:23:28 +0000 (10:23 +0100)]
test some more
Sascha Wildner [Wed, 3 Dec 2008 09:11:37 +0000 (10:11 +0100)]
Merge branch 'ciss'
Sascha Wildner [Wed, 3 Dec 2008 09:11:13 +0000 (10:11 +0100)]
test
Hasso Tepper [Wed, 3 Dec 2008 08:26:04 +0000 (10:26 +0200)]
Testing cherry-pick from local branch & push.
Sascha Wildner [Wed, 3 Dec 2008 05:00:06 +0000 (06:00 +0100)]
Merge branch 'misc'
Sascha Wildner [Wed, 3 Dec 2008 04:59:54 +0000 (05:59 +0100)]
kqueue support has been added to HAMMER.
Sascha Wildner [Wed, 3 Dec 2008 04:53:52 +0000 (05:53 +0100)]
Merge branch 'misc'
Sascha Wildner [Wed, 3 Dec 2008 04:53:20 +0000 (05:53 +0100)]
Add a UFS(5) MLINK and reference it from various places.
Simon Schubert [Wed, 3 Dec 2008 03:42:24 +0000 (04:42 +0100)]
Roll DragonFly 2.1.1