Sepherosa Ziehau [Fri, 26 Dec 2008 11:14:19 +0000 (19:14 +0800)]
Rework carp(4) IPv4 support.
Generic layer changes:
- Pass more detailed information to ifaddr_event handler.
o The ifaddr which triggers the event is passed in
o The action (add/delete/change) performed upon the ifaddr is
passed in
- Add ifa_prflags field in ifaddr_container. This field should
be used to hold protocol specific flags. For inet addresses,
IA_PRF_RTEXISTOK is defined to ignore rtinit() EEXIST error in
in_ifinit().
carp(4) changes:
- Add virtual address struct, which holds corresponding carp(4)
inet address and backing address of a "real" interface (backing
interface).
- The list holding virtual address struct is sorted. This is
mainly used to fix the bug in following case:
host1:
ifconfig carp0 192.168.5.1
ifconfig carp0 alias 192.168.5.2
host2:
ifconfig carp0 192.168.5.2
ifconfig carp0 alias 192.168.5.1
Before this change, the inet addresses sha1 calculated for these
two hosts will be different, thus CARP fails.
Based-on: OpenBSD
- Allow inet addresses to be added to carp(4) interface, even if
no backing interface could be found or the backing interface is
not running.
- Don't abuse IFF_UP, which is an administrative flag; use IFF_RUNNING
instead.
- Factor out carp_stop().
- Handle ifaddr_event; most of the carp(4) inet address configuration
happens in this event handler. In carp_ioctl(), we just mark the
carp(4) interface IFF_UP|IFF_RUNNING and set IA_PRF_RTEXISTOK on
the inet address.
- Fix the ifdetach_event handler:
o Don't sit on the branch while we are sawing it off.
o We always need to leave the joined multicast group.
- Free carp_if to the proper kmalloc pool.
- Simplify the carp_if struct; except for the TAILQ_HEAD, the rest of
the fields are not used; nuke them.
- Use 'void *' as ifnet.if_carp's type. This could ease upcoming
carp(4) MPSAFE work.
- M_NOWAIT -> MB_DONTWAIT
- Throw in assertions
- Cleanup:
o Nuke SC2IFP
o Nuke carp_softc.sc_ifp compat shim
o Constify function parameters
o ...
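The address ordering bug described in the carp(4) changes above can be modelled in a few lines. This is only a sketch, assuming the digest covers nothing but the packed IPv4 addresses; the real carp(4) code is kernel C and is not reproduced here, and address_digest is a hypothetical name.

```python
import hashlib
import ipaddress

def address_digest(addrs, sort=True):
    """Return a SHA-1 hex digest over the packed IPv4 addresses.

    Hypothetical helper: with sort=True the addresses are hashed in
    sorted order, so the configuration order no longer matters.
    """
    packed = [ipaddress.IPv4Address(a).packed for a in addrs]
    if sort:
        packed.sort()
    h = hashlib.sha1()
    for p in packed:
        h.update(p)
    return h.hexdigest()

# Same two addresses, configured in opposite order on the two hosts.
host1 = ["192.168.5.1", "192.168.5.2"]
host2 = ["192.168.5.2", "192.168.5.1"]
```

Hashing in configuration order (sort=False) gives the two hosts different digests, while hashing the sorted list gives the same digest on both.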
Matthias Schmidt [Fri, 26 Dec 2008 12:16:32 +0000 (13:16 +0100)]
Fix bug introduced in last grep commit
Sepherosa Ziehau [Thu, 25 Dec 2008 15:13:34 +0000 (23:13 +0800)]
Use 16QW as RX FIFO threshold to improve PCIe compatibility
Sepherosa Ziehau [Thu, 25 Dec 2008 14:45:33 +0000 (22:45 +0800)]
Pull ether_input_chain_init() and ether_input_dispatch() out of
the RX ring loop
Sepherosa Ziehau [Thu, 25 Dec 2008 14:09:26 +0000 (22:09 +0800)]
Bring in RSS (receive side scaling) support, i.e. multiple RX queues
Sepherosa Ziehau [Thu, 25 Dec 2008 10:56:51 +0000 (18:56 +0800)]
First step toward multiple RX ring support
Matthew Dillon [Fri, 26 Dec 2008 05:22:55 +0000 (21:22 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Fri, 26 Dec 2008 04:22:34 +0000 (20:22 -0800)]
Change the default for vm.burst_fault to 1, no problems were revealed from
testing.
Sascha Wildner [Thu, 25 Dec 2008 01:04:19 +0000 (02:04 +0100)]
Bring in two updates from FreeBSD.
r1.14 - newbus will zero softc, so no need to duplicate the zeroing here.
r1.18 - More properly cleanup the iicbus child when deleting it.[1]
[1] Tested-by: Vincent Stemen <vince.dragonfly@hightek.org>
Sascha Wildner [Wed, 24 Dec 2008 11:17:21 +0000 (12:17 +0100)]
Remove some more dead initialization.
Found-by: LLVM/Clang Static Analyzer
Sascha Wildner [Wed, 24 Dec 2008 10:25:14 +0000 (11:25 +0100)]
Match acpi_cpu_add_child()'s prototype with that of the corresponding
bus method.
Submitted-by: Dmitry Komissaroff <aunoor@gmail.com>
Sascha Wildner [Tue, 23 Dec 2008 22:25:36 +0000 (23:25 +0100)]
Update the whatis database for /usr/pkg/man and /usr/local/man, too.
Sascha Wildner [Tue, 23 Dec 2008 12:30:38 +0000 (13:30 +0100)]
Oops, remove '*'.
Sascha Wildner [Tue, 23 Dec 2008 12:13:13 +0000 (13:13 +0100)]
Adjust some more stuff for the CVS->git switch.
Sascha Wildner [Tue, 23 Dec 2008 09:28:18 +0000 (10:28 +0100)]
Meh, fix description.
Sascha Wildner [Tue, 23 Dec 2008 09:20:57 +0000 (10:20 +0100)]
Add support for the FC929X in mpt(4).
Submitted-by: Ben Matthews <matthb2@scorec.rpi.edu>
Dragonfly-bug: <https://bugs.dragonflybsd.org/issue1186>
Sascha Wildner [Tue, 23 Dec 2008 01:30:42 +0000 (02:30 +0100)]
No need to care about CVS/ directories anymore.
Sepherosa Ziehau [Mon, 22 Dec 2008 13:27:47 +0000 (21:27 +0800)]
- Use ether_input_chain on RX path
- Process polling count on RX path
Sepherosa Ziehau [Sun, 21 Dec 2008 11:37:07 +0000 (19:37 +0800)]
Remove unused macros
Sepherosa Ziehau [Sun, 21 Dec 2008 11:19:53 +0000 (19:19 +0800)]
Fold jme_ring_data and jme_chain_data, so RX ring/descs stuffs and
TX ring/descs stuffs could be grouped together. This eases multi
RX queue support.
Sascha Wildner [Mon, 22 Dec 2008 11:41:46 +0000 (12:41 +0100)]
WARNS is set in usr.bin/Makefile.inc.
Sascha Wildner [Mon, 22 Dec 2008 10:57:57 +0000 (11:57 +0100)]
Fix WARNS6 regression.
Sascha Wildner [Mon, 22 Dec 2008 01:22:34 +0000 (02:22 +0100)]
Raise WARNS to 6 and fix resulting warnings.
Sascha Wildner [Sun, 21 Dec 2008 09:41:41 +0000 (10:41 +0100)]
Silence warnings.
Sascha Wildner [Sun, 21 Dec 2008 09:41:23 +0000 (10:41 +0100)]
Silence UP warning.
Sascha Wildner [Sun, 21 Dec 2008 09:15:38 +0000 (10:15 +0100)]
Merge branch 'master' of ssh://swildner@crater.dragonflybsd.org/repository/git/dragonfly
Sascha Wildner [Sun, 21 Dec 2008 09:14:50 +0000 (10:14 +0100)]
Add missing break.
Hasso Tepper [Sun, 21 Dec 2008 08:49:28 +0000 (10:49 +0200)]
Make i486 default architecture.
We don't support i386 anyway and most of the world seems to assume i486 as
de facto default nowadays.
Sepherosa Ziehau [Sun, 21 Dec 2008 05:46:24 +0000 (13:46 +0800)]
- Add polling(4) support for jme(4)
- Fix an off-by-one assertion bug
- Update jme(4) manpage about the interrupt coalescing sysctls
Sascha Wildner [Sun, 21 Dec 2008 00:45:41 +0000 (01:45 +0100)]
Remove extra whitespace.
Sascha Wildner [Sun, 21 Dec 2008 00:10:52 +0000 (01:10 +0100)]
Add parameter names.
Sascha Wildner [Sat, 20 Dec 2008 23:52:20 +0000 (00:52 +0100)]
Clean up #include situation in the SYNOPSIS.
Sascha Wildner [Sat, 20 Dec 2008 23:01:45 +0000 (00:01 +0100)]
Add missing '#include <sys/types.h>' to the SYNOPSIS.
Sascha Wildner [Sat, 20 Dec 2008 22:28:09 +0000 (23:28 +0100)]
Fix indentation.
Sascha Wildner [Sat, 20 Dec 2008 21:58:54 +0000 (22:58 +0100)]
Use BBLOCK.
Sascha Wildner [Sat, 20 Dec 2008 21:52:06 +0000 (22:52 +0100)]
Fix a dereference of an undefined value.
ntmp was being accessed (via ntfs_bntodoff()) before it was allocated.
The whole thing only worked because BBLOCK is 0 and the dereference was
optimized away (though not with -O0).
Found-by: LLVM/Clang Static Analyzer
Sascha Wildner [Sat, 20 Dec 2008 20:11:31 +0000 (21:11 +0100)]
Eliminate some dead initialization.
Found-by: LLVM/Clang Static Analyzer
Sascha Wildner [Sat, 20 Dec 2008 19:44:36 +0000 (20:44 +0100)]
Silence 'unused variable' warning.
Sascha Wildner [Sat, 20 Dec 2008 15:15:23 +0000 (16:15 +0100)]
Clean up #include's in the SYNOPSIS.
Sepherosa Ziehau [Sat, 20 Dec 2008 03:13:45 +0000 (11:13 +0800)]
Explicitly reallocate the inpcb cached route freed due to different
rtentry CPU ownership, e.g. in tcp_connect(), so we could make sure
that a RTF_PRCLONING rtentry will be cloned.
Sepherosa Ziehau [Sat, 20 Dec 2008 02:21:46 +0000 (10:21 +0800)]
Use priority message for arp holding mbufs
Hasso Tepper [Sat, 20 Dec 2008 01:56:54 +0000 (03:56 +0200)]
Adapt to OpenPAM Hydrangea.
Hasso Tepper [Sat, 20 Dec 2008 01:53:58 +0000 (03:53 +0200)]
Merge commit 'crater/vendor/OPENPAM'
YONETANI Tomokazu [Fri, 19 Dec 2008 14:21:12 +0000 (23:21 +0900)]
Adjust VFS_SET() to deal with the change to struct vfsconf
Matthew Dillon [Fri, 19 Dec 2008 04:20:15 +0000 (20:20 -0800)]
Close a possible bug where the p_lock for a new process inherits a
non-zero value from its parent on fork(), preventing the process
from being able to exit later on.
Matthew Dillon [Thu, 18 Dec 2008 21:45:27 +0000 (13:45 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Thu, 18 Dec 2008 21:37:37 +0000 (13:37 -0800)]
This is a MAJOR rewrite of usched_bsd4 and related support logic, plus
additional improvements to the LWKT scheduler.
* The LWKT scheduler used to run a user thread not needing the MP lock
if it was unable to run a kernel thread that did need it, due to some
other cpu holding the lock. This created a massive priority inversion.
LWKT no longer does this. It will happily run other MPSAFE kernel
threads but as long as kernel threads exist which need the MP lock
LWKT will no longer switch to a user mode thread.
Add a new sysctl lwkt.chain_mplock which defaults to 0 (off). If set
to 1 LWKT will attempt to use IPIs to notify conflicting cpus when the
MP lock is available and will also allow user mode threads to run if
kernel threads are present needing the MP lock (but unable to get it).
NOTE: Currently turning on this feature results in reduced performance,
though not as bad as pre-patch.
* The main control logic USCHED_BSD4 was almost completely rewritten,
greatly improving interactivity in the face of cpu bound programs
such as compiles.
USCHED_BSD4 no longer needs to use the scheduler helper when the
system is under load. The scheduler helper is only used to allow
one cpu to kick another idle cpu when additional processes are
present.
USCHED_BSD4 now takes great advantage of the scheduler's cpu-local
design and uses a bidding algorithm for processes trying to return
to user mode to determine which one is the best. Winners simply
deschedule losers, and since the loser is clearly not running when
the winner does this the descheduling operation is ultra simple to
accomplish.
Matthew Dillon [Thu, 18 Dec 2008 21:27:20 +0000 (13:27 -0800)]
This is a major revamping of the pageout and low-memory handling code.
The pageout daemon now detects out-of-memory conditions and properly
kills the largest process(es). This condition occurs when swap is
full (or you have no swap) and most of the remaining VM pages in memory
have become dirty. With no swap to page to, the dirty pages squeeze out
the clean ones. The pageout daemon detects the case and starts killing
processes.
The pageout daemon now detects stress in the form of excess cpu use
and tries to reduce its cpu footprint when that occurs. Excess cpu use
can occur when the only pages left in-core are dirty and there is nowhere
to swap them to. Previously if this case occurred the system would basically
just stop working.
These changes make the system truly have VM = RAM+SWAP. If you have 1G of ram
and 1G of swap the system can run up to 2G worth of processes.
Matthew Dillon [Thu, 18 Dec 2008 21:18:29 +0000 (13:18 -0800)]
vnode_pager_haspage() could return TRUE but leave *before and *after
uninitialized, causing vm_fault's burst pagein feature to panic the system.
vm_fault was almost never using its burst pagein feature due to incorrect
test logic. The burst pagein code itself was also seriously buggy, so it
is fortunate the test logic was broken.
Rewrite the broken test logic and fix the bugs in the burst pagein code.
Add a new sysctl vm.burst_fault, defaulting to 0 (disabled). The default
will be changed to 1 in a week or two.
Sascha Wildner [Thu, 18 Dec 2008 11:20:34 +0000 (12:20 +0100)]
Really fix indent.
Matthias Schmidt [Thu, 18 Dec 2008 11:09:09 +0000 (12:09 +0100)]
Apply GNU style indent
Matthias Schmidt [Thu, 18 Dec 2008 10:56:46 +0000 (11:56 +0100)]
Fix annoying bug with grep and HAMMER
grep foo * on a UFS partition was silent if grep hit a subdirectory. If
executed on HAMMER, grep complains about "Invalid argument" because directories
in HAMMER are not treated as files.
Before:
cd /usr/src
grep test *
Makefile: test \
grep: cat: Invalid argument
[...]
After:
grep test *
Makefile: test \
Hasso Tepper [Thu, 18 Dec 2008 08:47:32 +0000 (10:47 +0200)]
Import OpenPAM Hydrangea.
Hasso Tepper [Thu, 18 Dec 2008 03:03:47 +0000 (05:03 +0200)]
Call ata_legacy() only once on attach and save its result.
Scanning PCI configuration registers (which are not going to change) on
every interrupt looks expensive, especially when interrupt is shared.
Profiling (in FreeBSD) shows 3% of time spent by atapci0 on pure network
load due to IRQ sharing with em0.
Obtained-from: FreeBSD
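The idea of reading the configuration once at attach time and reusing the saved result in the interrupt path can be sketched as follows. This is a toy model, not the ata(4) driver; PciDevice, Controller and config_reads are hypothetical names used only for illustration.

```python
class PciDevice:
    """Toy device that counts how often its configuration is scanned."""

    def __init__(self, legacy):
        self._legacy = legacy
        self.config_reads = 0

    def read_legacy(self):
        # Stands in for the expensive scan of PCI configuration registers.
        self.config_reads += 1
        return self._legacy


class Controller:
    def __init__(self, dev):
        self.dev = dev
        # Attach time: scan once and keep the answer, since it cannot change.
        self.legacy = dev.read_legacy()

    def interrupt(self):
        # Interrupt path: use the saved value, no register scan.
        return self.legacy


dev = PciDevice(legacy=True)
ctrl = Controller(dev)
for _ in range(1000):
    ctrl.interrupt()
```

After a thousand simulated interrupts the configuration has still been scanned only once.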
Joe Talbott [Thu, 18 Dec 2008 02:18:48 +0000 (21:18 -0500)]
Merge branch 'master' of /usr/git/dragonfly
Joe Talbott [Thu, 18 Dec 2008 02:13:40 +0000 (21:13 -0500)]
Page align boundaries for kvm_access_check().
Matthew Dillon [Thu, 18 Dec 2008 01:07:33 +0000 (17:07 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Thu, 18 Dec 2008 01:02:38 +0000 (17:02 -0800)]
Fix bugs in dealing with low-memory situations when the system has run out
of swap or has no swap.
* Fix an error where the system started killing processes before it needed
to.
* Continue propagating pages from the active queue to the inactive queue
when the system has run out of swap or has no swap, even though the
inactive queue has become bloated. This occurs because the inactive
queue may be unable to drain due to an excess of dirty pages which
cannot be swapped out.
* Use the active queue to detect excessive stress which combined with
an out-of-swap or no-swap situation means the system has run out of
memory. THEN start killing processes.
* This also allows the system to recycle nearly all the clean pages
available when it has no swap space left, to try to keep things going,
leaving only dirty pages in the VM page queues.
Michael Neumann [Wed, 17 Dec 2008 18:40:12 +0000 (18:40 +0000)]
Unbreak buildworld
Michael Neumann [Wed, 17 Dec 2008 17:21:05 +0000 (17:21 +0000)]
Merge branch 'vfsconf'
Michael Neumann [Wed, 17 Dec 2008 17:10:58 +0000 (17:10 +0000)]
Clean up a bit
Michael Neumann [Wed, 17 Dec 2008 15:50:04 +0000 (15:50 +0000)]
Merge branch 'vfsconf'
Michael Neumann [Wed, 17 Dec 2008 15:44:42 +0000 (15:44 +0000)]
Refactor filesystem types list and fix bug.
Refactor the management of the filesystem types list (vfsconf) by
introducing some management functions. Reduce inter-module coupling.
This actually fixes a potential "bug" in vfs_register() which does not
compare the new VFS to register with the last entry from the list, i.e.
two (or more) sequential vfs_register() calls with the same argument
would succeed.
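The duplicate check problem described above can be modelled with a singly linked list. This is a sketch under the assumption that the old loop walked the list only while a next entry existed, which is what the commit message implies; Node, register_buggy and register_fixed are hypothetical names, not the kernel's.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None


def register_buggy(head, name):
    """Append name unless a duplicate is seen; never compares the last entry."""
    if head is None:
        return Node(name), True
    cur = head
    while cur.next is not None:      # stops before examining the tail
        if cur.name == name:
            return head, False
        cur = cur.next
    cur.next = Node(name)
    return head, True


def register_fixed(head, name):
    """Append name unless any entry, including the last, already has it."""
    if head is None:
        return Node(name), True
    cur = head
    while True:
        if cur.name == name:
            return head, False
        if cur.next is None:
            break
        cur = cur.next
    cur.next = Node(name)
    return head, True


head, first = register_buggy(None, "hammer")
head, second = register_buggy(head, "hammer")     # duplicate slips through

head2, first2 = register_fixed(None, "hammer")
head2, second2 = register_fixed(head2, "hammer")  # duplicate is rejected
```

In the buggy variant the second, identical registration succeeds because the only existing entry is also the last one and is never compared.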
Sepherosa Ziehau [Wed, 17 Dec 2008 13:23:16 +0000 (21:23 +0800)]
Symbol TX desc is only used by 64bits TX chain format.
Sepherosa Ziehau [Wed, 17 Dec 2008 12:30:46 +0000 (20:30 +0800)]
Always free the passed in mbuf if jme_encap() failed.
Non-EFBIG error probably will not recover.
Sepherosa Ziehau [Wed, 17 Dec 2008 12:12:34 +0000 (20:12 +0800)]
Don't use magic number
Sepherosa Ziehau [Wed, 17 Dec 2008 12:10:05 +0000 (20:10 +0800)]
Rename ring_cnt to desc_cnt.
ring_cnt may be used when experimenting with RSS (multi RX ring) support.
Sepherosa Ziehau [Wed, 17 Dec 2008 11:52:28 +0000 (19:52 +0800)]
Tunable number of RX/TX descs
Sepherosa Ziehau [Wed, 17 Dec 2008 10:38:10 +0000 (18:38 +0800)]
Remove unused macros
Michael Neumann [Tue, 16 Dec 2008 21:48:47 +0000 (21:48 +0000)]
Use jailed() inline-function
Michael Neumann [Tue, 16 Dec 2008 21:46:10 +0000 (21:46 +0000)]
Use more specific privilege PRIV_VFS_GENERATION
Matthew Dillon [Tue, 16 Dec 2008 17:54:38 +0000 (09:54 -0800)]
LIST_FOREACH_MUTABLE() was tracking processes not held with PHOLD().
Use a normal LIST_FOREACH() instead because the main iterator is being
protected by PHOLD().
Matthew Dillon [Tue, 16 Dec 2008 17:53:32 +0000 (09:53 -0800)]
Assert that nobody holds the process referenced with PHOLD() in
exit.
Matthew Dillon [Tue, 16 Dec 2008 17:49:35 +0000 (09:49 -0800)]
Add missing range checks to sopt_valsize for the linux emulated
setsockopt().
Matthew Dillon [Tue, 16 Dec 2008 17:46:57 +0000 (09:46 -0800)]
Due to races clean blocks can remain cached, remove a conditional from
the ffs_truncate3 panic.
Reported-by: sorry, I forgot.
Matthew Dillon [Tue, 16 Dec 2008 17:46:10 +0000 (09:46 -0800)]
Cleanup and enhance the vnodeinfo output.
Matthew Dillon [Tue, 16 Dec 2008 17:44:57 +0000 (09:44 -0800)]
Add a socketpair performance tester.
Matthew Dillon [Tue, 16 Dec 2008 17:40:20 +0000 (09:40 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into devel
Matthew Dillon [Tue, 16 Dec 2008 17:31:42 +0000 (09:31 -0800)]
Make __fpending() take a const argument.
Sepherosa Ziehau [Tue, 16 Dec 2008 14:26:58 +0000 (22:26 +0800)]
Replace libpcap's pcap-bpf.h with system's net/bpf.h
Sepherosa Ziehau [Tue, 16 Dec 2008 13:40:42 +0000 (21:40 +0800)]
Remove tcpcb.tt_msg == NULL tests in tcp_callout_*().
tcpcb.tt_msg == NULL could only happen for TCP listen sockets, while
for this kind of sockets, tcp timers should never be used.
Suggested-by: dillon@
Sepherosa Ziehau [Mon, 15 Dec 2008 14:29:35 +0000 (22:29 +0800)]
Restore the semantic of callout_active() testing on tcp timers.
Originally there was no time gap between the running of the tcp timer
handler and the deactivation of the tcp timer callout, but the message
based tcp timer has a time gap in between these two actions. This
time gap affects the code path which depends on the current state of
the tcp timer, i.e. return value of callout_active(tcp_timer). To
close this time gap, we take the pending and running tcp timer tasks
into consideration when testing the current state of the tcp timer.
Reviewed-by: dillon@
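The state test described above can be sketched as a small model. It assumes the timer counts as active if its callout is active or if a timer task is still pending or running; TimerState and timer_active are hypothetical names, and the kernel's actual bookkeeping is not shown here.

```python
from dataclasses import dataclass


@dataclass
class TimerState:
    callout_active: bool = False   # callout armed and not yet deactivated
    task_pending: bool = False     # timer message queued, handler not started
    task_running: bool = False     # handler currently executing


def timer_active(t):
    """Treat queued or running timer work as part of the active state."""
    return t.callout_active or t.task_pending or t.task_running


# The callout has fired and been deactivated, but its message is still queued.
in_gap = TimerState(callout_active=False, task_pending=True)
idle = TimerState()
```

During the gap a test on the callout alone would report the timer as inactive, while the combined test still reports it as active.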
Michael Neumann [Mon, 15 Dec 2008 23:11:24 +0000 (23:11 +0000)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Michael Neumann [Mon, 15 Dec 2008 23:09:33 +0000 (23:09 +0000)]
Fix typo
Peter Avalos [Mon, 15 Dec 2008 18:39:51 +0000 (10:39 -0800)]
Rename a local function that conflicts with one in librpcsvc.
Michael Neumann [Mon, 15 Dec 2008 17:45:26 +0000 (17:45 +0000)]
Remove superfluous shadow variable declaration
Michael Neumann [Mon, 15 Dec 2008 17:42:35 +0000 (17:42 +0000)]
Cosmetic changes (move assignment out of "if")
Michael Neumann [Mon, 15 Dec 2008 16:17:49 +0000 (16:17 +0000)]
Fix missing includes
Michael Neumann [Mon, 15 Dec 2008 16:05:25 +0000 (16:05 +0000)]
Fix missing include
Michael Neumann [Mon, 15 Dec 2008 16:00:10 +0000 (16:00 +0000)]
Deprecate suser_*. Instead use priv_check_*.
The suser_* functions now internally call the corresponding
priv_check_* functions.
Michael Neumann [Mon, 15 Dec 2008 15:41:00 +0000 (15:41 +0000)]
Fix typo (currenet -> current)
Michael Neumann [Mon, 15 Dec 2008 14:55:51 +0000 (14:55 +0000)]
suser_* to priv_* conversion
Michael Neumann [Mon, 15 Dec 2008 01:09:25 +0000 (01:09 +0000)]
Implement priv_check() and priv_check_cred()
Michael Neumann [Mon, 15 Dec 2008 00:57:45 +0000 (00:57 +0000)]
Import sys/priv.h (rev 1.25) from FreeBSD
Michael Neumann [Mon, 15 Dec 2008 00:46:58 +0000 (00:46 +0000)]
Remove unnecessary optimization.
The optimization (>> 9) will be automatically performed by the compiler.
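As a quick sanity check of the equivalence, the sketch below assumes the hand-written shift stood in for a division by 512 (2**9) on a non-negative value; the commit message does not state the divisor, and both function names are hypothetical.

```python
def blocks_by_division(nbytes):
    # Readable form: divide by the assumed 512-byte unit.
    return nbytes // 512


def blocks_by_shift(nbytes):
    # Hand-optimized form that the commit removes.
    return nbytes >> 9


# Compare the two forms over a range of non-negative values.
same = all(blocks_by_division(n) == blocks_by_shift(n) for n in range(1 << 16))
```

Since both forms agree for non-negative values, the readable division loses nothing once the compiler performs the strength reduction itself.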
Sascha Wildner [Sun, 14 Dec 2008 17:56:01 +0000 (18:56 +0100)]
Remove some unnecessary casts.
Sascha Wildner [Sun, 14 Dec 2008 17:53:02 +0000 (18:53 +0100)]
Adjust the bt_gethostbyaddr(3) prototype to match gethostbyaddr(3).
Sascha Wildner [Sun, 14 Dec 2008 17:50:21 +0000 (18:50 +0100)]
Bring the prototype of gethostbyaddr(3) in line with the standard.
http://www.opengroup.org/onlinepubs/009695399/functions/gethostbyname.html
Sepherosa Ziehau [Sun, 14 Dec 2008 14:27:33 +0000 (22:27 +0800)]
Use priority message for TCP timers
Sepherosa Ziehau [Sun, 14 Dec 2008 10:47:31 +0000 (18:47 +0800)]
White space cleanup