Matthew Dillon [Thu, 10 Feb 2011 21:22:51 +0000 (13:22 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Matthew Dillon [Thu, 10 Feb 2011 21:15:51 +0000 (13:15 -0800)]
kernel - Greatly reduce usched_bsd4_decay default
* Reduce the usched_bsd4_decay default to 1. It may be removed entirely
in the future.
* This improves the dynamic priority handling by reducing ad-hoc estcpu
decreases from the 1-second interval clock. The tsleep code handles
this a lot better already and the ad-hoc decreases don't do a good job
handling the case where there are a very large number of runnable
cpu-bound processes (because they don't actually get a lot of cpu but
still eat a large proportion of the scheduled time in aggregate).
Tested with blogbench during stage 1. Prior to this fix the 100+ blogbench
threads were being dropped down to almost realtime priorities even though
they remained in a 100% 'R'un state.
* Also reduce the amount the parent process of a fork() is docked for cpu
due to the fork. The value was high enough that interactive sessions were
being pushed up to batch priorities with only a moderate number of forks
and not decaying quickly enough to stabilize.
The child process is docked the same as before (handling the fork chaining
case).
Tested with blogbench and parallel makes of /usr/src/lib/libc. The
blogbench uniformly increases to batch priority and didn't need the
higher boost the old values gave it while the parallel compile's fork
chaining gave it a good shove towards batch priority while the repeated
forks slowly pushed the higher level make and /bin/sh's to more batch-like
priorities.
Sascha Wildner [Thu, 10 Feb 2011 11:37:10 +0000 (12:37 +0100)]
hammer.8: Note that 'recover' needs -f.
Sascha Wildner [Wed, 9 Feb 2011 16:25:06 +0000 (17:25 +0100)]
acpi(4): Fix a bug in acpi_cpu_cstate.c (we have to write, and not to read).
Introduced with
10f976749fd9ad2e8642ea80ce533f7416910a65. The commit message
said "Sync ACPI with FreeBSD 7.2", even though FreeBSD 7.2 doesn't seem to
have this code at all, so I'm not sure about what the idea behind that
change was. I'm guessing it is a typo, since newer FreeBSDs call
AcpiWriteBitRegister() here too.
Reported-by: Andrea Magliano <masterblaster@tiscali.it>
Sepherosa Ziehau [Wed, 9 Feb 2011 14:42:33 +0000 (22:42 +0800)]
acpi/madt: Add definition for interrupt source override MADT entry
Peter Avalos [Wed, 9 Feb 2011 05:13:12 +0000 (19:13 -1000)]
Upgrade to OpenSSL-1.0.0d.
This fixes CVE-2011-0014.
Peter Avalos [Wed, 9 Feb 2011 05:02:56 +0000 (19:02 -1000)]
Merge branch 'vendor/OPENSSL'
Peter Avalos [Wed, 9 Feb 2011 04:59:57 +0000 (18:59 -1000)]
Import OpenSSL-1.0.0d.
Sascha Wildner [Tue, 8 Feb 2011 14:04:33 +0000 (15:04 +0100)]
Bump .Dd in pkg_radd.1 manpage and make pkg_search work again.
Sascha Wildner [Tue, 8 Feb 2011 13:59:53 +0000 (14:59 +0100)]
Don't remove /etc/settings.conf via 'make upgrade' and rename it instead.
Sascha Wildner [Mon, 7 Feb 2011 19:06:33 +0000 (20:06 +0100)]
Sync zoneinfo database with tzdata2011b from elsie.nci.nih.gov
northamerica: 8.39 -> 8.40
zone.tab: 8.38 -> 8.40
* northamerica: Add America/North_Dakota/Beulah (Mercer County,
North Dakota, moved from Mountain to Central time at the
end of DST in 2010). Also, use the actual version number
rather than "%W%".
* zone.tab: Add America/North_Dakota/Beulah. Also, update
Indonesian location names (with the old names retained in
parentheses).
Sascha Wildner [Mon, 7 Feb 2011 15:42:58 +0000 (16:42 +0100)]
Remove useless belt and suspenders include guards in some of our headers.
For these headers:
/usr/include/machine/atomic.h
/usr/include/machine/bus_dma.h
/usr/include/machine/coredump.h
/usr/include/machine/cpufunc.h
/usr/include/machine/db_machdep.h
/usr/include/machine/elf.h
/usr/include/machine/endian.h
/usr/include/machine/frame.h
/usr/include/machine/limits.h
/usr/include/machine/npx.h
/usr/include/machine/profile.h
/usr/include/machine/segments.h
/usr/include/machine/stdarg.h
/usr/include/machine/stdint.h
/usr/include/machine/trap.h
/usr/include/machine/tss.h
/usr/include/machine/ucontext.h
/usr/include/machine/vframe.h
/usr/include/machine/vm86.h
All these headers #define _CPU_... and not _MACHINE_... even though they
are in /usr/include/machine. And the headers themselves have include
guards already. So there's little point in having them around the actual
#include additionally.
Peter Avalos [Sun, 6 Feb 2011 00:22:40 +0000 (14:22 -1000)]
cat: Clean up whitespace.
Justin C. Sherrill [Mon, 7 Feb 2011 01:19:57 +0000 (17:19 -0800)]
Updating pkg_radd man page, pkg_search to reflect new config file.
Justin C. Sherrill [Mon, 7 Feb 2011 00:44:26 +0000 (16:44 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Conflicts:
UPDATING
Justin C. Sherrill [Mon, 7 Feb 2011 00:36:29 +0000 (16:36 -0800)]
Move pkg_radd config to a more obvious name; make sure settings.conf gets
cleaned out on upgrade, and stick a warning in UPDATING so nobody
(hopefully) gets surprised when pkg_radd starts downloading from
mirror-master again.
Sascha Wildner [Mon, 7 Feb 2011 00:07:18 +0000 (01:07 +0100)]
wmake.1: Describe how to install stuff built with wmake.
Pointed-out-by: corecode
Sascha Wildner [Sun, 6 Feb 2011 21:55:11 +0000 (22:55 +0100)]
secure/lib: Fix building of some cases that include lib/Makefile.inc.
Sascha Wildner [Sun, 6 Feb 2011 21:09:01 +0000 (22:09 +0100)]
lib: Move the definition of WARNS into lib/Makefile.inc.
Sascha Wildner [Sun, 6 Feb 2011 21:06:59 +0000 (22:06 +0100)]
libncp: Fix format.
Sascha Wildner [Sun, 6 Feb 2011 21:05:59 +0000 (22:05 +0100)]
libc/csu: Include <machine/tls.h> for some prototypes.
Sepherosa Ziehau [Sun, 6 Feb 2011 14:06:21 +0000 (22:06 +0800)]
ioapic/icu: Add irqmap
Sepherosa Ziehau [Sun, 6 Feb 2011 11:19:24 +0000 (19:19 +0800)]
intr: Enable ELCR by default
Sascha Wildner [Sat, 5 Feb 2011 20:44:26 +0000 (21:44 +0100)]
libc: Raise WARNS to 6.
YONETANI Tomokazu [Wed, 2 Feb 2011 09:45:16 +0000 (18:45 +0900)]
kernel - adjust devfs mount point according to init_chroot loader variable
/dev is mounted before /sbin/init is executed, so when init chroot()s itself
and executes the shell, the chroot()'ed space doesn't contain the mounted
devfs and the shell gets stuck.
Peter Avalos [Sat, 5 Feb 2011 06:52:56 +0000 (20:52 -1000)]
<cpu/ieeefp.h>: Use single-underscore instead of double.
This prevents conflicts for when someone decides to #include <fenv.h>
and <cpu/ieeefp.h>.
Peter Avalos [Sat, 5 Feb 2011 04:24:43 +0000 (18:24 -1000)]
<cpu/ieeefp.h>: Sync i386 with x86_64.
The i386 fpget*() and fpset*() functions were obfuscated, and this
change reduces complexity and makes all the fp*() functions inline.
Update comments.
Obtained-from: FreeBSD
Samuel J. Greear [Sat, 5 Feb 2011 08:25:39 +0000 (08:25 +0000)]
kern - aio - Add missing flags to objcache_get()
Samuel J. Greear [Sat, 5 Feb 2011 04:17:41 +0000 (04:17 +0000)]
kern - Convert NFS from zalloc to objcache
Sponsored-By: Google Code-In
Samuel J. Greear [Sat, 5 Feb 2011 04:17:15 +0000 (04:17 +0000)]
kern - Convert crypto from zalloc to objcache
Sponsored-By: Google Code-In
Samuel J. Greear [Sat, 5 Feb 2011 04:16:45 +0000 (04:16 +0000)]
kern - Convert aio from zalloc to objcache
Sponsored-By: Google Code-In
Samuel J. Greear [Sat, 5 Feb 2011 04:15:11 +0000 (04:15 +0000)]
kern - Convert ufs dirhash from zalloc to objcache
Sponsored-By: Google Code-In
Matthew Dillon [Fri, 4 Feb 2011 19:55:02 +0000 (11:55 -0800)]
HAMMER VFS - Fix deadlock which can occur under severe filesystem pressure
* Inode reflushes (a fsync occurring while the inode is still queued for
a prior fsync) were not ensuring that the inode got pushed to the backend
flusher.
This could lead to deadlocks when the process trying to issue the flush
is the syncer itself.
* The problem typically occurred under filesystem loads where a large number
of inodes (e.g. due to a bulk build) are being flushed at once, and the
flush is unable to finish running before the next syncer cycle comes
around.
Matthew Dillon [Fri, 4 Feb 2011 19:51:26 +0000 (11:51 -0800)]
kernel - Add missing vm_page_wakeup()
* Fix a long-standing issue where a VM page is improperly left PG_BUSY
when vm_page_try_to_cache() races the Modified bit in underlying PTEs.
* This could only occur during periods of severe memory pressure and
would typically lead to a program getting stuck in "pgtblk".
Reported-by: Peter Avalos <peter@theshell.com>
Matthew Dillon [Fri, 4 Feb 2011 18:38:39 +0000 (10:38 -0800)]
ps - Adjust field widths
* Reduce VSZ, RSS, RSZ by one (it's still wider than the original but now
only 6 digits). We will have to come up with another way to represent
process sizes >= 1GB.
* Reduce the tty column to 2 chars
Matthew Dillon [Fri, 4 Feb 2011 18:37:30 +0000 (10:37 -0800)]
ps - Support unix98 ttys for -t
* Support numeric ttys, e.g. ps xt5 specifies tty /dev/pts/5
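A minimal sketch of the numeric mapping described above, assuming a purely numeric argument is simply appended to /dev/pts/. The helper name numeric_tty_path is hypothetical and this is not the actual ps(1) code:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: map a purely numeric -t argument such as "5"
 * to a unix98 pty path such as "/dev/pts/5".
 * Returns 0 on success, -1 if arg is not numeric or buf is too small.
 */
static int
numeric_tty_path(const char *arg, char *buf, size_t len)
{
	const char *p;
	int n;

	assert(arg != NULL && buf != NULL);
	if (*arg == '\0')
		return -1;
	for (p = arg; *p != '\0'; p++) {
		if (!isdigit((unsigned char)*p))
			return -1;
	}
	n = snprintf(buf, len, "/dev/pts/%s", arg);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncation */
	return 0;
}
```

With this sketch, the argument "5" yields /dev/pts/5, while a non-numeric argument such as "p0" is rejected so that the caller can fall back to its existing tty name handling.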
Antonio Huete Jimenez [Fri, 4 Feb 2011 17:37:28 +0000 (18:37 +0100)]
vkernel64 - Raise the memory requirements to 64MB.
Antonio Huete Jimenez [Fri, 4 Feb 2011 17:16:09 +0000 (18:16 +0100)]
vkernel - Avoid appending the error message in some cases.
Antonio Huete Jimenez [Fri, 4 Feb 2011 16:08:07 +0000 (17:08 +0100)]
kern - Clarify the description of hw.physmem.
Antonio Huete [Fri, 4 Feb 2011 11:29:05 +0000 (12:29 +0100)]
vm - Correct sysctl output formatting for kvm_size and kvm_free
Antonio Huete [Fri, 4 Feb 2011 13:34:05 +0000 (14:34 +0100)]
kern - Properly return the number of bytes in hw.physmem sysctl.
- Changed physmem from int to long, now it can hold a
really high number of pages.
- Return an unsigned long value with appropriate formatting
for hw.physmem.
- Add description for it.
Based-upon: FreeBSD
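To illustrate why the wider type matters (a sketch only; pages_to_bytes is a hypothetical helper and a 4K page size is assumed), the byte count for a large machine no longer fits in 32 bits:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a page count to a byte count using a
 * 64-bit result, so that large memory sizes do not overflow.
 */
static uint64_t
pages_to_bytes(uint64_t npages, unsigned int page_shift)
{
	assert(page_shift < 64);
	return npages << page_shift;
}
```

For example, 16777216 pages of 4K (page_shift 12) give 68719476736 bytes (64G), which is larger than any 32-bit integer can represent.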
Matthew Dillon [Fri, 4 Feb 2011 07:25:41 +0000 (23:25 -0800)]
kernel - Add options VM_PAGE_DEBUG
* Add options VM_PAGE_DEBUG for kernel configs. This requires a full kernel
rebuild (if you use the option) and supplies additional information in
the vm_page structure to help track down problems.
Peter Avalos [Tue, 1 Feb 2011 07:59:03 +0000 (21:59 -1000)]
ieeefp.h: Remove i386 specifics.
Move the contents of <machine/floatingpoint.h> to <machine/ieeefp.h> for
i386 to match x86_64.
While I'm here, mark which versions of these files we have for x86_64.
Obtained-from: FreeBSD
Peter Avalos [Tue, 1 Feb 2011 07:17:02 +0000 (21:17 -1000)]
style(9): Remove whitespace.
Peter Avalos [Tue, 1 Feb 2011 07:06:52 +0000 (21:06 -1000)]
fenv: Explicitly specify sizes for control and status words.
Obtained-from: FreeBSD
YONETANI Tomokazu [Thu, 3 Feb 2011 05:29:48 +0000 (14:29 +0900)]
libc - fix handling of temporary file used by hash(3)
This stops applications using DB_HASH, such as tsort, from unexpectedly
trying to open a temporary file in the current directory and failing if
they have no write permission there.
Obtained from FreeBSD, r190485, by delphij:
db/btree/bt_open.c: check return value of snprintf() and return value
if the result is truncated.
db/hash/hash_page.c: use the same way to create temporary file as
bt_open.c; check snprintf() return value.
Obtained from: OpenBSD
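The snprintf() check mentioned above usually follows the pattern sketched here. The helper name build_tmp_path and the path layout are assumptions made for illustration, not the code from bt_open.c or hash_page.c:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "<dir>/<name>" into buf.
 * Returns 0 on success, -1 if snprintf() failed or the result
 * would have been truncated.
 */
static int
build_tmp_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

snprintf() returns the length the full string would have had, so a return value greater than or equal to the buffer size means the path was cut short and must not be used.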
Sascha Wildner [Thu, 3 Feb 2011 23:20:47 +0000 (00:20 +0100)]
Build our GCCs with CSTD=gnu89.
This fixes building of the GCCs using gcc44.
Reported-by: Max Herrgard <herrgard@gmail.com> and others
Submitted-by: Max Herrgard <herrgard@gmail.com>
Dragonfly-bug: <http://bugs.dragonflybsd.org/issue1978>
Matthew Dillon [Thu, 3 Feb 2011 18:52:34 +0000 (10:52 -0800)]
kernel - Fix physmap base calculation for x86-64
Reported-by: luxh, tuxillo
Nicolas Thery [Fri, 28 Jan 2011 00:55:12 +0000 (01:55 +0100)]
kernel - migrate knote from zone to kmalloc
Sascha Wildner [Wed, 2 Feb 2011 18:04:16 +0000 (19:04 +0100)]
acpi(4): Always compile the files dealing with ACPI_DEBUG into the module.
Before this commit, one had to define ACPI_DEBUG as a make variable to
enable debugging support in the module, such as in:
$ make -DACPI_DEBUG buildkernel
Specifying ACPI_DEBUG in the kernel config alone did not enable it, but
our modules are supposed to honor kernel options. Also this was contrary
to what the manual page says.
So to make this work for ACPI_DEBUG too, we just put all the affected
source files into SRCS and always compile them. #ifdef's in these
source files will take care of enabling/disabling debugging support
so a module compiled without ACPI_DEBUG defined in the kernel or on the
command line will still not have support after this commit (I've checked
with nm(1)).
The only change for someone not using ACPI_DEBUG is a little bit of
additional buildkernel time.
FWIW, it is the same way in FreeBSD, too.
Reported-by: Andrea Magliano <masterblaster@tiscali.it>
Sepherosa Ziehau [Wed, 2 Feb 2011 15:55:43 +0000 (23:55 +0800)]
intr: Further delay MachIntrABI.finalize()
It only affects SMP case. For ICU, it will be better if finalize()
is called after IMCR detection is done, though on most modern systems
IMCR does not exist. For I/O APIC, finalize() _should_ be called
after BSP's LAPIC is initialized, since it alters BSP LAPIC's LINT0
and LINT1 configuration.
Add stabilize() ABI to MachIntrABI which is only implemented by ICU
currently and this ABI is called in the place where finalize() used
to be called.
Sepherosa Ziehau [Wed, 2 Feb 2011 13:44:25 +0000 (21:44 +0800)]
ioapic: File renaming (apic -> ioapic)
Sepherosa Ziehau [Wed, 2 Feb 2011 13:20:48 +0000 (21:20 +0800)]
icu: Add icu/icu_abi.h
Mainly to avoid manually declaring MachIntrABI_ICU
Sepherosa Ziehau [Wed, 2 Feb 2011 12:46:18 +0000 (20:46 +0800)]
ioapic: Function/variable renaming (apic -> ioapic)
Sepherosa Ziehau [Wed, 2 Feb 2011 09:41:17 +0000 (17:41 +0800)]
apic_initialize: Adjust comment
Matthew Dillon [Wed, 2 Feb 2011 07:59:22 +0000 (23:59 -0800)]
powerd - Do a more sophisticated domain scan, use kern.usched_global_cpumask
* Do a more sophisticated domain scan, cpu domains do not necessarily start
at 0.
* Handle the case where multiple cpus may belong to a single domain.
* Dynamically adjust kern.usched_global_cpumask to the number of cpus we
are running at max frequency, leaving the remaining cpus set at their
lowest frequency and left mostly idle.
* Tested on the 48-core monster and phenom x 6.
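As an illustration of the cpumask idea only (a sketch; cpumask_first_n is a hypothetical helper, a 64-bit mask is assumed, and the real powerd must also cope with domains that do not start at cpu 0), a mask allowing the first n cpus could be built like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return a bitmask with the low n bits set,
 * i.e. a mask that allows cpus 0 .. n-1.
 */
static uint64_t
cpumask_first_n(unsigned int n)
{
	assert(n <= 64);
	if (n == 64)
		return ~(uint64_t)0;	/* avoid an undefined 64-bit shift */
	return ((uint64_t)1 << n) - 1;
}
```

For a six-core machine with all cores at full frequency this gives 0x3f; with only two cores at full frequency it gives 0x3.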
Matthew Dillon [Wed, 2 Feb 2011 07:44:07 +0000 (23:44 -0800)]
kernel - Add kern.usched_global_cpumask
* Add sysctl kern.usched_global_cpumask, a global cpumask that restricts
which cpus userland processes are allowed to run on. This sysctl may be
set dynamically.
* NOTE: This sysctl is intended to be used by powerd. Setting it manually
will only work properly if powerd is not running.
Sepherosa Ziehau [Wed, 2 Feb 2011 04:26:20 +0000 (12:26 +0800)]
pf: Fix typo in pf_mask_del()
Reported-by: Jan Lentfer <Jan.Lentfer@web.de>
Noticed-by: Joe Talbott <josepht@cstone.net>
Sepherosa Ziehau [Wed, 2 Feb 2011 04:14:12 +0000 (12:14 +0800)]
kernel: Fix LINT building for recent rn_inithead API change
Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
Matthew Dillon [Tue, 1 Feb 2011 23:57:36 +0000 (15:57 -0800)]
ps - Increase selected field widths
* Increase field widths for STAT, XSTAT, VSZ, RSS, and RSZ to accommodate
more typical program run sizes and to accommodate systems with more than
10 cpus.
Matthew Dillon [Tue, 1 Feb 2011 22:23:29 +0000 (14:23 -0800)]
kernel64 - Greatly reduce memory probe times, remove basemem calculation
* Greatly reduce memory probe times by testing in multiples of 128K instead
of multiples of 4K. Also add cpu_mfence() instructions to flush the
cpu store buffer.
This greatly reduces the startup time for x86-64 on monster machines
with lots of memory (tested w/64G).
* Remove the basemem calculation, it is no longer used.
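A back-of-the-envelope sketch of the saving (not the kernel's probe loop; probe_steps is a hypothetical helper) is to count how many locations are tested for a given memory size and stride:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of probe steps needed to walk `bytes`
 * of memory when testing one location every `stride` bytes.
 */
static uint64_t
probe_steps(uint64_t bytes, uint64_t stride)
{
	assert(stride != 0);
	return (bytes + stride - 1) / stride;	/* round up */
}
```

For the 64G test machine this is 16777216 steps at a 4K stride versus 524288 steps at a 128K stride, a factor of 32 fewer.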
Sepherosa Ziehau [Tue, 1 Feb 2011 14:32:52 +0000 (22:32 +0800)]
pc32: Split out isa_intr.h and move isa/intr_machdep.h to include/
Sepherosa Ziehau [Tue, 1 Feb 2011 13:49:34 +0000 (21:49 +0800)]
pc64: Split out isa_intr.h and move isa/intr_machdep.h to include/
Sepherosa Ziehau [Tue, 1 Feb 2011 12:58:26 +0000 (20:58 +0800)]
intr: Add ELCR support
This device controls/shows ICU pin's trigger mode (level/edge).
Currently, it is not enabled by default.
Obtained-from: FreeBSD (jhb@freebsd.org)
Sepherosa Ziehau [Tue, 1 Feb 2011 08:09:26 +0000 (16:09 +0800)]
radix: Fix the non-per-cpu radix tree usage.
- Install a mask radix tree in each radix tree; the mask radix tree itself
does not have a mask radix tree (of course).
- rn_cpumaskhead() is added to provide the global per-cpu mask radix tree.
- rn_inithead() requires a mask radix tree as a parameter. A mask radix tree
is initialized by passing NULL. INET/INET6/ATALK pass the mask radix tree
obtained from rn_cpumaskhead(), i.e. the old semantics.
- pf(4) now creates its own mask radix tree, and all of its internal radix
trees will use that mask radix tree instead of the global per-cpu mask
radix tree. pf(4) radix tree operations are protected by its own token.
- rn_addmask() requires a mask radix tree, instead of using the global
per-cpu mask radix tree. For most cases, the caller has access to the
radix tree that has a mask radix tree installed. For _rtmask_lookup(),
which is always called from route_output(), we could safely assume that
global per-cpu mask radix tree is used.
This is mainly used to fix the following bug concerning global per-cpu
mask radix tree:
Before this commit, pf(4) could create a mask on CPU0's mask radix tree,
while the deletion of the mask happened on other CPUs, which caused pf(4)'s
radix tree operation to fail (it couldn't locate the mask).
Dragonfly-bug: http://bugs.dragonflybsd.org/issue1969
Root-cause-found-by: Jan Lentfer <Jan.Lentfer@web.de>
Matthew Dillon [Mon, 31 Jan 2011 21:40:55 +0000 (13:40 -0800)]
kernel64 - Fix disabled interrupts during dbg/bpt trap
* Interrupts were left improperly disabled during a dbg or bpt trap.
i386 enables interrupts for these traps. x86-64 needs to as well
or it will hit an assertion in lwkt_switch() under certain circumstances.
* Make debug code in lwkt_switch() also require INVARIANTS to function.
NOTE: This is temporary debug code and should be removed at some point
after 48-core testing is complete.
Matthew Dillon [Mon, 31 Jan 2011 21:09:39 +0000 (13:09 -0800)]
kernel - Fix SMP assumption of at least 2 cpus w/TCP
* TCP was assuming at least 2 cpus on SMP builds and would panic if only
1 cpu was available.
* Fix by testing ncpus even for SMP builds.
* This problem did not affect UP builds.
Matthew Dillon [Mon, 31 Jan 2011 21:08:26 +0000 (13:08 -0800)]
kernel - Fix stall after mountroot w/ SMP & ncpus == 1
* Fix a degenerate case for SMP builds when ncpus == 1. This affects
both the vkernel and the normal kernel (when a SMP kernel is booted
on a non-SMP box which has a LAPIC).
The init process was not being scheduled properly.
Matthew Dillon [Mon, 31 Jan 2011 07:56:13 +0000 (23:56 -0800)]
vkernel - Fix lwbuf build error for vkernel64
* Fix a compile error that was preventing vkernel64's from building.
Peter Avalos [Mon, 31 Jan 2011 07:37:38 +0000 (21:37 -1000)]
Turn off all warnings when compiling gdtoa sources.
Obtained-from: FreeBSD
Peter Avalos [Mon, 31 Jan 2011 06:28:20 +0000 (20:28 -1000)]
Merge branch 'vendor/GDTOA'
Peter Avalos [Mon, 31 Jan 2011 06:24:13 +0000 (20:24 -1000)]
Remove unneeded files from contrib/gdtoa.
Add a README.DELETED while I'm here.
Sepherosa Ziehau [Mon, 31 Jan 2011 05:55:00 +0000 (13:55 +0800)]
kernel options: Add MCLSHIFT
So on the system w/o wireless NICs, we could configure MCLBYTES to 2K
(options MCLSHIFT=11) instead of the default 4K
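The relationship between the two values can be sketched as follows, assuming the usual BSD convention that MCLBYTES is 1 << MCLSHIFT. The helper cluster_bytes is hypothetical:

```c
#include <assert.h>

/*
 * Illustrative only: mbuf cluster size in bytes for a given shift,
 * mirroring the assumed relationship MCLBYTES = 1 << MCLSHIFT.
 */
static unsigned int
cluster_bytes(unsigned int mclshift)
{
	assert(mclshift < 32);
	return 1u << mclshift;
}
```

A shift of 11 gives 2048-byte clusters, while a shift of 12 corresponds to the 4K default mentioned above.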
Sepherosa Ziehau [Mon, 31 Jan 2011 05:43:24 +0000 (13:43 +0800)]
netisr barrier: Prevent netisr_barrier_dispatch() from false wakeup
- Change wait states into wait flags, only test NOTDONE flag when
being woken up.
- Simplify wakeup logic.
With-help-from: dillon@
Peter Avalos [Mon, 31 Jan 2011 05:00:55 +0000 (19:00 -1000)]
Sepherosa Ziehau [Mon, 31 Jan 2011 03:09:35 +0000 (11:09 +0800)]
netisr barrier: Avoid lockless wakeup/tsleep race
Add a waiting state (NETISR_BR_WAITDONE), before it is set wakeup()
will not be called. And use atomic_cmpset_int() to do the state
transition.
With-help-from: dillon@
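One way to picture the state transition is the sketch below. It uses C11 atomics in place of the kernel's atomic_cmpset_int(), and the state names other than the WAITDONE idea, as well as both helper functions, are hypothetical; it is not the netisr code itself:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { BR_NOTDONE, BR_WAITDONE, BR_DONE };	/* hypothetical states */

/*
 * Waiter side: announce the intent to sleep.
 * Returns true if the caller should go to sleep, false if the work
 * was already finished and no sleep is needed.
 */
static bool
waiter_prepare(atomic_int *state)
{
	int expected = BR_NOTDONE;

	return atomic_compare_exchange_strong(state, &expected, BR_WAITDONE);
}

/*
 * Finisher side: mark the work as done.
 * Returns true only if a waiter had already announced itself, i.e.
 * only then does a wakeup need to be issued.
 */
static bool
finisher_done(atomic_int *state)
{
	int old = atomic_exchange(state, BR_DONE);

	assert(old != BR_DONE);		/* completion happens once */
	return old == BR_WAITDONE;
}
```

Because both sides change the state atomically, the finisher never issues a wakeup before a waiter has announced itself, and a waiter never goes to sleep after the work has already completed.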
Sepherosa Ziehau [Fri, 28 Jan 2011 06:43:05 +0000 (14:43 +0800)]
netisr: Make sure that netisr barrier's done is globally visible
Sepherosa Ziehau [Tue, 25 Jan 2011 05:35:11 +0000 (13:35 +0800)]
netisr: Make netisr barrier's done and cpumask volatile
So that the assignment to these two variables and the following wakeup()
will not get reordered.
Sepherosa Ziehau [Mon, 24 Jan 2011 08:48:04 +0000 (16:48 +0800)]
tcp6: Set TF_SYNCACHE properly in tcp6_usr_listen()
Sepherosa Ziehau [Mon, 24 Jan 2011 07:52:45 +0000 (15:52 +0800)]
tcp: Make listen(2) socket close(2) MPSAFE
- Cleanup the syncache entries on each CPU for this inp.
- Before whacking the inp, we unhook the inp wildcard hash and cleanup
its syncache entries by using a synchronized message.
- Fix up comment.
Reported-by: pavalos@
DragonFly-bug: http://bugs.dragonflybsd.org/issue1960
Sepherosa Ziehau [Mon, 24 Jan 2011 07:39:54 +0000 (15:39 +0800)]
tcp_usr_listen: Use domsg when duplicate listen socket's inp wildcard hash
This makes sure that each protocol thread sees the socket when
tcp_usr_listen() returns.
Sepherosa Ziehau [Fri, 21 Jan 2011 08:14:03 +0000 (16:14 +0800)]
tcp: Don't abuse TF_SYNCACHE to ill-optimize syncache_destroy()
- We now turn on TF_SYNCACHE when listen(2) is called.
- When a listen(2) socket is about to be close(2)d, the syncache list on the
current CPU is thoroughly iterated and all related syncache entries are
marked to be dropped instead of only the first one.
Sepherosa Ziehau [Fri, 21 Jan 2011 05:55:36 +0000 (13:55 +0800)]
udp6: Protect udbinfo by udbinfo barrier
Sepherosa Ziehau [Fri, 21 Jan 2011 05:30:19 +0000 (13:30 +0800)]
inpcb: Save UDP inpcb into temporary memory during in_pcblist
The temporary memory is used later to do the SYSCTL_OUT without
the udbinfo serializer being held. Mainly to avoid deadlock
triggered by holding serializer and copyout.
Reminded-by: dillon@
Sepherosa Ziehau [Fri, 21 Jan 2011 02:44:35 +0000 (10:44 +0800)]
udp_getcred: Release serializer when doing SYSCTL_OUT
Mainly to avoid deadlock during copyout
Reminded-by: dillon@
Sepherosa Ziehau [Thu, 23 Dec 2010 08:03:08 +0000 (16:03 +0800)]
udp: pcb list/hashtable protection stage 2/2
- Use serializer to protect pcb list/hashtable iteration not running
in netisrs.
- Don't use marker pcb, so except for the functions running in netisr0,
no other functions will alter pcb list.
Sepherosa Ziehau [Thu, 23 Dec 2010 08:01:41 +0000 (16:01 +0800)]
inpcb: Add pcblist sysctl helper function w/o using marker inpcb
Sepherosa Ziehau [Thu, 23 Dec 2010 05:27:51 +0000 (13:27 +0800)]
udp: pcb list/hashtable protection stage 1/2
Use the netisr barrier to make sure that no netisr is iterating the pcb
list or hashtable when a pcb is being added or removed.
Add assertion that all UDP pru functions run in netisr0.
Sepherosa Ziehau [Thu, 23 Dec 2010 05:12:39 +0000 (13:12 +0800)]
netisr: Add netisr barrier which stalls all netisrs
netisr_barrier_set()
Set a netisr barrier, which stalls all netisr. Currently it must be
called from netisr0.
netisr_barrier_rem()
Remove the netisr barrier, which unstalls all netisr. Currently it
must be called from netisr0.
These interfaces could be used to work out a lockless pcb lookup or
iteration (on network hotpath e.g. input/output) at the cost of
relatively expensive pcb adding and removing (e.g. connect(2)).
Matthew Dillon [Mon, 31 Jan 2011 01:39:24 +0000 (17:39 -0800)]
kernel - Revert last commit for a better upcoming fix
* Revert this fix, a better one is going to be committed soon.
Matthew Dillon [Mon, 31 Jan 2011 00:19:40 +0000 (16:19 -0800)]
kernel - Fix syncache vs close(listen_socket) race
* Attempt to fix a race where a listen socket is closed with an active
syncache. The tcpcb is detached prior to the syncache being destroyed
resulting in a race where a new incoming connection can complete and
attempt to dive the listen socket's tcpcb.
* Detach the tcpcb after the syncache is destroyed rather than before.
Sascha Wildner [Sun, 30 Jan 2011 21:51:42 +0000 (22:51 +0100)]
libc: Remove some unneeded inclusions of <sys/cdefs.h>.
Sascha Wildner [Sun, 30 Jan 2011 21:51:27 +0000 (22:51 +0100)]
Fix up <utmp.h> and <utmpx.h> for C++ programs.
__BEGIN_DECLS and __END_DECLS are absolutely needed around prototypes
so the functions can be called from C++ code (see the definition of
__BEGIN_DECLS in <sys/cdefs.h>).
While here, put non-standard stuff in __BSD_VISIBLE instead of just
noting it in a comment.
This fixes at least x11/rxvt-unicode.
Tested-by: tuxillo
Matthew Dillon [Sun, 30 Jan 2011 21:44:11 +0000 (13:44 -0800)]
libc - Fix bogus pthread_getspecific() return value due to bug in nmalloc
* nmalloc was calling pthread_setspecific() prior to calling
pthread_key_create(), causing it to use key 0 which might already
have been allocated for other purposes.
* Reorder initializations in _nmalloc_thr_init() to solve the problem.
* This also solves certain application crashes (mail/milter-greylist).
Reported-by: Francois Tigeot <ftigeot@wolfpond.org>
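The ordering requirement can be sketched with the standard pthread key API. The names thr_key, thr_init and thr_value are hypothetical and this is not the nmalloc code; a real implementation would also guard the one-time key creation, for example with pthread_once():

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t thr_key;	/* hypothetical per-thread key */
static int thr_key_ready;	/* set once the key has been created */

/*
 * Create the key first, and only then store the per-thread value.
 * Using the key before pthread_key_create() would operate on an
 * uninitialized key value. Returns 0 on success, -1 on failure.
 */
static int
thr_init(void *value)
{
	if (!thr_key_ready) {
		if (pthread_key_create(&thr_key, NULL) != 0)
			return -1;
		thr_key_ready = 1;
	}
	return pthread_setspecific(thr_key, value) == 0 ? 0 : -1;
}

/* Fetch the value stored for the calling thread. */
static void *
thr_value(void)
{
	assert(thr_key_ready);
	return pthread_getspecific(thr_key);
}
```

With the key created first, pthread_getspecific() returns exactly the pointer that was stored for the calling thread.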
Sepherosa Ziehau [Sun, 30 Jan 2011 09:44:26 +0000 (17:44 +0800)]
icu: Split out icu/icu.c
Sepherosa Ziehau [Sun, 30 Jan 2011 08:31:29 +0000 (16:31 +0800)]
icu: Put ICU_IMR_OFFSET into machine_base/icu/icu.h
Sepherosa Ziehau [Sun, 30 Jan 2011 06:33:52 +0000 (14:33 +0800)]
ioapic/icu: Rework PIC selection code
- In the early stage, before I/O APIC is detected and setup, ICU controls
interrupts, so all IDT entries should be set to ICU's intr code.
- Switch to I/O APIC only after ICU is completely disconnected, i.e. after
IMCR is set and LINT0 is masked.
Matthew Dillon [Sun, 30 Jan 2011 02:59:32 +0000 (18:59 -0800)]
kernel - Have the crypto subsystem use the new mpipe_alloc_callback() API
* The crypto subsystem can deadlock on blocked mpipe operations while
holding a CAM lockmgr lock.
* Change the subsystem to use the new mpipe_alloc_callback() mechanism,
avoiding any deadlocks.
Reported-by: Peter Avalos <peter@theshell.com>
Matthew Dillon [Sun, 30 Jan 2011 02:57:09 +0000 (18:57 -0800)]
kernel - Add callback API for mpipe
* Add a callback API for mpipe which uses a dedicated kthread,
allowing clients to avoid deadlocks related to held locks during
strategy calls.
* Add mpipe_alloc_callback(). Use of this function also requires
that MPF_CALLBACK be supplied to mpipe_init().
* Add mpipe_wait(). This function may be used for clients which
wish to roll their own mpipe retry loop (or already have their
own thread(s) to deal with it in a safe manner).
Sepherosa Ziehau [Sun, 30 Jan 2011 02:36:11 +0000 (10:36 +0800)]
cpu_sfence: Don't use sfence
As suggested by dillon@, it will create many unnecessary stalls