4 years agokernel - Add per-process token, adjust signal code to use it.
Matthew Dillon [Fri, 11 Feb 2011 22:47:58 +0000 (14:47 -0800)]
kernel - Add per-process token, adjust signal code to use it.

* Add proc->p_token and use it to interlock signal-related operations.

* Remove the use of proc_token in various signal paths.  Note that proc_token
  is still used in conjunction with pfind().

* Remove the use of proc_token in CURSIG*()/issignal() sequences, which
  also removes its use in the tsleep path and the syscall path.  p->p_token
  is used instead.

* Move the automatic interlock in the tsleep code to before the CURSIG code,
  fixing a rare race where a SIGCHLD could race against a parent process
  in sigsuspend().  Also acquire p->p_token here to interlock LWP_SINTR

4 years agoHAMMER Utility - Change the minimum UNDO/REDO FIFO from 100M to 500M
Matthew Dillon [Thu, 10 Feb 2011 22:32:25 +0000 (14:32 -0800)]
HAMMER Utility - Change the minimum UNDO/REDO FIFO from 100M to 500M

* The minimum undo/redo fifo really needs to be larger.  Don't play
  around, make it 500M.  People who want to run HAMMER on small hard
  drives or images need to be cognizant of the requirement.

* This partially solves (only partially) a FIFO overflow condition.
  Effectively, the buffered operations that HAMMER allows to build up in
  the kernel can be complex enough to overflow a minimally-sized on-media
  UNDO/REDO FIFO.  Upping the requirement makes the case less likely.

  The remainder of the resolution will require some fixes in the
  HAMMER VFS code.

Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
4 years agoMerge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Matthew Dillon [Thu, 10 Feb 2011 21:22:51 +0000 (13:22 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly

4 years agokernel - Greatly reduce usched_bsd4_decay default
Matthew Dillon [Thu, 10 Feb 2011 21:15:51 +0000 (13:15 -0800)]
kernel - Greatly reduce usched_bsd4_decay default

* Reduce the usched_bsd4_decay default to 1.  It may be removed entirely
  in the future.

* This improves the dynamic priority handling by reducing ad-hoc estcpu
  decreases from the 1-second interval clock.  The tsleep code handles
  this a lot better already and the ad-hoc decreases don't do a good job
  handling the case where there are a very large number of runnable
  cpu-bound processes (because they don't actually get a lot of cpu but
  still eat a large proportion of the scheduled time in aggregate).

  Tested with blogbench during stage 1.  Prior to this fix the 100+ blogbench
  threads were being dropped down to almost realtime priorities even though
  they remained in a 100% 'R'un state.

* Also reduce the amount the parent process of a fork() is docked for cpu
  due to the fork.  The value was high enough that interactive sessions were
  being pushed up to batch priorities with only a moderate number of forks
  and not decaying quickly enough to stabilize.

  The child process is docked the same as before (handling the fork chaining

  Tested with blogbench and parallel makes of /usr/src/lib/libc.  The
  blogbench uniformly increases to batch priority and didn't need the
  higher boost the old values gave it while the parallel compile's fork
  chaining gave it a good shove towards batch priority while the repeated
  forks slowly pushed the higher level make and /bin/sh's to more batch-like

4 years agohammer.8: Note that 'recover' needs -f.
Sascha Wildner [Thu, 10 Feb 2011 11:37:10 +0000 (12:37 +0100)]
hammer.8: Note that 'recover' needs -f.

4 years agoacpi(4): Fix a bug in acpi_cpu_cstate.c (we have to write, and not to read).
Sascha Wildner [Wed, 9 Feb 2011 16:25:06 +0000 (17:25 +0100)]
acpi(4): Fix a bug in acpi_cpu_cstate.c (we have to write, and not to read).

Introduced with 10f976749fd9ad2e8642ea80ce533f7416910a65. The commit message
said "Sync ACPI with FreeBSD 7.2", even though FreeBSD 7.2 doesn't seem to
have this code at all, so I'm not sure about what the idea behind that
change was. I'm guessing it is a typo, since newer FreeBSDs call
AcpiWriteBitRegister() here too.

Reported-by: Andrea Magliano <masterblaster@tiscali.it>
4 years agoacpi/madt: Add definition for interrupt source override MADT entry
Sepherosa Ziehau [Wed, 9 Feb 2011 14:42:33 +0000 (22:42 +0800)]
acpi/madt: Add definition for interrupt source override MADT entry

4 years agoUpgrade to OpenSSL-1.0.0d.
Peter Avalos [Wed, 9 Feb 2011 05:13:12 +0000 (19:13 -1000)]
Upgrade to OpenSSL-1.0.0d.

This fixes CVE-2011-0014.

4 years agoMerge branch 'vendor/OPENSSL'
Peter Avalos [Wed, 9 Feb 2011 05:02:56 +0000 (19:02 -1000)]
Merge branch 'vendor/OPENSSL'

4 years agoImport OpenSSL-1.0.0d.
Peter Avalos [Wed, 9 Feb 2011 04:59:57 +0000 (18:59 -1000)]
Import OpenSSL-1.0.0d.

5 years agoBump .Dd in pkg_radd.1 manpage and make pkg_search work again.
Sascha Wildner [Tue, 8 Feb 2011 14:04:33 +0000 (15:04 +0100)]
Bump .Dd in pkg_radd.1 manpage and make pkg_search work again.

5 years agoDon't remove /etc/settings.conf via 'make upgrade' and rename it instead.
Sascha Wildner [Tue, 8 Feb 2011 13:59:53 +0000 (14:59 +0100)]
Don't remove /etc/settings.conf via 'make upgrade' and rename it instead.

5 years agoSync zoneinfo database with tzdata2011b from elsie.nci.nih.gov
Sascha Wildner [Mon, 7 Feb 2011 19:06:33 +0000 (20:06 +0100)]
Sync zoneinfo database with tzdata2011b from elsie.nci.nih.gov

northamerica:   8.39 -> 8.40
zone.tab:       8.38 -> 8.40

* northamerica: Add America/North_Dakota/Beulah (Mercer County,
    North Dakota, moved from Mountain to Central time at the
    end of DST in 2010). Also, use the actual version number
    rather than "%W%".

* zone.tab: Add America/North_Dakota/Beulah. Also, update
    Indonesian location names (with the old names retained in

5 years agoRemove useless belt and suspenders include guards in some of our headers.
Sascha Wildner [Mon, 7 Feb 2011 15:42:58 +0000 (16:42 +0100)]
Remove useless belt and suspenders include guards in some of our headers.

For these headers:


All these headers #define _CPU_... and not _MACHINE_... even though they
are in /usr/include/machine. And the headers themselves have include
guards already. So there's little point in having them around the actual
#include additionally.

5 years agocat: Clean up whitespace.
Peter Avalos [Sun, 6 Feb 2011 00:22:40 +0000 (14:22 -1000)]
cat: Clean up whitespace.

5 years agoUpdating pkg_radd man page, pkg_search to reflect new config file.
Justin C. Sherrill [Mon, 7 Feb 2011 01:19:57 +0000 (17:19 -0800)]
Updating pkg_radd man page, pkg_search to reflect new config file.

5 years agoMerge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Justin C. Sherrill [Mon, 7 Feb 2011 00:44:26 +0000 (16:44 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly


5 years agoMove pkg_radd config to a more obvious name; make sure settings.conf gets
Justin C. Sherrill [Mon, 7 Feb 2011 00:36:29 +0000 (16:36 -0800)]
Move pkg_radd config to a more obvious name; make sure settings.conf gets
cleaned out on upgrade, and stick a warning in UPGRADING so nobody
(hopefully) gets surprised when pkg_radd starts downloading from
mirror-master again.

5 years agowmake.1: Describe how to install stuff built with wmake.
Sascha Wildner [Mon, 7 Feb 2011 00:07:18 +0000 (01:07 +0100)]
wmake.1: Describe how to install stuff built with wmake.

Pointed-out-by: corecode
5 years agosecure/lib: Fix building of some cases that include lib/Makefile.inc.
Sascha Wildner [Sun, 6 Feb 2011 21:55:11 +0000 (22:55 +0100)]
secure/lib: Fix building of some cases that include lib/Makefile.inc.

5 years agolib: Move the definition of WARNS into lib/Makefile.inc.
Sascha Wildner [Sun, 6 Feb 2011 21:09:01 +0000 (22:09 +0100)]
lib: Move the definition of WARNS into lib/Makefile.inc.

5 years agolibncp: Fix format.
Sascha Wildner [Sun, 6 Feb 2011 21:06:59 +0000 (22:06 +0100)]
libncp: Fix format.

5 years agolibc/csu: Include <machine/tls.h> for some prototypes.
Sascha Wildner [Sun, 6 Feb 2011 21:05:59 +0000 (22:05 +0100)]
libc/csu: Include <machine/tls.h> for some prototypes.

5 years agoioapic/icu: Add irqmap
Sepherosa Ziehau [Sun, 6 Feb 2011 14:06:21 +0000 (22:06 +0800)]
ioapic/icu: Add irqmap

5 years agointr: Enable ELCR by default
Sepherosa Ziehau [Sun, 6 Feb 2011 11:19:24 +0000 (19:19 +0800)]
intr: Enable ELCR by default

5 years agolibc: Raise WARNS to 6.
Sascha Wildner [Sat, 5 Feb 2011 20:44:26 +0000 (21:44 +0100)]
libc: Raise WARNS to 6.

5 years agokernel - adjust devfs mount point according to init_chroot loader variable
YONETANI Tomokazu [Wed, 2 Feb 2011 09:45:16 +0000 (18:45 +0900)]
kernel - adjust devfs mount point according to init_chroot loader variable

/dev is mounted before /sbin/init is executed, so when init chroot()s itself
and executes the shell, the chroot()'ed space doesn't contain the mounted
devfs and the shell gets stuck.

5 years ago<cpu/ieeefp.h>: Use single-underscore instead of double.
Peter Avalos [Sat, 5 Feb 2011 06:52:56 +0000 (20:52 -1000)]
<cpu/ieeefp.h>:  Use single-underscore instead of double.

This prevents conflicts for when someone decides to #include <fenv.h>
and <cpu/ieeefp.h>.

5 years ago<cpu/ieeefp.h>: Sync i386 with x86_64.
Peter Avalos [Sat, 5 Feb 2011 04:24:43 +0000 (18:24 -1000)]
<cpu/ieeefp.h>: Sync i386 with x86_64.

The i386 fpget*() and fpset*() functions were obfuscated, and this
change reduces complexity and makes all the fp*() functions inline.

Update comments.

Obtained-from: FreeBSD

5 years agokern - aio - Add missing flags to objcache_get()
Samuel J. Greear [Sat, 5 Feb 2011 08:25:39 +0000 (08:25 +0000)]
kern - aio - Add missing flags to objcache_get()

5 years agokern - Convert NFS from zalloc to objcache
Samuel J. Greear [Sat, 5 Feb 2011 04:17:41 +0000 (04:17 +0000)]
kern - Convert NFS from zalloc to objcache

Sponsored-By: Google Code-In
5 years agokern - Convert crypto from zalloc to objcache
Samuel J. Greear [Sat, 5 Feb 2011 04:17:15 +0000 (04:17 +0000)]
kern - Convert crypto from zalloc to objcache

Sponsored-By: Google Code-In
5 years agokern - Convert aio from zalloc to objcache
Samuel J. Greear [Sat, 5 Feb 2011 04:16:45 +0000 (04:16 +0000)]
kern - Convert aio from zalloc to objcache

Sponsored-By: Google Code-In
5 years agokern - Convert ufs dirhash from zalloc to objcache
Samuel J. Greear [Sat, 5 Feb 2011 04:15:11 +0000 (04:15 +0000)]
kern - Convert ufs dirhash from zalloc to objcache

Sponsored-By: Google Code-In
5 years agoHAMMER VFS - Fix deadlock which can occur under severe filesystem pressure
Matthew Dillon [Fri, 4 Feb 2011 19:55:02 +0000 (11:55 -0800)]
HAMMER VFS - Fix deadlock which can occur under severe filesystem pressure

* Inode reflushes (an fsync occurring while the inode is still queued for
  a prior fsync) were not ensuring that the inode got pushed to the backend

  This could lead to deadlocks when the process trying to issue the flush
  is the syncer itself.

* The problem typically occurred under filesystem loads where a large number
  of inodes (e.g. due to a bulk build) are being flushed at once, and the
  flush is unable to finish running before the next syncer cycle comes

5 years agokernel - Add missing vm_page_wakeup()
Matthew Dillon [Fri, 4 Feb 2011 19:51:26 +0000 (11:51 -0800)]
kernel - Add missing vm_page_wakeup()

* Fix a long-standing issue where a VM page is improperly left PG_BUSY
  when vm_page_try_to_cache() races the Modified bit in underlying PTEs.

* This could only occur during periods of severe memory pressure and
  would typically lead to a program getting stuck in "pgtblk".

Reported-by: Peter Avalos <peter@theshell.com>
5 years agops - Adjust field widths
Matthew Dillon [Fri, 4 Feb 2011 18:38:39 +0000 (10:38 -0800)]
ps - Adjust field widths

* Reduce VSZ, RSS, RSZ by one (it's still wider than the original but now
  only 6 digits).  We will have to come up with another way to represent
  process sizes >= 1GB.

* Reduce the tty column to 2 chars

5 years agops - Support unix98 ttys for -t
Matthew Dillon [Fri, 4 Feb 2011 18:37:30 +0000 (10:37 -0800)]
ps - Support unix98 ttys for -t

* Support numeric ttys, e.g. ps xt5 specifies tty /dev/pts/5
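
The mapping from a numeric argument to a pty path can be sketched as
follows.  This is an illustration only, not the actual ps source:
numeric_tty_path() is a made-up helper name, and the only behavior taken
from the commit message is that "5" selects /dev/pts/5.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from ps): map a purely numeric -t
 * argument to a unix98 pty path under /dev/pts/.
 * Returns 0 on success, -1 if the argument is empty, not numeric,
 * or the resulting path does not fit in buf.
 */
int
numeric_tty_path(const char *arg, char *buf, size_t len)
{
	size_t i;
	int n;

	assert(arg != NULL && buf != NULL);
	if (arg[0] == '\0')
		return (-1);
	for (i = 0; arg[i] != '\0'; i++) {
		if (!isdigit((unsigned char)arg[i]))
			return (-1);
	}
	/* snprintf() returns the untruncated length; >= len means truncation. */
	n = snprintf(buf, len, "/dev/pts/%s", arg);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (0);
}
```

A caller would fall back to the traditional tty name handling whenever
the helper returns -1.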

5 years agovkernel64 - Raise the memory requirements to 64MB.
Antonio Huete Jimenez [Fri, 4 Feb 2011 17:37:28 +0000 (18:37 +0100)]
vkernel64 - Raise the memory requirements to 64MB.

5 years agovkernel - Avoid appending the error message in some cases.
Antonio Huete Jimenez [Fri, 4 Feb 2011 17:16:09 +0000 (18:16 +0100)]
vkernel - Avoid appending the error message in some cases.

5 years agokern - Clarify the description of hw.physmem.
Antonio Huete Jimenez [Fri, 4 Feb 2011 16:08:07 +0000 (17:08 +0100)]
kern - Clarify the description of hw.physmem.

5 years agovm - Correct sysctl output formatting for kvm_size and kvm_free
Antonio Huete [Fri, 4 Feb 2011 11:29:05 +0000 (12:29 +0100)]
vm - Correct sysctl output formatting for kvm_size and kvm_free

5 years agokern - Properly return the number of bytes in hw.physmem sysctl.
Antonio Huete [Fri, 4 Feb 2011 13:34:05 +0000 (14:34 +0100)]
kern - Properly return the number of bytes in hw.physmem sysctl.

- Changed physmem from int to long, now it can hold a
  really high number of pages.
- Return an unsigned long value with appropriate formatting
  for hw.physmem.
- Add description for it.

Based-upon: FreeBSD
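
Why the wider types matter can be shown with a small userland sketch.
It is not the kernel code: physmem_bytes() and SKETCH_PAGE_SIZE are
invented for the example and a 4K page size is assumed.  With that page
size, 524288 pages already come to 2 GiB, one more than INT_MAX.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for the example */

/*
 * Convert a page count to bytes.  The count is a long and the result
 * an unsigned long, mirroring the types named in the commit message.
 * The second assertion guards against overflow in the multiplication.
 */
unsigned long
physmem_bytes(long pages)
{
	assert(pages >= 0);
	assert((unsigned long)pages <= ULONG_MAX / SKETCH_PAGE_SIZE);
	return ((unsigned long)pages * SKETCH_PAGE_SIZE);
}
```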

5 years agokernel - Add options VM_PAGE_DEBUG
Matthew Dillon [Fri, 4 Feb 2011 07:25:41 +0000 (23:25 -0800)]
kernel - Add options VM_PAGE_DEBUG

* Add options VM_PAGE_DEBUG for kernel configs.  This requires a full kernel
  rebuild (if you use the option) and supplies additional information in
  the vm_page structure to help track down problems.

5 years agoieeefp.h: Remove i386 specifics.
Peter Avalos [Tue, 1 Feb 2011 07:59:03 +0000 (21:59 -1000)]
ieeefp.h: Remove i386 specifics.

Move the contents of <machine/floatingpoint.h> to <machine/ieeefp.h> for
i386 to match x86_64.

While I'm here, mark which versions of these files we have for x86_64.

Obtained-from: FreeBSD

5 years agostyle(9): Remove whitespace.
Peter Avalos [Tue, 1 Feb 2011 07:17:02 +0000 (21:17 -1000)]
style(9): Remove whitespace.

5 years agofenv: Explicitly specify sizes for control and status words.
Peter Avalos [Tue, 1 Feb 2011 07:06:52 +0000 (21:06 -1000)]
fenv: Explicitly specify sizes for control and status words.

Obtained-from: FreeBSD

5 years agolibc - fix handling of temporary file used by hash(3)
YONETANI Tomokazu [Thu, 3 Feb 2011 05:29:48 +0000 (14:29 +0900)]
libc - fix handling of temporary file used by hash(3)

This fixes applications using DB_HASH, such as tsort, unexpectedly
trying to open a temporary file in the current directory and failing if
they have no write permission there.

Obtained from FreeBSD, r190485, by delphij:

  db/btree/bt_open.c: check return value of snprintf() and return value
  if the result is truncated.

  db/hash/hash_page.c: use the same way to create temporary file as
  bt_open.c; check snprintf() return value.

  Obtained from:  OpenBSD
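
The snprintf() check mentioned above follows a common pattern, sketched
here under assumptions: build_tmp_path() and the file name used in the
example are hypothetical and are not taken from bt_open.c or
hash_page.c.  snprintf() returns the length the full string would have
had, so a result greater than or equal to the buffer size means the
path was truncated.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<dir>/<name>" in buf.
 * Returns 0 on success, or -1 with errno set to ENAMETOOLONG when the
 * result would not fit (or snprintf() itself reports an error).
 */
int
build_tmp_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n;

	assert(buf != NULL && dir != NULL && name != NULL);
	n = snprintf(buf, len, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= len) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	return (0);
}
```

Rejecting a truncated path matters because the shortened name could
otherwise refer to a different location than the caller intended.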

5 years agoBuild our GCCs with CSTD=gnu89.
Sascha Wildner [Thu, 3 Feb 2011 23:20:47 +0000 (00:20 +0100)]
Build our GCCs with CSTD=gnu89.

This fixes building of the GCCs using gcc44.

Reported-by: Max Herrgard <herrgard@gmail.com> and others
Submitted-by: Max Herrgard <herrgard@gmail.com>
Dragonfly-bug: <http://bugs.dragonflybsd.org/issue1978>

5 years agokernel - Fix physmap base calculation for x86-64
Matthew Dillon [Thu, 3 Feb 2011 18:52:34 +0000 (10:52 -0800)]
kernel - Fix physmap base calculation for x86-64

Reported-by: luxh, tuxillo
5 years agokernel - migrate knote from zone to kmalloc
Nicolas Thery [Fri, 28 Jan 2011 00:55:12 +0000 (01:55 +0100)]
kernel - migrate knote from zone to kmalloc

5 years agoacpi(4): Always compile the files dealing with ACPI_DEBUG into the module.
Sascha Wildner [Wed, 2 Feb 2011 18:04:16 +0000 (19:04 +0100)]
acpi(4): Always compile the files dealing with ACPI_DEBUG into the module.

Before this commit, one had to define ACPI_DEBUG as a make variable to
enable debugging support in the module, such as in:

$ make -DACPI_DEBUG buildkernel

Specifying ACPI_DEBUG in the kernel config alone did not enable it, but
our modules are supposed to honor kernel options. Also this was contrary
to what the manual page says.

So to make this work for ACPI_DEBUG too, we just put all the affected
source files into SRCS and always compile them. #ifdef's in these
source files will take care of enabling/disabling debugging support
so a module compiled without ACPI_DEBUG defined in the kernel or on the
command line will still not have support after this commit (I've checked
with nm(1)).
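
The "always compile, let #ifdef decide" arrangement looks roughly like
the sketch below.  SKETCH_DEBUG and both function names are made up for
the example and stand in for a kernel option such as ACPI_DEBUG; none
of this is taken from the ACPI sources.

```c
#include <assert.h>
#include <stdio.h>

/*
 * The file is always part of the build; the preprocessor decides
 * whether the debugging body exists in the resulting object.
 */
int
sketch_debug_enabled(void)
{
#ifdef SKETCH_DEBUG
	return (1);
#else
	return (0);
#endif
}

void
sketch_debug_log(const char *msg)
{
	assert(msg != NULL);
#ifdef SKETCH_DEBUG
	fprintf(stderr, "debug: %s\n", msg);
#else
	(void)msg;	/* compiled out: nothing is printed */
#endif
}
```

Built without -DSKETCH_DEBUG, the logging call does nothing, which
corresponds to a module built without the option still lacking
debugging support.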

The only change for someone not using ACPI_DEBUG is a little bit of
additional buildkernel time.

FWIW, it is the same way in FreeBSD, too.

Reported-by: Andrea Magliano <masterblaster@tiscali.it>
5 years agointr: Further delay MachIntrABI.finalize()
Sepherosa Ziehau [Wed, 2 Feb 2011 15:55:43 +0000 (23:55 +0800)]
intr: Further delay MachIntrABI.finalize()

This only affects the SMP case.  For ICU, it will be better if finalize()
is called after IMCR detection is done, though on most modern systems
IMCR does not exist.  For I/O APIC, finalize() _should_ be called
after BSP's LAPIC is initialized, since it alters BSP LAPIC's LINT0
and LINT1 configuration.

Add stabilize() ABI to MachIntrABI which is only implemented by ICU
currently and this ABI is called in the place where finalize() used
to be called.

5 years agoioapic: File renaming (apic -> ioapic)
Sepherosa Ziehau [Wed, 2 Feb 2011 13:44:25 +0000 (21:44 +0800)]
ioapic: File renaming (apic -> ioapic)

5 years agoicu: Add icu/icu_abi.h
Sepherosa Ziehau [Wed, 2 Feb 2011 13:20:48 +0000 (21:20 +0800)]
icu: Add icu/icu_abi.h

Mainly to avoid manually declaring MachIntrABI_ICU

5 years agoioapic: Function/variable renaming (apic -> ioapic)
Sepherosa Ziehau [Wed, 2 Feb 2011 12:46:18 +0000 (20:46 +0800)]
ioapic: Function/variable renaming (apic -> ioapic)

5 years agoapic_initialize: Adjust comment
Sepherosa Ziehau [Wed, 2 Feb 2011 09:41:17 +0000 (17:41 +0800)]
apic_initialize: Adjust comment

5 years agopowerd - Do a more sophisticated domain scan, use kern.usched_global_cpumask
Matthew Dillon [Wed, 2 Feb 2011 07:59:22 +0000 (23:59 -0800)]
powerd - Do a more sophisticated domain scan, use kern.usched_global_cpumask

* Do a more sophisticated domain scan, cpu domains do not necessarily start
  at 0.

* Handle the case where multiple cpus may belong to a single domain.

* Dynamically adjust kern.usched_global_cpumask to the number of cpus we
  are running at max frequency, leaving the remaining cpus set at their
  lowest frequency and left mostly idle.

* Tested on the 48-core monster and phenom x 6.
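
Turning "number of cpus running at max frequency" into a mask can be
sketched as below.  The 64-bit mask type, the choice of the
lowest-numbered cpus and the name mask_for_cpus() are assumptions of
this example, not details taken from powerd.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return a mask with the low ncpus bits set, i.e. the first ncpus
 * cpus are allowed and all others are excluded.
 */
uint64_t
mask_for_cpus(unsigned int ncpus)
{
	assert(ncpus <= 64);
	if (ncpus == 64)
		return (UINT64_MAX);	/* avoid shifting by the type width */
	return ((UINT64_C(1) << ncpus) - 1);
}
```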

5 years agokernel - Add kern.usched_global_cpumask
Matthew Dillon [Wed, 2 Feb 2011 07:44:07 +0000 (23:44 -0800)]
kernel - Add kern.usched_global_cpumask

* Add sysctl kern.usched_global_cpumask, a global cpumask that restricts
  which cpus userland processes are allowed to run on.  This sysctl may be
  set dynamically.

* NOTE: This sysctl is intended to be used by powerd.  Setting it manually
  will only work properly if powerd is not running.
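
How a global mask restricts cpu selection can be illustrated with two
small helpers.  They are hypothetical (the scheduler's real data types
and function names are not shown in this log) and assume a 64-bit mask
in which bit N stands for cpu N.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if userland may run on the given cpu, 0 otherwise. */
int
cpu_allowed(uint64_t global_mask, unsigned int cpu)
{
	assert(cpu < 64);
	return ((int)((global_mask >> cpu) & 1));
}

/* Return the lowest-numbered allowed cpu, or -1 if the mask is empty. */
int
first_allowed_cpu(uint64_t global_mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (cpu_allowed(global_mask, cpu))
			return ((int)cpu);
	}
	return (-1);
}
```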

5 years agopf: Fix typo in pf_mask_del()
Sepherosa Ziehau [Wed, 2 Feb 2011 04:26:20 +0000 (12:26 +0800)]
pf: Fix typo in pf_mask_del()

Reported-by: Jan Lentfer <Jan.Lentfer@web.de>
Noticed-by: Joe Talbott <josepht@cstone.net>
5 years agokernel: Fix LINT building for recent rn_inithead API change
Sepherosa Ziehau [Wed, 2 Feb 2011 04:14:12 +0000 (12:14 +0800)]
kernel: Fix LINT building for recent rn_inithead API change

Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
5 years agops - Increase selected field widths
Matthew Dillon [Tue, 1 Feb 2011 23:57:36 +0000 (15:57 -0800)]
ps - Increase selected field widths

* Increase field widths for STAT, XSTAT, VSZ, RSS, and RSZ to accommodate
  more typical program run sizes and to accommodate systems with more than
  10 cpus.

5 years agokernel64 - Greatly reduce memory probe times, remove basemem calculation
Matthew Dillon [Tue, 1 Feb 2011 22:23:29 +0000 (14:23 -0800)]
kernel64 - Greatly reduce memory probe times, remove basemem calculation

* Greatly reduce memory probe times by testing in multiples of 128K instead
  of multiples of 4K.  Also add cpu_mfence() instructions to flush the
  cpu store buffer.

  This greatly reduces the startup time for x86-64 on monster machines
  with lots of memory (tested w/64G).

* Remove the basemem calculation, it is no longer used.
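
The effect of the larger stride is simple arithmetic, worked through in
the sketch below.  It assumes one test per stride, which is a
simplification of whatever the probe loop really does, and
probe_count() is an invented name.  For 64G of memory a 4K stride needs
16777216 tests while a 128K stride needs 524288, a factor of 32 fewer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of probes needed to cover mem_bytes when testing once per
 * stride_bytes, rounding up for a partial final stride.
 */
uint64_t
probe_count(uint64_t mem_bytes, uint64_t stride_bytes)
{
	assert(stride_bytes != 0);
	return ((mem_bytes + stride_bytes - 1) / stride_bytes);
}
```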

5 years agopc32: Split out isa_intr.h and move isa/intr_machdep.h to include/
Sepherosa Ziehau [Tue, 1 Feb 2011 14:32:52 +0000 (22:32 +0800)]
pc32: Split out isa_intr.h and move isa/intr_machdep.h to include/

5 years agopc64: Split out isa_intr.h and move isa/intr_machdep.h to include/
Sepherosa Ziehau [Tue, 1 Feb 2011 13:49:34 +0000 (21:49 +0800)]
pc64: Split out isa_intr.h and move isa/intr_machdep.h to include/

5 years agointr: Add ELCR support
Sepherosa Ziehau [Tue, 1 Feb 2011 12:58:26 +0000 (20:58 +0800)]
intr: Add ELCR support

This device controls/shows ICU pin's trigger mode (level/edge).

Currently, it is not enabled by default.

Obtained-from: FreeBSD (jhb@freebsd.org)

5 years agoradix: Fix the non-per-cpu radix tree usage.
Sepherosa Ziehau [Tue, 1 Feb 2011 08:09:26 +0000 (16:09 +0800)]
radix: Fix the non-per-cpu radix tree usage.

- Install a mask radix tree in each radix tree.  A mask radix tree itself
  does not have a mask radix tree (of course).
- rn_cpumaskhead() is added to provide the global per-cpu mask radix tree.
- rn_inithead() requires a mask radix tree as a parameter.  A mask radix tree
  is initialized by passing NULL.  INET/INET6/ATALK pass the mask radix tree
  obtained from rn_cpumaskhead(), i.e. the old semantics.
- pf(4) now creates its own mask radix tree, and all of its internal radix
  trees will use that mask radix tree instead of the global per-cpu mask
  radix tree.  pf(4) radix tree operations are protected by its own token.
- rn_addmask() requires a mask radix tree, instead of using the global
  per-cpu mask radix tree.  For most cases, the caller has access to the
  radix tree that has a mask radix tree installed.  For _rtmask_lookup(),
  which is always called from route_output(), we could safely assume that
  global per-cpu mask radix tree is used.

This is mainly used to fix the following bug concerning global per-cpu
mask radix tree:
Before this commit, pf(4) could create a mask on CPU0's mask radix tree,
while the deletion of the mask happened on other CPUs, which caused pf(4)'s
radix tree operations to fail (the mask could not be located).

Dragonfly-bug: http://bugs.dragonflybsd.org/issue1969
Root-cause-found-by: Jan Lentfer <Jan.Lentfer@web.de>
5 years agokernel64 - Fix disabled interrupts during dbg/bpt trap
Matthew Dillon [Mon, 31 Jan 2011 21:40:55 +0000 (13:40 -0800)]
kernel64 - Fix disabled interrupts during dbg/bpt trap

* Interrupts were left improperly disabled during a dbg or bpt trap.
  i386 enables interrupts for these traps.  x86-64 needs to as well
  or it will hit an assertion in lwkt_switch() under certain circumstances.

* Make debug code in lwkt_switch() also require INVARIANTS to function.

  NOTE: This is temporary debug code and should be removed at some point
  after 48-core testing is complete.

5 years agokernel - Fix SMP assumption of at least 2 cpus w/TCP
Matthew Dillon [Mon, 31 Jan 2011 21:09:39 +0000 (13:09 -0800)]
kernel - Fix SMP assumption of at least 2 cpus w/TCP

* TCP was assuming at least 2 cpus on SMP builds and would panic if only
  1 cpu was available.

* Fix by testing ncpus even for SMP builds.

* This problem did not affect UP builds.

5 years agokernel - Fix stall after mountroot w/ SMP & ncpus == 1
Matthew Dillon [Mon, 31 Jan 2011 21:08:26 +0000 (13:08 -0800)]
kernel - Fix stall after mountroot w/ SMP & ncpus == 1

* Fix a degenerate case for SMP builds when ncpus == 1.  This affects
  both the vkernel and the normal kernel (when a SMP kernel is booted
  on a non-SMP box which has a LAPIC).

  The init process was not being scheduled properly.

5 years agovkernel - Fix lwbuf build error for vkernel64
Matthew Dillon [Mon, 31 Jan 2011 07:56:13 +0000 (23:56 -0800)]
vkernel - Fix lwbuf build error for vkernel64

* Fix a compile error that was preventing vkernel64's from building.

5 years agoTurn off all warnings when compiling gdtoa sources.
Peter Avalos [Mon, 31 Jan 2011 07:37:38 +0000 (21:37 -1000)]
Turn off all warnings when compiling gdtoa sources.

Obtained-from: FreeBSD

5 years agoMerge branch 'vendor/GDTOA'
Peter Avalos [Mon, 31 Jan 2011 06:28:20 +0000 (20:28 -1000)]
Merge branch 'vendor/GDTOA'

5 years agoRemove unneeded files from contrib/gdtoa.
Peter Avalos [Mon, 31 Jan 2011 06:24:13 +0000 (20:24 -1000)]
Remove unneeded files from contrib/gdtoa.

Add a README.DELETED while I'm here.

5 years agokernel options: Add MCLSHIFT
Sepherosa Ziehau [Mon, 31 Jan 2011 05:55:00 +0000 (13:55 +0800)]
kernel options: Add MCLSHIFT

So on systems w/o wireless NICs, we could configure MCLBYTES to 2K
(options MCLSHIFT=11) instead of the default 4K.
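
The relation between the two values is a power of two: a shift of 11
gives 2048 bytes and a shift of 12 gives 4096 bytes.  The helper below
only illustrates that arithmetic; mclbytes_for_shift() is not a kernel
function.

```c
#include <assert.h>

/* Cluster size in bytes for a given shift: 1 << shift. */
unsigned int
mclbytes_for_shift(unsigned int mclshift)
{
	assert(mclshift < 32);
	return (1U << mclshift);
}
```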

5 years agonetisr barrier: Prevent netisr_barrier_dispatch() from false wakeup
Sepherosa Ziehau [Mon, 31 Jan 2011 05:43:24 +0000 (13:43 +0800)]
netisr barrier: Prevent netisr_barrier_dispatch() from false wakeup

- Change wait states into wait flags, only test NOTDONE flag when
  being woken up.
- Simplify wakeup logic.

With-help-from: dillon@

5 years agoImport gdtoa-20101105.
Peter Avalos [Mon, 31 Jan 2011 05:00:55 +0000 (19:00 -1000)]
Import gdtoa-20101105.

5 years agonetisr barrier: Avoid lockless wakeup/tsleep race
Sepherosa Ziehau [Mon, 31 Jan 2011 03:09:35 +0000 (11:09 +0800)]
netisr barrier: Avoid lockless wakeup/tsleep race

Add a waiting state (NETISR_BR_WAITDONE), before it is set wakeup()
will not be called.  And use atomic_cmpset_int() to do the state

With-help-from: dillon@
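
A userland sketch of a compare-and-set state handshake of this kind is
given below.  It uses C11 atomics in place of the kernel's
atomic_cmpset_int(), the state names are invented (only the idea of a
separate "waiting" state comes from the commit message), and it is not
the netisr code.  The point is that the finishing side issues a wakeup
only when it observes that the waiter has already announced itself.

```c
#include <assert.h>
#include <stdatomic.h>

enum { BR_NOTDONE, BR_WAITDONE, BR_DONE };	/* hypothetical state names */

/* Waiter side: returns 1 if the caller must sleep, 0 if already done. */
int
waiter_should_sleep(atomic_int *state)
{
	int expected = BR_NOTDONE;

	/* Announce the wait only if the work has not finished yet. */
	if (atomic_compare_exchange_strong(state, &expected, BR_WAITDONE))
		return (1);
	/* The exchange failed; expected now holds the current state. */
	return (expected != BR_DONE);
}

/* Finisher side: returns 1 if a wakeup must be issued, 0 otherwise. */
int
finisher_needs_wakeup(atomic_int *state)
{
	int expected = BR_NOTDONE;

	if (atomic_compare_exchange_strong(state, &expected, BR_DONE))
		return (0);		/* nobody was waiting yet */
	assert(expected == BR_WAITDONE);
	atomic_store(state, BR_DONE);
	return (1);			/* a waiter announced itself first */
}
```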

5 years agonetisr: Make sure that netisr barrier's done is globally visible
Sepherosa Ziehau [Fri, 28 Jan 2011 06:43:05 +0000 (14:43 +0800)]
netisr: Make sure that netisr barrier's done is globally visible

5 years agonetisr: Make netisr barrier's done and cpumask volatile
Sepherosa Ziehau [Tue, 25 Jan 2011 05:35:11 +0000 (13:35 +0800)]
netisr: Make netisr barrier's done and cpumask volatile

So that the assignment to these two variables and the following wakeup()
will not get reordered.

5 years agotcp6: Set TF_SYNCACHE properly in tcp6_usr_listen()
Sepherosa Ziehau [Mon, 24 Jan 2011 08:48:04 +0000 (16:48 +0800)]
tcp6: Set TF_SYNCACHE properly in tcp6_usr_listen()

5 years agotcp: Make listen(2) socket close(2) MPSAFE
Sepherosa Ziehau [Mon, 24 Jan 2011 07:52:45 +0000 (15:52 +0800)]
tcp: Make listen(2) socket close(2) MPSAFE

- Cleanup the syncache entries on each CPU for this inp.
- Before whacking the inp, we unhook the inp wildcard hash and cleanup
  its syncache entries by using a synchronized message.
- Fix up comment.

Reported-by: pavalos@
DragonFly-bug: http://bugs.dragonflybsd.org/issue1960

5 years agotcp_usr_listen: Use domsg when duplicate listen socket's inp wildcard hash
Sepherosa Ziehau [Mon, 24 Jan 2011 07:39:54 +0000 (15:39 +0800)]
tcp_usr_listen: Use domsg when duplicate listen socket's inp wildcard hash

This makes sure that each protocol thread sees the socket when
tcp_usr_listen() returns.

5 years agotcp: Don't abuse TF_SYNCACHE to ill-optimize syncache_destroy()
Sepherosa Ziehau [Fri, 21 Jan 2011 08:14:03 +0000 (16:14 +0800)]
tcp: Don't abuse TF_SYNCACHE to ill-optimize syncache_destroy()

- We now turn on TF_SYNCACHE when listen(2) is called.
- When a listen(2) socket is about to be close(2)'d, the syncache list on the
  current CPU is thoroughly iterated and all related syncache entries are
  marked to be dropped instead of only the first one.
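
The difference between stopping at the first match and walking the
whole list can be sketched with a plain linked list.  struct sc_entry
and mark_all_for_owner() are hypothetical stand-ins and do not reflect
the layout of the real syncache.

```c
#include <assert.h>
#include <stddef.h>

struct sc_entry {			/* stand-in for a syncache entry */
	const void	*owner;		/* listen socket the entry belongs to */
	int		 dropped;	/* set once the entry is marked */
	struct sc_entry	*next;
};

/*
 * Walk the entire list and mark every entry belonging to owner.
 * Returns the number of entries newly marked by this call.
 */
int
mark_all_for_owner(struct sc_entry *head, const void *owner)
{
	struct sc_entry *sc;
	int marked = 0;

	assert(owner != NULL);
	for (sc = head; sc != NULL; sc = sc->next) {
		if (sc->owner == owner && !sc->dropped) {
			sc->dropped = 1;
			marked++;
		}
	}
	return (marked);
}
```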

5 years agoudp6: Protect udbinfo by udbinfo barrier
Sepherosa Ziehau [Fri, 21 Jan 2011 05:55:36 +0000 (13:55 +0800)]
udp6: Protect udbinfo by udbinfo barrier

5 years agoinpcb: Save UDP inpcb into temporary memory during in_pcblist
Sepherosa Ziehau [Fri, 21 Jan 2011 05:30:19 +0000 (13:30 +0800)]
inpcb: Save UDP inpcb into temporary memory during in_pcblist

The temporary memory is used later to do the SYSCTL_OUT without
the udbinfo serializer being held.  Mainly to avoid deadlock
triggered by holding serializer and copyout.
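
The snapshot-then-copy-out pattern can be sketched in userland as
follows.  A pthread mutex stands in for the serializer, and
struct table and snapshot_items() are invented for the example.  The
copy is made while the lock is held; the slow output step then runs on
the copy with the lock released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct table {				/* hypothetical shared table */
	pthread_mutex_t	lock;
	int		items[8];
	size_t		count;
};

/*
 * Copy the table into freshly allocated memory while holding the lock.
 * Returns NULL if allocation fails; the caller frees the result and
 * can write it out without holding the lock.
 */
int *
snapshot_items(struct table *t, size_t *countp)
{
	int *copy;

	assert(t != NULL && countp != NULL);
	copy = malloc(sizeof(t->items));	/* allocate before locking */
	if (copy == NULL)
		return (NULL);
	pthread_mutex_lock(&t->lock);
	assert(t->count <= sizeof(t->items) / sizeof(t->items[0]));
	*countp = t->count;
	memcpy(copy, t->items, t->count * sizeof(t->items[0]));
	pthread_mutex_unlock(&t->lock);
	return (copy);
}
```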

Reminded-by: dillon@
5 years agoudp_getcred: Release serializer when doing SYSCTL_OUT
Sepherosa Ziehau [Fri, 21 Jan 2011 02:44:35 +0000 (10:44 +0800)]
udp_getcred: Release serializer when doing SYSCTL_OUT

Mainly to avoid deadlock during copyout

Reminded-by: dillon@
5 years agoudp: pcb list/hashtable protection stage 2/2
Sepherosa Ziehau [Thu, 23 Dec 2010 08:03:08 +0000 (16:03 +0800)]
udp: pcb list/hashtable protection stage 2/2

- Use serializer to protect pcb list/hashtable iteration not running
  in netisrs.
- Don't use marker pcb, so except for the functions running in netisr0,
  no other functions will alter pcb list.

5 years agoinpcb: Add pcblist sysctl helper function w/o using marker inpcb
Sepherosa Ziehau [Thu, 23 Dec 2010 08:01:41 +0000 (16:01 +0800)]
inpcb: Add pcblist sysctl helper function w/o using marker inpcb

5 years agoudp: pcb list/hashtable protection stage 1/2
Sepherosa Ziehau [Thu, 23 Dec 2010 05:27:51 +0000 (13:27 +0800)]
udp: pcb list/hashtable protection stage 1/2

Use a netisr barrier to make sure that netisrs will not be iterating the pcb
list or hashtable when adding or removing a pcb.

Add assertion that all UDP pru functions run in netisr0.

5 years agonetisr: Add netisr barrier which stalls all netisrs
Sepherosa Ziehau [Thu, 23 Dec 2010 05:12:39 +0000 (13:12 +0800)]
netisr: Add netisr barrier which stalls all netisrs

  Set a netisr barrier, which stalls all netisr.  Currently it must be
  called from netisr0.

  Remove the netisr barrier, which unstalls all netisr.  Currently it
  must be called from netisr0.

These interfaces could be used to work out a lockless pcb lookup or
iteration (on network hotpath e.g. input/output) at the cost of
relatively expensive pcb adding and removing (e.g. connect(2)).

5 years agokernel - Revert last commit for a better upcoming fix
Matthew Dillon [Mon, 31 Jan 2011 01:39:24 +0000 (17:39 -0800)]
kernel - Revert last commit for a better upcoming fix

* Revert this fix, a better one is going to be committed soon.

5 years agokernel - Fix syncache vs close(listen_socket) race
Matthew Dillon [Mon, 31 Jan 2011 00:19:40 +0000 (16:19 -0800)]
kernel - Fix syncache vs close(listen_socket) race

* Attempt to fix a race where a listen socket is closed with an active
  syncache.  The tcpcb is detached prior to the syncache being destroyed
  resulting in a race where a new incoming connection can complete and
  attempt to dive the listen socket's tcpcb.

* Detach the tcpcb after the syncache is destroyed rather than before.

5 years agolibc: Remove some unneeded inclusions of <sys/cdefs.h>.
Sascha Wildner [Sun, 30 Jan 2011 21:51:42 +0000 (22:51 +0100)]
libc: Remove some unneeded inclusions of <sys/cdefs.h>.

5 years agoFix up <utmp.h> and <utmpx.h> for C++ programs.
Sascha Wildner [Sun, 30 Jan 2011 21:51:27 +0000 (22:51 +0100)]
Fix up <utmp.h> and <utmpx.h> for C++ programs.

__BEGIN_DECLS and __END_DECLS are absolutely needed around prototypes
so the functions can be called from C++ code (see the definition of
__BEGIN_DECLS in <sys/cdefs.h>).

While here, put non-standard stuff in __BSD_VISIBLE instead of just
noting it in a comment.

This fixes at least x11/rxvt-unicode.

Tested-by: tuxillo
5 years agolibc - Fix bogus pthread_getspecific() return value due to bug in nmalloc
Matthew Dillon [Sun, 30 Jan 2011 21:44:11 +0000 (13:44 -0800)]
libc - Fix bogus pthread_getspecific() return value due to bug in nmalloc

* nmalloc was calling pthread_setspecific() prior to calling
  pthread_key_create(), causing it to use key 0 which might already
  have been allocated for other purposes.

* Reorder initializations in _nmalloc_thr_init() to solve the problem.

* This also solves certain application crashes (mail/milter-greylist).
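
The ordering requirement can be shown with a small sketch that is not
the nmalloc code; all sketch_* names are invented.  A pthread_key_t
holds no valid key until pthread_key_create() has filled it in, so the
helpers below refuse to touch the key before initialization.

```c
#include <assert.h>
#include <pthread.h>

static pthread_key_t sketch_key;	/* not a valid key until created */
static int sketch_key_ready;

/* Create the key once.  Returns 0 on success, -1 on failure. */
int
sketch_tls_init(void)
{
	if (sketch_key_ready)
		return (0);
	if (pthread_key_create(&sketch_key, NULL) != 0)
		return (-1);
	sketch_key_ready = 1;
	return (0);
}

int
sketch_tls_set(void *value)
{
	assert(sketch_key_ready);	/* catches set-before-create */
	return (pthread_setspecific(sketch_key, value));
}

void *
sketch_tls_get(void)
{
	assert(sketch_key_ready);
	return (pthread_getspecific(sketch_key));
}
```

The ready flag is not thread-safe as written; real code would typically
guard the one-time setup with pthread_once().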

Reported-by: Francois Tigeot <ftigeot@wolfpond.org>
5 years agoicu: Split out icu/icu.c
Sepherosa Ziehau [Sun, 30 Jan 2011 09:44:26 +0000 (17:44 +0800)]
icu: Split out icu/icu.c

5 years agoicu: Put ICU_IMR_OFFSET into machine_base/icu/icu.h
Sepherosa Ziehau [Sun, 30 Jan 2011 08:31:29 +0000 (16:31 +0800)]
icu: Put ICU_IMR_OFFSET into machine_base/icu/icu.h

5 years agoioapic/icu: Rework PIC selection code
Sepherosa Ziehau [Sun, 30 Jan 2011 06:33:52 +0000 (14:33 +0800)]
ioapic/icu: Rework PIC selection code

- In the early stage, before I/O APIC is detected and setup, ICU controls
  interrupts, so all IDT entries should be set to ICU's intr code.
- Switch to I/O APIC only after ICU is completely disconnected, i.e. after
  IMCR is set and LINT0 is masked.

5 years agokernel - Have the crypto subsystem use the new mpipe_alloc_callback() API
Matthew Dillon [Sun, 30 Jan 2011 02:59:32 +0000 (18:59 -0800)]
kernel - Have the crypto subsystem use the new mpipe_alloc_callback() API

* The crypto subsystem can deadlock on blocked mpipe operations while
  holding a CAM lockmgr lock.

* Change the subsystem to use the new mpipe_alloc_callback() mechanism,
  avoiding any deadlocks.

Reported-by: Peter Avalos <peter@theshell.com>