Matthew Dillon [Tue, 22 Feb 2011 16:54:48 +0000 (08:54 -0800)]
kernel - Unconditionally clear BRIDGE_MBUF_TAGGED in two cases
* First, unconditionally clear BRIDGE_MBUF_TAGGED if the target MAC
in the link header points to us, regardless of what we do with the
packet.
* Second, unconditionally clear BRIDGE_MBUF_TAGGED if IPFW2 redirects
the packet destination. Bad things will happen if the original source
MAC is kept in the link header, i.e. the packet becomes routed at
that point.
Sepherosa Ziehau [Tue, 22 Feb 2011 14:24:01 +0000 (22:24 +0800)]
kernel/i386: Add -msoft-float to CFLAGS
Sepherosa Ziehau [Tue, 22 Feb 2011 12:03:40 +0000 (20:03 +0800)]
libstand: Make sure that -march=i386 is specified
This unbreaks the loader compiled by gcc44
Sepherosa Ziehau [Tue, 22 Feb 2011 12:03:11 +0000 (20:03 +0800)]
dloader: Make sure that -march=i386 is specified
Venkatesh Srinivas [Tue, 22 Feb 2011 11:35:54 +0000 (03:35 -0800)]
Merge branch 'master' of /repository/git/dragonfly
Sepherosa Ziehau [Tue, 22 Feb 2011 10:59:14 +0000 (18:59 +0800)]
mptable: Implement stub I/O APIC enumerator
Matthew Dillon [Tue, 22 Feb 2011 00:02:30 +0000 (16:02 -0800)]
kernel - More if_bridge work + misc fixes
* When bridging packets sent from one of our own MACs we always override
the ether_shost in the output packet so it comes from the actual
interface the packet is being sent out on.
* LINK0 will still nominally keep the ether_shost intact when forwarding
across a bridge, except in the above case. That is, any foreign MAC
set as the source coming in on one interface will be retained as the
source when being thrown out on another interface. But any local MAC
will be replaced with the MAC of the outgoing interface.
* When receiving a unicast frame on one interface which is targeted to
another interface, retain the original rcvif for any vlan or arp
processing. Otherwise (for example) if this were an ARP reply the ARP
code would associate the reply with the wrong interface. We would want
the ARP entry to be associated with the first interface, not the second,
because the first interface is the one the reply actually came in on.
* Adjust the ARP code in if_ether.c to use rcvif and not ifp, and don't
log if non-matching interfaces are part of the same bridge (unless
log_arp_wrong_iface is set to 2).
* Augment the ether_reinput_cpu() API to pass additional flags in,
allowing the caller to specify that m->m_pkthdr.rcvif not be
overwritten. Used to support the above features.
* Clear M_HASH in a few more cases in pf.c
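The ARP logging rule in this commit (stay quiet when the mismatched interfaces belong to the same bridge, unless log_arp_wrong_iface is 2) can be modelled roughly as below. This is a Python sketch of the decision only, not the if_ether.c code. The function name, the bridge_of mapping, and the treatment of 0 as "never log" are assumptions made for illustration.

```python
# Hypothetical model of the "wrong interface" logging decision; not kernel code.
def should_log_wrong_iface(log_arp_wrong_iface, rcvif, entry_if, bridge_of):
    """Decide whether an ARP reply seen on rcvif for an entry bound to
    entry_if should be logged.

    bridge_of maps an interface name to the bridge it is a member of;
    interfaces that are not bridged are simply absent from the mapping.
    """
    if rcvif == entry_if:
        return False                      # interfaces match, nothing to report
    if log_arp_wrong_iface == 0:
        return False                      # assumed: 0 disables the message
    b1 = bridge_of.get(rcvif)
    b2 = bridge_of.get(entry_if)
    same_bridge = b1 is not None and b1 == b2
    if same_bridge:
        return log_arp_wrong_iface == 2   # only the verbose setting logs this
    return True
```

With em0 and em1 both members of bridge0, a mismatch between them is logged only when the setting is 2, while a mismatch against an unbridged interface is logged at setting 1 as well.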
Venkatesh Srinivas [Mon, 21 Feb 2011 23:07:16 +0000 (15:07 -0800)]
Add definitions for SIGEV_THREAD.
Sascha Wildner [Mon, 21 Feb 2011 20:47:42 +0000 (21:47 +0100)]
rtld.1: Staticize the variable in the _rtld_functrace example.
Pointed-out-by: corecode
Sascha Wildner [Mon, 21 Feb 2011 20:38:54 +0000 (21:38 +0100)]
rtld.1: Add an example on how to set up _rtld_functrace.
While here, put the function's prototype into the SYNOPSIS and add a
_rtld_functrace(3) MLINK.
Sepherosa Ziehau [Mon, 21 Feb 2011 05:42:28 +0000 (13:42 +0800)]
tcp: Allow listen(2) to be called on the same socket any number of times
DragonFly-bug: http://bugs.dragonflybsd.org/issue1993
Matthew Dillon [Mon, 21 Feb 2011 03:36:00 +0000 (19:36 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Matthew Dillon [Mon, 21 Feb 2011 03:35:12 +0000 (19:35 -0800)]
kernel - Fix extra rel_mplock() in if_tap
* Fix crash/panic when running openvpn w/if_tap due to dangling
rel_mplock().
Sepherosa Ziehau [Mon, 21 Feb 2011 02:52:20 +0000 (10:52 +0800)]
inpcb: Make the usage of wildcard hash and connect hash mutually exclusive
DragonFly-bug: http://bugs.dragonflybsd.org/issue1993
Matthew Dillon [Mon, 21 Feb 2011 02:26:17 +0000 (18:26 -0800)]
HAMMER - Fix long stalls when writing out core files
* Fix a long-stall case (hmrwww) due to a broken pipelining algorithm when
using large write()s to write out large files.
Peter Avalos [Sun, 20 Feb 2011 22:14:03 +0000 (12:14 -1000)]
newsyslog: Sync with FreeBSD.
-Don't consider non-existence of a PID file an error.
-Add -P flag which prevents further action if the pidfile is empty or
doesn't exist.
-Add a -S switch to override the default syslog pid file.
-Add support for creating the archived log filenames using a time-stamp
instead of the traditional simple counter.
-Add xz(1) support.
-Rewrite and simplify logfile compression code.
-Convert newsyslog to using queue(3) macros.
-Add file include processing.
Obtained-from: FreeBSD
Peter Avalos [Sun, 20 Feb 2011 21:33:23 +0000 (11:33 -1000)]
Sync /bin/sh regression tests with FreeBSD.
-Add some tests for omitting whitespace.
-Split off some special behaviour into separate tests.
-Do not use "local" in the test runner.
-Make execution/fork1.0 work even if the basename of ${SH} is not "sh".
-Test that the read builtin passes through all byte values except NUL,
newline and backslash.
-Unset some locale vars in two tests that may cause them to break.
Obtained-from: FreeBSD
Sascha Wildner [Sun, 20 Feb 2011 12:22:08 +0000 (13:22 +0100)]
<sys/elf_generic.h>: Fix typo in a #warning.
Sascha Wildner [Sun, 20 Feb 2011 09:36:45 +0000 (10:36 +0100)]
Remove some kref(9) related files via 'make upgrade'.
Sascha Wildner [Sun, 20 Feb 2011 02:02:42 +0000 (03:02 +0100)]
LINT: Fix wording and remove a duplicate option from the comments.
Matthew Dillon [Sat, 19 Feb 2011 22:04:42 +0000 (14:04 -0800)]
kernel - Clear BRIDGE_MBUF_TAGGED for NAT translations
* Clear the new BRIDGE_MBUF_TAGGED flag when a NAT or other translation
changes the source IP for a packet, otherwise packets traversing a bridged
interface may wind up with a source MAC that has nothing to do with
the translated source IP.
Matthew Dillon [Sat, 19 Feb 2011 20:57:56 +0000 (12:57 -0800)]
kernel - Add a transparent MAC bridging feature to if_bridge
* Defaults to non-transparent (historical) operation, which is safer.
Set link0 to use transparent MAC mode.
* Transparent MAC mode will attempt to retain the MAC source in the
link address header when retransmitting a packet on a different
interface.
Only IP/IPV6 packets will retain the MAC. ARP and other ether types
will get the outgoing interface's MAC address, which is usually
desirable.
* Note that transparent MAC mode is a bit dangerous, which is why it
isn't turned on by default. If a packet with the originating MAC
winds up being sent out the same interface it came in on with the
MAC intact, any switches between the two boxes will suddenly think
the originating machine is somewhere else and will get confused.
The code tries to avoid this situation.
Bridging loops can also cause this sort of behavior even with the spanning
tree protocol. link0 is not recommended if you have loops.
* Coded because I needed this for braindead at&t uverse routers which
do MAC-based security and only allow one IP association for each MAC,
and whose firewalls cannot be completely disabled, and which cannot deal
with IPs on routed networks (they expect everything to be directly connected
on a switched network. sigh).
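As a rough illustration of the rule described in this commit, the choice of source MAC for a retransmitted frame could be modelled as below. This is a Python sketch of the decision only, not the if_bridge code. The function name and parameters are hypothetical, and the standard EtherType values for IPv4, IPv6 and ARP are used to stand in for "IP/IPV6" and "other ether types".

```python
# Hypothetical model of the source-MAC choice; not the if_bridge implementation.
ETHERTYPE_IP = 0x0800     # IPv4
ETHERTYPE_ARP = 0x0806    # ARP
ETHERTYPE_IPV6 = 0x86DD   # IPv6

def output_source_mac(link0, ether_type, orig_src_mac, out_if_mac):
    """Return the source MAC to place in the retransmitted frame.

    link0        -- True when transparent MAC mode is enabled on the bridge
    ether_type   -- EtherType of the frame being forwarded
    orig_src_mac -- source MAC the frame arrived with
    out_if_mac   -- MAC of the interface the frame is sent out on
    """
    if link0 and ether_type in (ETHERTYPE_IP, ETHERTYPE_IPV6):
        return orig_src_mac   # transparent mode keeps the originating MAC
    return out_if_mac         # historical behavior, also used for ARP etc.
```

The sketch leaves out the safeguards the commit mentions, such as avoiding sending a frame back out the interface it arrived on with its MAC intact.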
Matthew Dillon [Sat, 19 Feb 2011 09:02:26 +0000 (01:02 -0800)]
kernel - Fix minor mistake corrupting an allocation in recent MPTable work
* Fix an allocation which was too small (sizeof pointer vs structure),
which fixes an early-boot panic.
Reported-by: Peter Avalos <peter@theshell.com>
Matthew Dillon [Sat, 19 Feb 2011 04:49:20 +0000 (20:49 -0800)]
kernel - Fix fairq, PF table hash was not being initialized
* fairq depends on the PF table entry hash, which was not being
initialized.
* Fixes problems with fairq not queueing fairly.
Matthew Dillon [Sat, 19 Feb 2011 04:44:03 +0000 (20:44 -0800)]
kernel - Allow rn_inithead() to be called early
* Allow rn_inithead() to be called earlier than rn_init(). rn_init() is
called very late and was responsible for creating the all-ones and
all-zeros keys. It also required the proto domains to be initialized(?).
* A PF module preload was calling rn_inithead() before the all-ones and
all-zeros keys could be allocated, resulting in a crash.
Antonio Huete Jimenez [Fri, 18 Feb 2011 09:05:41 +0000 (10:05 +0100)]
vkernel64 - Enable function name resolution in DDB.
Sepherosa Ziehau [Fri, 18 Feb 2011 07:54:17 +0000 (15:54 +0800)]
mptable: Save PCI interrupt pin to I/O APIC pin maps
Peter Avalos [Fri, 18 Feb 2011 01:23:49 +0000 (15:23 -1000)]
ps: Update man page for adding comm as an alias for ucomm.
Peter Avalos [Sun, 13 Feb 2011 22:00:12 +0000 (12:00 -1000)]
ps: Add the comm keyword which is an alias for ucomm.
While I'm here, don't pad the output of ucomm if it's the last column.
Venkatesh Srinivas [Thu, 17 Feb 2011 18:03:42 +0000 (10:03 -0800)]
librt: Initial userland implementation of POSIX AIO functionality.
Issues synchronous IO requests in response to AIO calls; does not
support SIGEV_THREAD or SIGEV_SIGNAL; the former because our
infrastructure doesn't have SIGEV_THREAD already; the latter
because we do not support sigqueue() or sending signals w/ a
sigval payload.
Matthew Dillon [Thu, 17 Feb 2011 08:47:58 +0000 (00:47 -0800)]
kernel - Add batch heuristic to scheduler and refactor some of the code 1/2
* Split the dynamic priority mechanism into two stages:
* Stage 1 is the normal dynamic priority mechanism which reacts very quickly
to cpu hogging vs idle / not hogging.
* Stage 2 is a long-term (30-second) batch operations detector which de-tunes
the estcpu calculation based on how long the process has been acting
batch-like or non-batch-like. estcpu is detuned up to 50% for processes
considered to be fully interactive.
* Newly forked processes are placed two queue slots higher (less desirable)
than their parent in stage 1. If they aren't batch they will quickly
recover.
* Newly forked processes are given a batch heuristic value that is mid-range
for stage 2 and must prove themselves one way or the other.
* 'ps -o batch -axl' can be used to see the batch heuristic. ps will display
it as a value between 0 and 11 for the moment.
The idea here is for something like firefox and the X server to remain
interactive even if they use a lot of cpu, while something like a parallel
buildworld winds up remaining batch-like because the core processes are
cpu-bound.
Matthew Dillon [Thu, 17 Feb 2011 08:41:43 +0000 (00:41 -0800)]
kernel - Fix seg-fault in clock interrupt due to race
* Fix a seg-fault which can occur due to races against a newly created
process whose lwp has not been completely initialized.
Sepherosa Ziehau [Thu, 17 Feb 2011 08:00:17 +0000 (16:00 +0800)]
x86_64: 64-bit index register should be used.
Looks like qemu does not accept a 32-bit index register, while
real boxes and virtualbox accept a 32-bit index register.
However, according to AMD <<24593--Rev. 3.17--June 2010>> Page 25,
64-bit index register should be used to create effective address.
DragonFly-bug: http://bugs.dragonflybsd.org/issue1991
Matthew Dillon [Wed, 16 Feb 2011 22:32:43 +0000 (14:32 -0800)]
ps - Fix sorting mistake in recent commits
* qsort passes a pointer to the element, so it's a KINFO** instead of
a KINFO*. Fix the indirection.
* Fixes weird default output sorting that was all wrong.
Matthew Dillon [Wed, 16 Feb 2011 19:17:37 +0000 (11:17 -0800)]
kernel - Fix MP refcount race in struct sigacts (2)
* Fix buildworld failure by changing u_int to unsigned int in
header file.
Matthew Dillon [Wed, 16 Feb 2011 17:53:54 +0000 (09:53 -0800)]
kernel - Fix MP refcount race in struct sigacts
* Code wasn't MPSAFE.
* Use the refcount API to manage refs for struct sigacts
Matthew Dillon [Wed, 16 Feb 2011 17:44:26 +0000 (09:44 -0800)]
kernel - knote_alloc() must always succeed
* We cannot use M_NOWAIT here, knote_alloc() must always succeed.
Matthew Dillon [Wed, 16 Feb 2011 17:43:47 +0000 (09:43 -0800)]
kernel - Fix MP refcount race in struct pargs (2)
* Fix additional case in kern_exit.c
Sepherosa Ziehau [Wed, 16 Feb 2011 08:22:54 +0000 (16:22 +0800)]
mptable: Test the usage of default MPTABLE config during mptable_probe()
Matthew Dillon [Wed, 16 Feb 2011 07:51:22 +0000 (23:51 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Sepherosa Ziehau [Wed, 16 Feb 2011 03:22:43 +0000 (11:22 +0800)]
mptable: Function renaming
In preparation for the upcoming MPTABLE I/O APIC enumerator.
Sepherosa Ziehau [Wed, 16 Feb 2011 06:21:13 +0000 (14:21 +0800)]
mptable: Prepare to create I/O APIC MPTABLE enumerator
- Run mptable_probe() at BOOT2_PRESMP/ORDER_FIRST.
- Save MPTABLE's physical address in global variable.
- Move preliminary LAPIC checks from MPTABLE probing to MPTABLE LAPIC
enumerator's probing function.
Matthew Dillon [Wed, 16 Feb 2011 05:53:05 +0000 (21:53 -0800)]
kernel - Fix MP refcount race in struct pargs
* Protect p->p_args with p->p_token.
* Protect pargs refcounts with sys/refcount.h.
* This fixes at least one memory corruption bug during heavy fork/exec/exit
testing with fastbulk.
Testing-by: dillon, tuxillo
Venkatesh Srinivas [Wed, 16 Feb 2011 05:04:35 +0000 (21:04 -0800)]
kernel -- vm locking: Add vm object locking to vm_object_referenced.
Venkatesh Srinivas [Wed, 16 Feb 2011 04:48:56 +0000 (20:48 -0800)]
Remove VFS_AIO config option.
Venkatesh Srinivas [Wed, 16 Feb 2011 04:40:55 +0000 (20:40 -0800)]
kernel -- Eliminate AIO.
AIO was not enabled by default and the code had received little attention.
Sepherosa Ziehau [Wed, 16 Feb 2011 02:49:43 +0000 (10:49 +0800)]
irqmap: Consume the syscall entry in irqmap
Venkatesh Srinivas [Wed, 16 Feb 2011 02:33:43 +0000 (18:33 -0800)]
kernel -- vm locking: Lock kernel_object in kmem_alloc3.
Venkatesh Srinivas [Wed, 16 Feb 2011 01:20:10 +0000 (17:20 -0800)]
Merge branch 'master' of /repository/git/dragonfly
Venkatesh Srinivas [Wed, 16 Feb 2011 01:19:37 +0000 (17:19 -0800)]
Update stale comment in lwkt_token_init().
Venkatesh Srinivas [Wed, 16 Feb 2011 01:18:27 +0000 (17:18 -0800)]
kernel -- vm locking: Take per-object token in vm_map and vm_object_coalesce.
Matthew Dillon [Wed, 16 Feb 2011 00:58:51 +0000 (16:58 -0800)]
ps - Fix longstanding bug in initial populating loop
* Fix the populating loop to not try to load KInfo[nentries],
overflowing the array.
* Fixes a seg-fault which can occur when the allocated array is right on
a page boundary.
Venkatesh Srinivas [Tue, 15 Feb 2011 23:59:44 +0000 (15:59 -0800)]
kernel -- vm locking: Add vm_page_(un)lock and vm_object_(un)lock.
Each vm_object and vm_page is associated with a token; for vm_pages,
we use a pool token; for objects, a per-object token. For vm_pages,
the token will interlock access to the pv_chain, at least.
Also remove per-vm_object range locks. They were unused.
Matthew Dillon [Tue, 15 Feb 2011 19:42:40 +0000 (11:42 -0800)]
ps - Remove debugging printfs
* Remove debug printfs that were accidentally left in
Matthew Dillon [Tue, 15 Feb 2011 19:37:38 +0000 (11:37 -0800)]
kernel - Add options SLAB_DEBUG to help debug memory corruption
* Adding options SLAB_DEBUG to your kernel config will reconfigure
kmalloc(), krealloc(), and kstrdup() to record all allocation
sources on a zone-by-zone basis, file and line number.
A full kernel recompile is needed when you add or drop this option
from your kernel config.
* Limited to 32 slots per slab. Since slabs offer a narrow range of
chunk sizes this will normally be sufficient.
* When a memory corruption related panic occurs kgdb can be used
to determine who allocated out of the slab in question.
Sepherosa Ziehau [Tue, 15 Feb 2011 15:02:30 +0000 (23:02 +0800)]
madt: Ignore interrupt override entry if no overriding will happen
While I'm here, fix up the warning message about bogus trigger mode
and polarity.
Matthew Dillon [Tue, 15 Feb 2011 07:23:39 +0000 (23:23 -0800)]
ps - Add a new option, -R, which sub-sorts by parent/child and indents
* This new option will sub-sort by process parent/child associations and
will indent the command to make it very, very obvious.
* This is an ultra useful feature.
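A minimal sketch of what sub-sorting by parent/child and indenting could look like is given below. It is a Python model, not the ps(1) code. The tuple layout, the ordering of siblings by pid, and the one-space indent per level are assumptions made for the example.

```python
# Hypothetical model of a parent/child sub-sort with indentation; not ps(1) code.
from collections import defaultdict

def tree_order(procs):
    """procs is a list of (pid, ppid, command) tuples.

    Returns a list of (pid, indented_command) in which every child follows
    its parent and is indented one extra space per level of depth.
    """
    pids = {pid for pid, _, _ in procs}
    children = defaultdict(list)
    roots = []
    for pid, ppid, cmd in sorted(procs):
        if ppid in pids and ppid != pid:
            children[ppid].append((pid, cmd))
        else:
            roots.append((pid, cmd))      # parent not in the listing

    out = []

    def walk(pid, cmd, depth):
        out.append((pid, " " * depth + cmd))
        for cpid, ccmd in children[pid]:
            walk(cpid, ccmd, depth + 1)

    for pid, cmd in roots:
        walk(pid, cmd, 0)
    return out
```

Processes whose parent is not part of the listing are treated as top-level entries, so a filtered process list still produces output.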
Matthew Dillon [Tue, 15 Feb 2011 04:29:25 +0000 (20:29 -0800)]
kernel - MPSAFE work, fix race in init zombie cleanup
* Acquire the required initproc->p_token when reparenting an exiting
process's children to init, fixing a race against init's wait*().
Matthew Dillon [Mon, 14 Feb 2011 21:20:57 +0000 (13:20 -0800)]
kernel - Remove the last MP locks from tmpfs (2).
* The tmpfs interlock is the per-mount token now, not the MP lock
Reported-by: Venkatesh Srinivas <me@endeavour.zapto.org>
Matthew Dillon [Mon, 14 Feb 2011 21:11:30 +0000 (13:11 -0800)]
kernel - Remove the last MP locks from tmpfs.
* Remove get_mplock/rel_mplock calls in tmpfs which are no longer needed.
This will improve the write path a bit though we still utilize the
per-mount token in most places.
Reported-by: Venkatesh Srinivas <me@endeavour.zapto.org>
Matthew Dillon [Mon, 14 Feb 2011 19:15:46 +0000 (11:15 -0800)]
kernel - Remove safety mplocks around VFS system calls
* Remove the safety get_mplock()/rel_mplock() calls around numerous
VFS system calls. The MP lock or per-mount token is handled deeper
in the filesystem code.
* open() has been running without the safety mplocks for a while to
test nlookup(). nlookup() should be MPSAFE. The safety mplocks
being removed were primarily there to protect it.
Matthew Dillon [Mon, 14 Feb 2011 18:52:30 +0000 (10:52 -0800)]
kernel - Remove incorrect assertion
* Remove the assertion that p2->p_lock == 0 in fork(). Since p2 has already
been added to allproc PHOLD/PRELEs can cause this field to become non-zero.
This assertion was originally added when p_lock was moved out of the
copy zone. It is no longer applicable or correct.
Venkatesh Srinivas [Mon, 14 Feb 2011 14:27:26 +0000 (06:27 -0800)]
Remove kref.9 manpage.
Sepherosa Ziehau [Mon, 14 Feb 2011 09:25:33 +0000 (17:25 +0800)]
ioapic: Pass ioapic address to ioapic_{read,write}()
This makes it easier for us to get rid of the ioapic addresses array.
Sepherosa Ziehau [Mon, 14 Feb 2011 06:47:39 +0000 (14:47 +0800)]
Introduce ioapic enumerators, which are used to probe and config ioapics
ioapic enumerator implementation should provide two methods:
ioapic_probe()
Make sure that if this enumerator is selected, later ioapic
enumeration could work. Return error code upon failure.
ioapic_enumerate()
Enumerate ioapic and prepare for later ioapic configuration in
the common code (the configuration in the common code is not
implemented yet).
ioapic enumerator implementation could be registered by calling
ioapic_enumerator_register() with ioapic_enumerator struct. The
higher the priority field, the earlier the ioapic enumerator's
ioapic_probe method will be invoked.
Currently a do-nothing-other-than-logging ioapic enumerator is
implemented and registered. This dummy ioapic enumerator uses ACPI
MADT.
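The registration and probe order described in this commit can be sketched as follows. This is a Python model, not the kernel code. The class names are hypothetical, the convention that a probe returns 0 on success is an assumption, and so is the policy of selecting the first enumerator whose probe succeeds.

```python
# Hypothetical model of priority-ordered enumerator registration; not kernel code.
class IoapicEnumerator:
    """Stand-in for the ioapic_enumerator struct."""
    def __init__(self, name, priority, probe, enumerate_fn):
        self.name = name
        self.priority = priority
        self.ioapic_probe = probe              # assumed: 0 on success, error code otherwise
        self.ioapic_enumerate = enumerate_fn

class EnumeratorRegistry:
    def __init__(self):
        self._list = []

    def register(self, enumerator):
        """Counterpart of ioapic_enumerator_register(): keep highest priority first."""
        self._list.append(enumerator)
        self._list.sort(key=lambda e: e.priority, reverse=True)

    def select(self):
        """Probe in priority order and return the first enumerator that succeeds."""
        for e in self._list:
            if e.ioapic_probe() == 0:
                return e
        return None
```

Registration order does not matter in this model; only the priority field decides which probe runs first.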
Matthew Dillon [Mon, 14 Feb 2011 04:57:32 +0000 (20:57 -0800)]
kernel - Make numerous proc accesses use p->p_token instead of proc_token.
* pfind() and zpfind() now return a referenced proc structure; callers must
release the proc with PRELE(). Callers no longer need to hold proc_token
for stable access.
* Enhance pgrp, adding pgrp->pg_token and pgrp->pg_refs in addition to
pgrp->pg_lock. The lock is used to interlock races between fork() and
signals while the token and refs are used to control access.
* Add pfindn(), a version of pfind() which does not ref the returned proc.
Some code still uses it (linux emulation) ---> needs work.
* Add pgref() and pgrel() to mess with the pgrp's pg_refs. pgrel()
automatically destroys the pgrp when the last reference goes away.
* Most process group operations now use the per-process token instead of
proc_token, though pgfind() still needs it temporarily.
* pgfind() now returns a referenced pgrp or NULL.
* Interlock signal handling with p->p_token instead of proc_token.
* Adjust most nice/priority functions to use the per-process token.
* Add protective PHOLD()s in various places in the signal code, the
ptrace code, and procfs.
* Change funsetown() to take the address of the sigio pointer to match
fsetown(), add sanity assertions.
* pgrp's in tty sessions are now ref-counted.
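The reference-counting pattern described for pgref() and pgrel() (the object is destroyed when the last reference goes away) can be illustrated with the small Python model below. It is single-threaded and ignores the tokens and atomic operations the kernel needs. The Pgrp class, the initial count of 1, and the destroyed flag are assumptions made for the example.

```python
# Hypothetical single-threaded model of pgref()/pgrel(); not kernel code.
class Pgrp:
    def __init__(self, pgid):
        self.pgid = pgid
        self.pg_refs = 1        # assumed: the creator holds the first reference
        self.destroyed = False

def pgref(pg):
    """Take an additional reference on a live process group."""
    if pg.destroyed:
        raise RuntimeError("pgref on destroyed pgrp")
    pg.pg_refs += 1

def pgrel(pg):
    """Drop a reference; destroy the group when the last one goes away."""
    if pg.pg_refs <= 0:
        raise RuntimeError("pgrel without a matching reference")
    pg.pg_refs -= 1
    if pg.pg_refs == 0:
        pg.destroyed = True     # stands in for freeing the structure
```

A lookup that returns a referenced object, as pgfind() now does, would call pgref() before returning, and the caller would pair it with pgrel() when done.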
Matthew Dillon [Sun, 13 Feb 2011 21:39:29 +0000 (13:39 -0800)]
kernel - Replace sys/ref.h & kern/kern_ref.c with sys/refcount.h
* The sys/refcount.h API is a bit nicer so use it instead of the sys/ref.h
API. No need for duplication so remove the sys/ref.h API.
Peter Avalos [Sun, 13 Feb 2011 10:46:49 +0000 (00:46 -1000)]
Add printf(1) regression tests.
Obtained-from: FreeBSD
Peter Avalos [Sun, 13 Feb 2011 10:12:53 +0000 (00:12 -1000)]
printf(1): Sync with FreeBSD
-Fix more issues allowing /bin/sh to pass more regression tests.
-Do not use sh memory functions in sh builtin.
-We work on ctype's and not only on numbers so set LC_ALL.
-The only caller of mknum() provides a char instead of an int, so make
it match the definition.
-Remove support for building as a csh builtin.
-POSIX compliance.
-Prefer intmax_t over long long.
-Handle null characters in the format string.
-No reason to write \a and \v as octal escape sequences.
-Move parts of the long main() function into a new function doformat().
-Rewrite the loop in main() to be more understandable.
-Replace buggy for-loops to skip certain character with strspn().
-Support the L modifier for floating-point values as an extension.
-Allow %' to be used as a format flag by printf(1).
-Enable support for the %a, %A, and %F format specifiers.
-Let printf(1) tell the difference between zero width/precision and
unspecified width/precision.
-Allow format strings containing "%%" to be reused.
-Allow `%' to be written out with an octal escape (\45 or \045).
-Man page markup.
Obtained-from: FreeBSD
Peter Avalos [Sun, 13 Feb 2011 07:48:31 +0000 (21:48 -1000)]
kill(1): Sync with FreeBSD
-Make sys_signame upper case.
-Stop processing if a syntactically invalid pid is encountered.
-Do not restrict the allowed signals that can be specified by number
-Cleanup man page markup.
Obtained-from: FreeBSD
Peter Avalos [Sun, 13 Feb 2011 07:32:57 +0000 (21:32 -1000)]
Add regression tests for /bin/test.
Obtained-from: FreeBSD
Peter Avalos [Sun, 13 Feb 2011 06:52:35 +0000 (20:52 -1000)]
bin/test: Sync with FreeBSD
-Help /bin/sh pass a few more regression tests.
-Convert to use st_mtim instead of st_mtimespec.
-Fix various cases with 3 or 4 parameters in test(1) to be POSIX
compliant.
-Use intmax_t as a quad replacement instead of long long.
-__printflike() should really be __printf0like().
-Localize (LC_CTYPE).
-Cleanup whitespace.
-Simplify markup in man page.
-Describe how test(1) will evaluate its expressions for a symlink.
-Document that both sides of -a or -o are always evaluated.
Obtained-from: FreeBSD
Peter Avalos [Sun, 13 Feb 2011 01:47:27 +0000 (15:47 -1000)]
sh: Detect dividing the smallest integer by -1.
Obtained-from: FreeBSD
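In C, dividing the most negative signed value by -1 overflows, because the mathematical result does not fit in the type. A check along the following lines catches the case before dividing. This is a Python model, not the sh(1) arithmetic code. The 64-bit range for intmax_t, the ArithError class, and the error messages are assumptions made for illustration.

```python
# Hypothetical model of a checked shell-arithmetic division; not sh(1) code.
INTMAX_MIN = -(2 ** 63)      # assumed 64-bit intmax_t
INTMAX_MAX = 2 ** 63 - 1

class ArithError(Exception):
    pass

def checked_div(a, b):
    """Divide a by b the way C would, refusing the two undefined cases."""
    if b == 0:
        raise ArithError("division by zero")
    if a == INTMAX_MIN and b == -1:
        raise ArithError("divide error")   # result would exceed INTMAX_MAX
    q = abs(a) // abs(b)                   # C truncates toward zero
    return q if (a < 0) == (b < 0) else -q
```

Every other combination of in-range operands yields an in-range quotient, so these two checks are enough for division.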
Matthew Dillon [Sat, 12 Feb 2011 22:20:09 +0000 (14:20 -0800)]
kernel - Make most of the fork and exit paths MPSAFE
* Remove the MP lock from numerous system calls (mainly socket calls) that
no longer need it.
* Use proc_token in a couple of places that still need work (instead of
the MP lock). For example, the process group (pgrp) and several places
which call pfind() still need to use the proc_token.
* Use the per-process p->p_token in fork1(), exit1(), and lwp_exit().
The critical portions of these paths now have significant concurrency.
* Use the per-process p->p_token when traversing p->p_children, primarily
aiding the kern_wait() code. So the wait*() system calls should now
have significant concurrency.
* Change the fgetown() API to avoid certain races.
* Add M_ZERO to the struct filedesc_to_leader allocation for safety
purposes.
Matthew Dillon [Sat, 12 Feb 2011 22:17:01 +0000 (14:17 -0800)]
kernel - Fix list corruption in dsched
* dsched_thread_io_alloc() also needs to hold a context lock when
doing an internal list operation.
Matthew Dillon [Sat, 12 Feb 2011 21:30:16 +0000 (13:30 -0800)]
kernel - Fix list corruption in dsched
* dsched_thread_ctx_alloc() was not acquiring the global lock across
calls to dsched_thread_io_alloc(), creating a race condition in the
call to TAILQ_INSERT_TAIL(&tdio->diskctx->tdio_list, ...).
This case can occur under heavy fork/exit loads, particularly when
(soon) fork and exit get out from the MP lock, but even with the MP
lock on the frontend it can occur against the dsched backend.
* dsched_new_policy_thread_tdio() was not acquiring the global lock
across its initial call to dsched_thread_io_alloc(). This case could
only occur with extreme rarity.
Matthew Dillon [Sat, 12 Feb 2011 20:59:19 +0000 (12:59 -0800)]
kernel - Fix wild pointer in DDB trace
* Pre-initialize result fields to NULL/0 because we do not check the return
value from linker_ddb_symbol_values(). This way if the lookup fails we
print a nice "(null)" out instead of faulting the debugger.
Matthew Dillon [Sat, 12 Feb 2011 16:46:18 +0000 (08:46 -0800)]
kernel - Add per-process token, adjust signal code to use it (2).
* In order to avoid a hard-code-section assertion from hardclock()'s
itimer code the per-process token must be held on the call to ksignal().
Sepherosa Ziehau [Sat, 12 Feb 2011 09:09:14 +0000 (17:09 +0800)]
x86_64: pmap_init() is called early enough for pmap_mapdev() to work
Sepherosa Ziehau [Sat, 12 Feb 2011 08:40:43 +0000 (16:40 +0800)]
madt: Structure renaming
Sepherosa Ziehau [Sat, 12 Feb 2011 07:27:52 +0000 (15:27 +0800)]
madt: Support MADT rev 3 (used by ACPI 4.0)
Sepherosa Ziehau [Sat, 12 Feb 2011 07:38:02 +0000 (15:38 +0800)]
madt: Function renaming
Prepare for the upcoming I/O APIC stuff
Peter Avalos [Sat, 12 Feb 2011 07:27:25 +0000 (21:27 -1000)]
Add regression tests for /bin/sh.
Obtained-from: FreeBSD
Sepherosa Ziehau [Sat, 12 Feb 2011 07:12:11 +0000 (15:12 +0800)]
acpi: Remove unused files
Peter Avalos [Mon, 7 Feb 2011 08:05:28 +0000 (22:05 -1000)]
sh: Sync with FreeBSD
This is a combination of 198 commits that improve performance and
standards compliance and fix bugs:
-Fix exit status of case statement.
-sh.1: Update markup.
-Fix PWD values.
-Fix null pointer dereferences.
-Fix bugs where arithmetic expansion $((...)) was truncated
-Pass the correct flags to expandarg() for NFROMFD and NTOFD.
-Add __dead2
-Fix $? at the first command of a function.
-Report error messages to stderr instead of stdout.
-Make alias builtin POSIX compliant.
-Improve the IFS handling of the read built-in.
-Don't let empty lines overwrite the result of the last command with 0.
-Fix the eval command in combination with set -e.
-Make read's timeout (-t) apply to the entire line.
-align local ckmalloc() with malloc(3) by using a size_t
-style(9) to enhance readability
-explicit 'unsigned int' instead of just 'unsigned'
-Mention the range for the exit status for the exit builtin
-Don't skip forking for an external command if any traps are active.
-Avoid leaving unnecessary waiting shells.
-Properly flush input after an error in backquotes
-Fix some issues with quoted output
-Fix race condition in noclobber option
-Improve handling of setjmp/longjmp volatile
-Do not fork for EV_EXIT.
-Quote -x tracing output so it is unambiguous.
-Designate special builtins as such in command -V and type.
-Fix crash when undefining or redefining an executing function.
-Fix memory leak when using a variable in arithmetic like $((x)).
-Use sigaction instead of signal/siginterrupt combination.
-Allow a newline before "in" in a for command, as required by POSIX.
-Some changes to stderr flushing.
-Handle current work directories of arbitrary length.
-trap: do not consider a bad signal name a fatal error.
-Ensure the same command input file is on top after execing builtin.
-Fix various things about SIGINT handling.
-Fix some cases where file descriptors from redirections leak to programs.
-Remove setting variables from dotcmd/exportcmd.
-Fix a memory leak when calling . with variable assignments.
-Constify various strings.
-Do not consider a tilde-prefix with expansions in it.
-Do not run callers' exception handlers in subshells.
-Remove declaration of function that no longer exists.
-WARNS fixes to reduce diffs to FreeBSD
-arith: Return only 0 and 1 from && and ||.
-Fix memory leak when parsing backticks (``).
-Ensure funcnest is decremented if there's an error in the function.
-Allow command -pv and command -pV
-Fix some bugs with backquoted builtins.
-Send the "not found" message for builtin <cmd> to redirected fd 2.
-Do not stat() $MAIL/$MAILPATH in non-interactive shells.
-Fix expansion of \W in prompt strings when the working directory is "/".
-Improve the command builtin:
-Make sure to popredir() even if a special builtin caused an error.
-Make sure to popredir() even if a function caused an error.
-Make parsebackq a function instead of an emulated nested function.
-Do not abort on a redirection error if there is no command word.
-Do not abort on a redirection error on a compound command.
-Treat unexpected newlines in substitutions as a syntax error.
-Fix various things about expansions.
-Remove special handling for ' and " in arithmetic.
-Allow quoting pattern match characters in ${v%pat} and ${v#pat}.
-Do tilde expansion in substitutions.
-Automatically enable -o emacs in interactive shells with terminals.
-On startup of the shell, use PWD from the environment if it is valid.
-Use stalloc for arith variable names.
-Apply locale vars on builtins, recognize LC_MESSAGES as a locale var.
-Have only one copy of _PATH_STDPATH in the binary.
-Fix "reserved word" vs "keyword" inconsistency.
-Fix pathname expansion with quoted slashes like *\/.
-Reap any zombies before forking for a background command.
-Rework documentation of shell variables.
-Recognize "--" in . and exec.
-Change interaction of command substitution and here documents.
-Fix a crash if a heredoc was not properly ended and parsing continued.
-Pass TERM changes to libedit.
-Pass through SIGINT from a child if interactive and job control is enabled.
-Forget about terminated background processes sooner.
-Use $PWD instead of getcwd() for the \w and \W prompt expansions.
-Allow a background command consisting solely of redirections.
-Fix crash due to uninitialized here-document.
-Return 0 from eval if no command was given.
-Remove unnecessary duplicate letters in mksyntax.c
-Fix heap-based buffer overflow in pathname generation.
-Fix shadowing of sigset.
-Fix break/continue/return sometimes not skipping the rest of dot script.
-Add a brief summary of arithmetic expressions.
-Remove remnants of '!!' to negate pattern.
-Do not use locale for determining if something is a name.
-Get rid of some magic numbers.
-Improve comments in expand.c.
-Fix 'read' if all chars before the first IFS char are backslash-escaped.
-Remove xrefs for expr(1) and getopt(1).
-Apply variable assignments left-to-right in bltinlookup().
-Fix exit status if return is used within a loop condition.
-Suggest that DEBUG_FLAGS be used to enable extra debugging
-Make DEBUG traces 64-bit clean.
-Remove the "STATIC" macro
-Fix a bug in STACKSTRNUL()
-Clarify subshells/processes for pipelines.
-There cannot be a TNOT in simplecmd(), remove checks.
-Change ! within a pipeline to start a new pipeline instead.
-Check whether dup2 was successful for >&FD and <&FD.
-Make sure defined functions can actually be called.
-Do not allow overriding a special builtin with a function.
-Ignore double-quotes in arithmetic rather than treating them as quotes.
-Make double-quotes quote a '}' inside ${v#...} and ${v%...}.
-Only accept a '}' inside ${v+-=?...} if double-quote state matches.
-Do IFS splitting on word in ${v+word} and ${v-word}.
-Fix some issues with CTL* bytes and ${var#pat}.
-Error out on various specials/keywords in the wrong place in backticks.
-Detect various additional errors in the parser.
-Reject function names ending in one of !%*+-=?@}~
-Tweak some string constants to reduce code size.
-Use iteration instead of recursion to evaluate semicolon lists.
-Reindent evaltree().
-Correct synopsis and make precise how $0 is set.
-Fix some issues with aliases and case, by importing dash checkkwd code.
-Modernize the introduction a bit.
-Remove unused man page for echo builtin.
-Do the additional actions if 'local -' restore changes -i/-m/-E/-V.
-Add binary buffered output for use by the printf builtin.
-Document the printf builtin.
-Code size optimizations to buffered output.
-Remove the check that alpha/name/in_name chars are not CTL* bytes.
-Fix confusing behaviour if chdir succeeded but getcwd failed in cd -P.
-Code size optimizations to "stack string" memory allocation
-jobs -p: Do not ask the kernel for the pgid.
-Improve jobs output of pipelines.
-POSIX says there should not be a space between Done and (exitstatus).
-Improve internal-representation-to-text code to avoid binary output.
-Use vsnprintf() rather than crafting our own in fmtstr().
-Replace some macros and repeated code in expand.c with functions.
-Remove the herefd hack.
-Various simplifications to jobs.c
-Remove duplicate check, turning dead code into live code.
-Fix corruption of command substitutions with special chars after newline
-Remove dead code.
-arith: Disallow decimal constants starting with 0 (containing 8 or 9).
-Make warnings in the printf builtin non-fatal.
-Add a function to print warnings (with command name and newline).
-Add kill builtin.
-Explain why it is a bad idea to use aliases in scripts.
-Allow arbitrary large numbers in CHECKSTRSPACE.
-Make expansion errors in optimized command substitution non-fatal.
-Don't do optimized command substitution if expansions have side effects.
-Properly restore exception handler in fc.
-Avoid side effects from builtins in optimized command substitution.
-Check if dup2 for redirection from/to a file succeeds.
-Check readonly status for assignments on regular builtins.
-Do not call exitshell() from evalcommand() unless evalcommand() forked itself
-Make exit without parameters from EXIT trap POSIX-compliant.
-Remove special %builtin PATH entry.
-Make 'trap -l' look like 'kill -l'.
-Fix some things about -- in trap.
-If exit is used without args from a trap action, exit on the signal.
-Fix signal messages being sent to the wrong file sometimes.
-Send messages about signals to stderr.
-Return only 126 or 127 for execve() failures.
-Make sys_signame upper case.
-Remove special code for shell scripts without magic number.
-Do not try to execute binary files as scripts.
-Forget all cached command locations on any PATH change.
-Fix two things about {(...)} <redir.
-Import arithmetic expression code from dash.
-Install /bin/sh safely.
Obtained-from: FreeBSD
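The entry "Return only 126 or 127 for execve() failures" follows the POSIX convention that 127 means the command was not found and 126 means it was found but could not be executed. A minimal sketch of that mapping follows; the helper name and the exact set of errno values treated as "not found" are assumptions for illustration, not taken from the sh source.

```python
import errno

# Assumption: errno values treated as "command not found" in this sketch.
# The real shell may classify more (or different) errors this way.
NOT_FOUND_ERRNOS = {errno.ENOENT, errno.ENOTDIR}


def exec_failure_status(err):
    """Map an execve() errno to a shell exit status (hypothetical helper).

    127: the command could not be found.
    126: the command was found but could not be executed.
    """
    return 127 if err in NOT_FOUND_ERRNOS else 126


if __name__ == "__main__":
    for e in (errno.ENOENT, errno.EACCES, errno.ENOEXEC):
        print(errno.errorcode[e], exec_failure_status(e))
```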
Sepherosa Ziehau [Sat, 12 Feb 2011 07:05:17 +0000 (15:05 +0800)]
x86_64: Function renaming lapic_init -> lapic_map
Keep the function name as close to i386's as possible
Sepherosa Ziehau [Sat, 12 Feb 2011 06:22:09 +0000 (14:22 +0800)]
madt: Prepare to extract I/O APIC information from MADT
- Run madt_probe() at BOOT2_PRESMP/ORDER_FIRST.
- Save MADT's physical address in global variable.
- Move preliminary LAPIC checks from MADT probing to MADT LAPIC
enumerator's probing function.
Sepherosa Ziehau [Sat, 12 Feb 2011 02:52:10 +0000 (10:52 +0800)]
madt: Function renaming
Matthew Dillon [Fri, 11 Feb 2011 22:47:58 +0000 (14:47 -0800)]
kernel - Add per-process token, adjust signal code to use it.
* Add proc->p_token and use it to interlock signal-related operations.
* Remove the use of proc_token in various signal paths. Note that proc_token
is still used in conjunction with pfind().
* Remove the use of proc_token in CURSIG*()/issignal() sequences, which
also removes its use in the tsleep path and the syscall path. p->p_token
is used instead.
* Move the automatic interlock in the tsleep code to before the CURSIG code,
fixing a rare race where a SIGCHLD could race against a parent process
in sigsuspend(). Also acquire p->p_token here to interlock LWP_SINTR
handling.
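The race described above is a lost wakeup: if the sleeper checks for pending signals before taking the interlock, a signal posted between the check and the sleep is missed. The sketch below shows the general pattern only, using a Python condition variable as a stand-in for a per-process lock. The class and method names are hypothetical, and DragonFly tokens do not behave exactly like a mutex/condition pair.

```python
import threading


class Proc:
    """Toy process whose pending-signal set is guarded by a per-process lock."""

    def __init__(self):
        # Stand-in for a per-process token (an analogy, not the kernel primitive).
        self.token = threading.Condition()
        self.pending = set()

    def post_signal(self, sig):
        # The poster takes the same lock, so it cannot slip in between
        # the sleeper's check and the sleeper's wait.
        with self.token:
            self.pending.add(sig)
            self.token.notify_all()

    def wait_for_signal(self, timeout=1.0):
        # Interlock first, then check for pending signals, then sleep.
        with self.token:
            if not self.token.wait_for(lambda: bool(self.pending), timeout):
                return None  # timed out with nothing pending
            return self.pending.pop()


if __name__ == "__main__":
    parent = Proc()
    threading.Timer(0.05, parent.post_signal, args=("SIGCHLD",)).start()
    print(parent.wait_for_signal(2.0))
```

Because the check and the sleep happen under the same lock the poster uses, the signal is either seen by the check or delivered as a wakeup, never dropped.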
Matthew Dillon [Thu, 10 Feb 2011 22:32:25 +0000 (14:32 -0800)]
HAMMER Utility - Change the minimum UNDO/REDO FIFO from 100M to 500M
* The minimum undo/redo fifo really needs to be larger. Don't play
around, make it 500M. People who want to run HAMMER on small hard
drives or images need to be cognizant of the requirement.
* This partially solves (only partially) a FIFO overflow condition.
In effect, HAMMER allows buffered operations to build up in the kernel
to a degree of complexity that could easily overflow a minimally-sized
on-media UNDO/REDO FIFO. Raising the minimum makes this case less likely.
The remainder of the resolution will require some fixes in the
HAMMER VFS code.
Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
Matthew Dillon [Thu, 10 Feb 2011 21:22:51 +0000 (13:22 -0800)]
Merge branch 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly
Matthew Dillon [Thu, 10 Feb 2011 21:15:51 +0000 (13:15 -0800)]
kernel - Greatly reduce usched_bsd4_decay default
* Reduce the usched_bsd4_decay default to 1. It may be removed entirely
in the future.
* This improves the dynamic priority handling by reducing ad-hoc estcpu
decreases from the 1-second interval clock. The tsleep code handles
this a lot better already and the ad-hoc decreases don't do a good job
handling the case where there are a very large number of runnable
cpu-bound processes (because they don't actually get a lot of cpu but
still eat a large proportion of the scheduled time in aggregate).
Tested with blogbench during stage 1. Prior to this fix the 100+ blogbench
threads were being dropped down to almost realtime priorities even though
they remained in a 100% 'R'un state.
* Also reduce the amount the parent process of a fork() is docked for cpu
due to the fork. The value was high enough that interactive sessions were
being pushed up to batch priorities with only a moderate number of forks
and not decaying quickly enough to stabilize.
The child process is docked the same as before (handling the fork chaining
case).
Tested with blogbench and parallel makes of /usr/src/lib/libc. Blogbench
uniformly rises to batch priority and did not need the higher boost the
old values gave it. The parallel compile's fork chaining gave it a good
shove towards batch priority, while the repeated forks slowly pushed the
higher-level make and /bin/sh processes to more batch-like priorities.
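The reasoning about periodic decay can be illustrated with a toy model. This is not the bsd4 scheduler's actual formula: the function name, the fractional decay parameter, and the constants are all assumptions chosen to show the effect. Each tick, a cpu-bound process gains estcpu in proportion to its share of one cpu, and then a fixed fraction is decayed away.

```python
def steady_estcpu(nprocs, decay_fraction, ticks=200,
                  cpu_per_tick=100.0, limit=1000.0):
    """Toy model: estcpu of one of `nprocs` cpu-bound processes sharing a cpu.

    decay_fraction is a hypothetical knob (fraction of estcpu removed per
    tick); it is NOT the unit of the usched_bsd4_decay sysctl.
    """
    estcpu = 0.0
    gain = cpu_per_tick / nprocs  # each process gets only 1/nprocs of the cpu
    for _ in range(ticks):
        estcpu = min(limit, estcpu + gain)   # charge for cpu actually used
        estcpu -= estcpu * decay_fraction    # periodic ad-hoc decay
    return estcpu


if __name__ == "__main__":
    for decay in (0.5, 0.05):
        crowded = steady_estcpu(100, decay)
        alone = steady_estcpu(1, decay)
        print(f"decay={decay}: crowded={crowded:.2f} alone={alone:.2f}")
```

In this model, heavy decay leaves each of 100 always-runnable processes with an estcpu near zero, so they look almost idle to the priority calculation. A smaller decay lets their estcpu settle at a higher value, which is the direction of the change described above.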
Sascha Wildner [Thu, 10 Feb 2011 11:37:10 +0000 (12:37 +0100)]
hammer.8: Note that 'recover' needs -f.
Sascha Wildner [Wed, 9 Feb 2011 16:25:06 +0000 (17:25 +0100)]
acpi(4): Fix a bug in acpi_cpu_cstate.c (we have to write, and not to read).
Introduced with
10f976749fd9ad2e8642ea80ce533f7416910a65. The commit message
said "Sync ACPI with FreeBSD 7.2", even though FreeBSD 7.2 doesn't seem to
have this code at all, so I'm not sure about what the idea behind that
change was. I'm guessing it is a typo, since newer FreeBSDs call
AcpiWriteBitRegister() here too.
Reported-by: Andrea Magliano <masterblaster@tiscali.it>
Sepherosa Ziehau [Wed, 9 Feb 2011 14:42:33 +0000 (22:42 +0800)]
acpi/madt: Add definition for interrupt source override MADT entry
Peter Avalos [Wed, 9 Feb 2011 05:13:12 +0000 (19:13 -1000)]
Upgrade to OpenSSL-1.0.0d.
This fixes CVE-2011-0014.
Peter Avalos [Wed, 9 Feb 2011 05:02:56 +0000 (19:02 -1000)]
Merge branch 'vendor/OPENSSL'
Peter Avalos [Wed, 9 Feb 2011 04:59:57 +0000 (18:59 -1000)]
Import OpenSSL-1.0.0d.
Sascha Wildner [Tue, 8 Feb 2011 14:04:33 +0000 (15:04 +0100)]
Bump .Dd in pkg_radd.1 manpage and make pkg_search work again.
Sascha Wildner [Tue, 8 Feb 2011 13:59:53 +0000 (14:59 +0100)]
Don't remove /etc/settings.conf via 'make upgrade' and rename it instead.