Sepherosa Ziehau [Thu, 20 Jun 2013 03:34:22 +0000 (11:34 +0800)]
altq: Use tsc_mpsync to detect whether TSC could be used or not
Sepherosa Ziehau [Thu, 20 Jun 2013 03:10:03 +0000 (11:10 +0800)]
clock/tsc: Detect invariant TSC CPU synchronization
The detected result can be used to determine whether the TSC is usable
as a cputimer, and it can also be used by other consumers, e.g.
CoDel AQM packet time stamping.
- Only invariant TSC will be tested
- If there is only one CPU, then invariant TSC is always synchronized
- Only CPUs from Intel are tested (*)
The test is conducted using lwkt_cpusync interfaces:
The BSP reads the TSC, then asks the APs to read theirs. If the TSC read
from any AP is less than the BSP's TSC, the invariant TSC is not
synchronized across CPUs.
Currently the test runs ~100ms.
(*)
AMD family 15h model 00h-0fh may also have synchronized TSC across
CPUs as pointed out by vsrinivas@, however, according to AMD:
<Revision Guide for AMD Family 15h Models 00h-0Fh Processors
Rev. 3.18 October 2012>
759 One Core May Observe a Time Stamp Counter Skew
AMD family 15h model 00h-0fh is _not_ ready yet.
Sascha Wildner [Wed, 19 Jun 2013 17:19:42 +0000 (19:19 +0200)]
bsd-family-tree: Sync with FreeBSD (NetBSD 5.2, 6.0.1, 6.0.2).
Sepherosa Ziehau [Wed, 19 Jun 2013 08:54:41 +0000 (16:54 +0800)]
net: Use tsc_invariant when it is necessary; mainly in time measurement
Sepherosa Ziehau [Wed, 19 Jun 2013 08:37:55 +0000 (16:37 +0800)]
clock/tsc: Detect invariant TSC
According to Intel's description:
"The invariant TSC will run at a constant rate in all ACPI P-, C-.
and T-states. ..."
The difference between invariant TSC and constant TSC is that
invariant TSC is not affected by frequency changes and deep ACPI
C-state.
Constant TSC could be detected based on the CPU model (Intel has
the model list, while there is no information from AMD's document);
constant TSC is not detected yet.
Sepherosa Ziehau [Wed, 19 Jun 2013 04:21:46 +0000 (12:21 +0800)]
clock: Use sysclock_t to save value from sys_cputimer->count()
Sepherosa Ziehau [Wed, 19 Jun 2013 03:14:56 +0000 (11:14 +0800)]
sio: Use sysclock_t to save value from sys_cputimer->count()
Sepherosa Ziehau [Wed, 19 Jun 2013 02:55:24 +0000 (10:55 +0800)]
cputimer: The freq should be sysclock_t
This prepares for 64bit sysclock_t
Sascha Wildner [Tue, 18 Jun 2013 16:18:00 +0000 (18:18 +0200)]
fstat(1): fsp is in fact used.
Antonio Huete Jimenez [Tue, 18 Jun 2013 10:35:03 +0000 (12:35 +0200)]
fstat(1) - Add support for EXT2FS filesystem.
Sepherosa Ziehau [Tue, 18 Jun 2013 05:34:25 +0000 (13:34 +0800)]
altq: Clean up the code for PCC usage detection
Sepherosa Ziehau [Tue, 18 Jun 2013 01:55:00 +0000 (09:55 +0800)]
polling: Fix comment
Antonio Huete Jimenez [Mon, 17 Jun 2013 22:40:43 +0000 (00:40 +0200)]
fstat(1) - Add support for NTFS filesystem.
Antonio Huete Jimenez [Mon, 17 Jun 2013 22:34:46 +0000 (00:34 +0200)]
ntfs - Expose NTFS structures to userland.
Antonio Huete Jimenez [Mon, 17 Jun 2013 18:10:14 +0000 (20:10 +0200)]
fstat(1) - Add support for HAMMER filesystem.
- Also use %u for printing major/minor to avoid overflows.
Antonio Huete Jimenez [Mon, 17 Jun 2013 13:41:14 +0000 (15:41 +0200)]
hammer - Allow userland programs to access hammer.h definitions
Expose hammer.h to userland; programs like fstat(1) might need some
of the structs (struct hammer_inode).
Sepherosa Ziehau [Mon, 17 Jun 2013 09:46:11 +0000 (17:46 +0800)]
mbuf: Add comment about the remaining implicit padding on x86_64
Sepherosa Ziehau [Mon, 17 Jun 2013 09:42:16 +0000 (17:42 +0800)]
mtag: u_intXX_t -> uintXX_t; no functional changes
Sepherosa Ziehau [Mon, 17 Jun 2013 09:28:26 +0000 (17:28 +0800)]
mbuf: Save 16 bytes from pkthdr on x86_64
- 'wlan_seqno' is not necessary, reuse the 'ether_vlantag'
- Not all parts of 'pkthdr_br' are useful; saving the ethernet source
address should be enough.
- Move 'len' after 'header', on x86_64:
o Remove the implicit 4 bytes padding
o 'len' is still in the same cacheline as before this commit
(m_hdr is 160 bytes as of this commit)
o Make the size of the fields after 'header' but before the 'pf'
8 bytes aligned, so there will be no implicit padding before 'pf'
John Marino [Sun, 16 Jun 2013 23:21:53 +0000 (01:21 +0200)]
libc: Add symbol versions (not active)
Sepherosa Ziehau [Mon, 17 Jun 2013 02:36:51 +0000 (10:36 +0800)]
mbuf: White space cleanup and use uintXX_t instead of u_intXX_t
No functional changes.
Sepherosa Ziehau [Sun, 16 Jun 2013 13:55:44 +0000 (21:55 +0800)]
altq: Update comment
Sepherosa Ziehau [Sun, 16 Jun 2013 13:40:11 +0000 (21:40 +0800)]
bnx: Update man page
Sascha Wildner [Sun, 16 Jun 2013 12:58:20 +0000 (14:58 +0200)]
Fix some typos in manual pages.
Sascha Wildner [Sat, 15 Jun 2013 07:42:48 +0000 (09:42 +0200)]
libc/fmtmsg: Check the correct variable against MM_NULLACT.
Taken-from: FreeBSD (r199046)
Sascha Wildner [Fri, 7 Jun 2013 18:46:15 +0000 (20:46 +0200)]
Remove some unused variables.
John Marino [Fri, 14 Jun 2013 22:54:37 +0000 (00:54 +0200)]
libncurses: Add symbol versions (not active)
John Marino [Fri, 14 Jun 2013 21:13:39 +0000 (23:13 +0200)]
libedit: Add symbol versions (not active)
Sascha Wildner [Sat, 15 Jun 2013 07:12:30 +0000 (09:12 +0200)]
Update the pciconf(8) database.
June 6, 2013 snapshot from http://pciids.sourceforge.net/
Matthew Dillon [Sat, 15 Jun 2013 00:44:25 +0000 (17:44 -0700)]
hammer2 - pfsmount -> clustermount separation part 2
* Further separate the high-level VNOPS/inode (hammer2_pfsmount) layer
from the lower level device (hammer2_mount, hammer2_chain) layer.
* Remove hmp fields from hammer2_trans and hammer2_inode.
* Add hammer2_cluster to the pfsmount as degenerate case for now. This
will be used to list all devices backing the PFS mount, pertaining to
the copies mechanism.
* Run all logical (file) buffer cache operations through the device buffer
cache. Remove previous direct-mapped shortcuts and disable BMAP for now.
Basically the issue here is that with multiple devices backing a HAMMER2
mount, the normal file buffer cache 'cached disk offset' operations used
to shortcut I/O just won't work. We can add the shortcut back in later
for single-backing-device mounts but for now separate them out entirely
and bcopy() between them.
* This will also make it easier for the GSOC H2 file compression project.
* Restore some of the lost performance by using the newly implemented
cluster_readcb() buffer cache function.
Matthew Dillon [Sat, 15 Jun 2013 00:42:11 +0000 (17:42 -0700)]
kernel - Add cluster_readcb()
* This function is similar to breadcb() in that it issues the requested
buffer I/O asynchronously with a callback, but then also clusters
additional asynchronous I/Os (without a callback) to improve performance.
* Used by HAMMER2 to improve performance.
John Marino [Fri, 14 Jun 2013 16:03:09 +0000 (18:03 +0200)]
libmd: Add symbol versions (not active)
John Marino [Fri, 14 Jun 2013 15:49:50 +0000 (17:49 +0200)]
libarchive: Use vendor version numbers instead of DF306
John Marino [Fri, 14 Jun 2013 15:39:04 +0000 (17:39 +0200)]
libz: Use vendor version numbers instead of DF306
This allows for a cleaner set of private symbols.
John Marino [Fri, 14 Jun 2013 15:19:21 +0000 (17:19 +0200)]
liblzma: Add symbol versions (not active)
John Marino [Fri, 14 Jun 2013 14:37:54 +0000 (16:37 +0200)]
libbz2: Add symbol versions (not active)
Matthew Dillon [Fri, 14 Jun 2013 01:32:19 +0000 (18:32 -0700)]
kernel - Document bugs in sendfile that we currently punt on
* sendfile tries to soft-busy the VM pages it backs the mbuf with. This
is meant to prevent the VM page's data from being modified while TCP
is playing with it. However, it doesn't work. There are two issues.
* (1) The page still may be mmap()'d writable. A simple vm_page_protect()
would fix this.
* (2) The page may be associated with a buffer cache buffer and can be
modified via a VOP_WRITE through that buffer regardless of whether
soft-busy or busy is set. This is a real problem.
Even if we find and discard the buffer it can just be reinstantiated
and wind up with the same problem.
From-discussion-with: jeffr, rookie on IRC
Sepherosa Ziehau [Fri, 14 Jun 2013 01:31:13 +0000 (09:31 +0800)]
systat/ip: Unbreak UDP stats
John Marino [Thu, 13 Jun 2013 22:51:50 +0000 (00:51 +0200)]
libarchive: Add symbol versions (not active)
Matthew Dillon [Thu, 13 Jun 2013 20:23:38 +0000 (13:23 -0700)]
kernel - Increase KVM default for 32-bit systems from 1GB to 1.5GB
* Increases KVA_PAGES default from 256 to 384, giving the kernel 1.5GB
of KVM and userland 2.5GB (instead of 1/3).
Numerous people running 32-bit systems are hitting resource limits and
actually running out of KVM. There are many reasons for why this is
happening. It isn't simply a resource-tuning issue because most of the
resource limits we have today are already quite reasonable. It's when
the system combines to fully utilize multiple resources where the problems
begin. Tuning-down the resources impacts performance too much and makes
the systems less usable. PCI resources tend to reserve larger areas,
system structures are fatter, and many other issues crop up.
In addition, 32-bit systems today can be greatly extended by adding swapcache
and swapcache requires significant KVM resources. For example, adding
64GB of swapcache eats ~64MB of ram and heavy tmpfs use often requires an
even higher ratio (64GB swap w/ kern.maxswzone=128m in /boot/loader.conf).
With the price point for SSDs coming down, 256GB and larger SSDs are far
more common these days and we want even 32-bit systems to be able to make
use of them.
On a fresh system boot well over 512MB of KVM out of the (previous) 1GB
space is already accounted for. This leaves precious little for dynamic
expansion of system structures.
This leaves us with one real option... increase KVM and decrease UVM.
By increasing KVM from 1GB to 1.5GB we nearly double the KVM available to
the kernel for dynamic expansion of system structures. User virtual memory
is reduced from 3GB down to 2.5GB. While this may impact some applications
such as Perl, those applications are already tending to run on the edge
anyway and, in fact, modern application development is starting to assume
64-bit address spaces for optimal operation anyway.
I've come to the conclusion that it is better to move the line on UVM down
in order to completely solve the KVM issue for system resources on 32-bit
systems.
Matthew Dillon [Thu, 13 Jun 2013 19:07:36 +0000 (12:07 -0700)]
hammer2 - pfsmount -> clustermount separation part 1
* Start working on turning the hammer2_pfsmount structure into an
abstracted PFS 'cluster' structure.
* Move hammer2_inode's allocation infrastructure to the pmp (except
for the super-root inode which has no pmp). It was previously
hmp-centric.
John Marino [Thu, 13 Jun 2013 16:11:56 +0000 (18:11 +0200)]
libz: Add symbol versions (not active)
Every time symbol versioning is added to a system library, it will
require a full buildworld for the next build. In addition, if the major
version isn't bumped, it will break the binary packages that link to it.
The working plan is to add versioning to libraries, but not hook it into
the build unless "RELEASE36" is defined. When all the libraries have
been versioned, then these conditions will be removed so only one full
buildworld will be required to minimize inconvenience to users.
All versioned libraries will be bumped with the exception of libc as it
has already been bumped for branch 3.5, thus libc.so.7 is already
available. In other words, we bump to avoid package breakage and that's
no longer an issue for libc.
After this commit, libz will build as before unless RELEASE36=yes is
defined in make.conf or passed to make during buildworld.
Sepherosa Ziehau [Sat, 8 Jun 2013 05:47:43 +0000 (13:47 +0800)]
altq: Implement two level "rough" priority queue for plain sub-queue
The "rough" part comes from two sources:
- Hardware queue could be deep, normally 512 or more even for GigE
- Round robin on the transmission queues is used by all of the multiple
transmission queue capable hardware supported by DragonFly as of this
commit.
These two sources affect the packet priority set by DragonFly.
DragonFly's "rough" priority queue has only two levels, i.e. high priority
and normal priority, which should be enough. Each queue has its own
header. The normal priority queue will be dequeued only when there are no
packets in the high priority queue. During enqueue, if the sub-queue is
full and the high priority queue length is less than half of the sub-queue
length (both packet count and byte count), drop-head will be applied to
the normal priority queue.
M_PRIO mbuf flag is added to mark that the mbuf is destined for the high
priority queue. Currently TCP uses it to prioritize SYN, SYN|ACK, and
pure ACK w/o FIN and RST. This behaviour can be turned off via
net.inet.tcp.prio_synack, which is on by default.
The performance improvement!
The test environment:
All three boxes are using Intel i7-2600 w/ HT enabled
                           +-----+
                           |     |
                +->- emx1  |  B  | TCP_MAERTS
+-----+         |          |     |
|     |         |          +-----+
|  A  | bnx0 ---+
|     |         |          +-----+
+-----+         |          |     |
                +-<- emx1  |  C  | TCP_STREAM/TCP_RR
                           |     |
                           +-----+
A's kernel has this commit compiled. bnx0 has all four transmission
queues enabled. For bnx0, the hardware's transmission queue round-robin
is on TSO segment boundary.
Some baseline measurements:
B<--A TCP_MAERTS (raw stats) (128 client): 984 Mbps
(tcp_stream -H A -l 15 -i 128 -r)
C-->A TCP_STREAM (128 client): 942 Mbps (tcp_stream -H A -l 15 -i 128)
C-->A TCP_CC (768 client): 221199 conns/s (tcp_cc -H A -l 15 -i 768)
To effectively measure the TCP_CC, the prefix route's MSL is changed to
10ms: route change 10.1.0.0/24 -msl 10
All stats gathered in the following measurements are below the baseline
measurement (well, they should be).
C-->A TCP_CC improvement, during test B<--A TCP_MAERTS is running:
Setup                   TCP_MAERTS(raw)   TCP_CC
TSO     prio_synack=1   948 Mbps          15988 conns/s
TSO     prio_synack=0   965 Mbps           8867 conns/s
non-TSO prio_synack=1   943 Mbps          18128 conns/s
non-TSO prio_synack=0   959 Mbps          11371 conns/s
* 80% TCP_CC performance improvement w/ TSO and 60% w/o TSO!
C-->A TCP_STREAM improvement, during test B<--A TCP_MAERTS is running:
Setup                   TCP_MAERTS(raw)   TCP_STREAM
TSO     prio_synack=1   969 Mbps          920 Mbps
TSO     prio_synack=0   969 Mbps          865 Mbps
non-TSO prio_synack=1   969 Mbps          920 Mbps
non-TSO prio_synack=0   969 Mbps          879 Mbps
* 6% TCP_STREAM performance improvement w/ TSO and 4% w/o TSO.
John Marino [Thu, 13 Jun 2013 12:29:28 +0000 (14:29 +0200)]
/usr/share/mk: Install bsd.symver.mk and version_gen.awk
Sascha Wildner [Thu, 13 Jun 2013 12:31:46 +0000 (14:31 +0200)]
Fix SEE ALSO sorting order in some manual pages.
Sepherosa Ziehau [Thu, 13 Jun 2013 11:46:22 +0000 (19:46 +0800)]
tools/netrate: Add simple tools to calculate multiple netperf results
netperf itself must be installed through dports.
John Marino [Thu, 13 Jun 2013 07:19:29 +0000 (09:19 +0200)]
rtld: Sync 7/7 - Use symbol versioning instead of exports mapping
Now that DragonFly has the symbol versioning framework in place, rtld
can leverage it by offloading the symbol export duties to it. This
further reduces differences between FreeBSD and DragonFly linkers.
Keeping the exports table up to date after FreeBSD removed it was extra
work.
John Marino [Thu, 13 Jun 2013 06:19:35 +0000 (08:19 +0200)]
rtld: Sync 6/7 - Minimize differences from FreeBSD
DragonFly developed some rtld features before FreeBSD, and consequently
those features were ported back to FreeBSD. Some portions of these
new lines were modified for various reasons, e.g. the variable names
weren't liked or additional constraints were deemed necessary such as
the ability to maintain the old (incorrect) behavior of RUNPATH.
In any case, there were minor differences including whitespace, and
this commit reduces those differences to ease future syncing.
John Marino [Wed, 12 Jun 2013 23:42:14 +0000 (01:42 +0200)]
rtld: Sync 5/7 - Fix fd leak with parallel dlopen and fork
Rtld did not set FD_CLOEXEC on its internal file descriptors; therefore,
such a file descriptor may be passed to a process created by another
thread running in parallel to dlopen() or fdlopen().
No other threads are expected to be running during parsing of the hints
and libmap files but the file descriptors need not be passed to child
processes so add O_CLOEXEC there as well.
As the F_DUPFD_CLOEXEC support was added in the kernel today, rtld
will temporarily fall back to separate dup/cloexec commands if
F_DUPFD_CLOEXEC fails. This fallback should be removed before
3.6 branches.
Taken from:
FreeBSD SVN 242587 (04 NOV 2012)
John Marino [Wed, 12 Jun 2013 12:24:28 +0000 (14:24 +0200)]
rtld: Sync 4/7 - Fix token substitution
The origin_subst_one() function limits the length of the string to
PATH_MAX after the token substitution. This is wrong, because
origin_subst_one() performs the substitution on the whole rpath and
similar strings, which contain several paths separated by colons. As a
result, a long (but correct) rpath consisting of many path elements is
rejected by the function.
Correct the problem by rewriting the origin_subst_one() to perform two
passes, first pass to calculate the number of substitutions to be
performed, and second pass to generate the resulting string. Second
pass allocates the memory for the result based on the count from the
first pass, without enforcing a limit.
Taken verbatim from:
FreeBSD SVN 249525 (15 APR 2013)
FreeBSD SVN 250075 (29 APR 2013)
John Marino [Wed, 12 Jun 2013 12:06:58 +0000 (14:06 +0200)]
rtld: Sync 3/7 - LD_PRELOAD and z_nodeflib fix
Do not reference z_nodeflib for !objgiven case in order to fix LD_PRELOAD
for a non-absolute path.
Taken from:
FreeBSD SVN 240686 (18 SEP 2012)
John Marino [Wed, 12 Jun 2013 11:15:29 +0000 (13:15 +0200)]
rtld: Sync 2/7 - Remove potential map leakage
Eliminate the static buffer used to read the first page of the mapped
object, and eliminate the pread(2) call as well. Mmap the first page
of the object temporarily and unmap it on error or last use. Potentially
this leaves a one page gap between succeeding dlopen(3), but there are
other mmap(2) consumers as well.
This fixes several cases where the whole mapping of the object leaked
upon error. The MAP_PREFAULT_READ code had to be skipped because the
mmap on DragonFly doesn't support this flag.
----
Map libraries linked with -Ttext-segment=base_addr at base_addr.
Normal libraries have a base address of zero and are unaffected by this
change.
Taken from:
FreeBSD SVN 237058 (14 JUN 2012)
FreeBSD SVN 247396 (27 FEB 2013)
John Marino [Wed, 12 Jun 2013 10:49:17 +0000 (12:49 +0200)]
rtld: Sync 1/7 - Handle premature symlook_obj call
It doesn't appear that this code is needed for x86 platforms as this
case should already be caught in early code, but it doesn't hurt
DragonFly to handle every possible case.
Work around a situation where symlook_obj() could be called for the
object for which digest_dynamic1() was not done yet. Just return
EINVAL and do not try to dereference NULL buckets hash array.
This seems to happen on ia64 for rtld object itself where the
R_IA_64_FPTR64LSB relocations require symbol lookup. The dynamic
linker itself does not rely on identity of the C-level function
pointers (i.e. function descriptors).
Taken verbatim from:
FreeBSD SVN 235054 (05 MAY 2012)
Matthew Dillon [Thu, 13 Jun 2013 07:11:04 +0000 (00:11 -0700)]
hammer2 - freemap part 4, misc fixes
* Revamp the freemap a bit. Remove Layer 0. Layer 1 is now the LEAF.
2GB of media storage is now represented by a single 64KB Layer 1 block.
Synchronize the FREEMAP document with the current thinking.
The Layer 1 block contains 1024x64 entries. Each entry represents 2MBytes
of media storage. These entries are no longer blockrefs pointing to
Layer 0 but are instead terminal structures.
* Each entry represents a 16KB allocation granularity in 2 bits and has
128 bit pairs (256 bits total), plus additional information to represent
the 2MBytes of storage.
Fine-grained allocations are supported via an iterator field, currently
allowing fine-grained allocations down to 1KB and potentially expandable
in the future to even smaller allocation sizes.
* Fix a SMP race in voldata handling during flush. The freemap portion of
voldata could be updated during crc calculations due to hmp->fchain not
being held locked, causing random volume header/backups to fail their CRC
test on remount.
* Add missing BUF_KERNPROC() when chain->bp is replaced. Fixes a kernel
lock ownership assertion.
* Add freezone/radix fields to the inode_data structure. Each inode can
accommodate four fields. The fields are not yet utilized. Current thinking
is to use them to optimize the bulk free-scan for freeing blocks.
John Marino [Wed, 12 Jun 2013 19:42:21 +0000 (21:42 +0200)]
kernel: Add three new commands to fcntl
This commit adds the following new commands to fcntl():
F_DUP2FD - non-portable functional equivalent of dup2(fd, arg)
F_DUPFD_CLOEXEC - A version of F_DUPFD that sets the close-on-exec
on the new file descriptor
F_DUP2FD_CLOEXEC - A version of F_DUP2FD that sets the close-on-exec
on the new file descriptor. It is non-portable
It also adds a missing break in a case statement for F_GETOWN in
sys_fcntl(), spotted by dillon.
reviewed-by: dillon
Matthew Dillon [Wed, 12 Jun 2013 19:10:41 +0000 (12:10 -0700)]
kernel - fix statistics counters for if_bridge.
* Count input statistics in bridge_input() instead of bridge_forward()
* Add output statistics in bridge_forward(). That is, the bridge should
in abstract look like a piece of hardware from a statistics standpoint,
so when a packet is forwarded through a bridge interface it needs to
show up as both an input packet and an output packet.
* Fixes statistics reporting for e.g. 'netstat -in 1'.
Matthew Dillon [Wed, 12 Jun 2013 18:02:12 +0000 (11:02 -0700)]
netstat - Do not double-count interfaces associated with bridges
* Do not double count packets & bytes for interfaces which are
also associated with bridges when 'netstat -in' is used without
an interface specification.
John Marino [Wed, 12 Jun 2013 09:12:20 +0000 (11:12 +0200)]
Remove MACHINE_ARCH=amd64 and legacy make instructions from makefiles
Apparently the x86_64 platform used to be referred to as "amd64" so some
makefile code was added to help with the transition. It probably should
have been removed when bringing bmake in.
With this commit, the makefiles expect bmake and will break if an older
make is used (e.g. from DragonFly 3.2). That means such a system will
have to be upgraded to DragonFly 3.4 before it can be upgraded to
DragonFly 3.6. There may be other reasons to do this as well, besides
just bmake.
The recent symbol versioning makefile for libraries requires bmake, so
legacy make can't build world anymore in any case.
John Marino [Wed, 12 Jun 2013 06:28:43 +0000 (08:28 +0200)]
Change initial symbol version from DFLY36.0 to DF306.0
The first one will make handling DragonFly 3.9 and later awkward.
John Marino [Tue, 11 Jun 2013 22:28:04 +0000 (00:28 +0200)]
libm: Add several new functions and symbol versioning
The following long double functions were added to the math library:
logl
log2l
log10l
log1pl
expm1l
acoshl
asinhl
atanhl
In addition, the FreeBSD functionality that creates symbol versioning
for libraries was adapted for DragonFly. The first version is called
"DFLY36.0". If it is necessary to create a new version of the 3.5 or
3.6 branch, the number after the decimal will be incremented. The 3.7
branch will start with "DFLY38.0" if it needs its own version.
libm was baselined with all symbols being the same version: DFLY36.0.
With symbol versioning, it will not be necessary to increment the major
version anymore, so this library shall always be known as libm.so.4 from
this point on.
John Marino [Tue, 11 Jun 2013 11:11:21 +0000 (13:11 +0200)]
rtld: increase TLS storage space (bug 2566)
It appears that the TLS storage space, which is currently defined as 256
bytes on both platforms, is insufficient to handle libc TLS data.
Due to nmalloc setting of the thr_mags structure as Thread Local Storage,
the TLS elf section of x86-64 libc is 1172 bytes, while the i386 libc TLS
section weighs in at 588 bytes. For comparison, the FreeBSD libc TLS
section is 17 bytes, and FreeBSD rtld only reserves 128 bytes for TLS.
The requirements for dmalloc are more modest, so this shortfall was likely
an unintended side-effect of switching from dmalloc back to nmalloc.
This commit sets the reserved TLS space to 1280 bytes for x86-64 and 640
bytes for i386. This should allow 3.4 packages to continue to work on
the latest 3.5 branch as long as libc.so.7 is present. For world upgrades
this is the normal case, but the libraries can be obtained by installing
misc/compat34x from dports as well.
Even if the libc TLS space requirements drop significantly, the large
RTLD_STATIC_TLS_EXTRA value needs to be maintained as long as
compatibility with old libc (3.4 and earlier) libraries is required.
Antonio Huete Jimenez [Fri, 7 Jun 2013 23:22:27 +0000 (01:22 +0200)]
tmpfs - Remove redundant call to insmntque()
vnode is moved to the mount queue specified in the call to getnewvnode(),
we do not need to move it again.
Matthew Dillon [Tue, 11 Jun 2013 00:15:21 +0000 (17:15 -0700)]
AHCI - Fix panic if additional I/O is queued during SATA error processing.
* If additional I/O is queued during SATA error processing the AHCI
driver was improperly trying to initiate the new I/O. This caused
the error processing code to assert on unexpected command activity.
* Fix by implying exclusive access mode when the error CCB is in use and
giving the error CCB queueing priority over all other CCBs.
Sepherosa Ziehau [Sun, 9 Jun 2013 02:45:48 +0000 (10:45 +0800)]
pcb: Allow kmalloc(WAITOK) to return in in_pcballoc()
The PCB kmalloc limit could be exceeded on local netperf TCP_CC with the
default MSL; panic is obviously not wanted under this situation.
Sascha Wildner [Sat, 8 Jun 2013 21:51:08 +0000 (23:51 +0200)]
ppbus.4: Mention DEBUG_1284.
Sascha Wildner [Sat, 8 Jun 2013 16:28:53 +0000 (18:28 +0200)]
Adjust some more files for dports.
Sascha Wildner [Sat, 8 Jun 2013 11:42:15 +0000 (13:42 +0200)]
<sys/thread.h>: Remove two unnecessary forward declarations.
They are in <sys/msgport.h> which is included by <sys/thread.h>.
Sascha Wildner [Sat, 8 Jun 2013 10:13:50 +0000 (12:13 +0200)]
build.7: Some more clarification.
Sascha Wildner [Sat, 8 Jun 2013 10:09:49 +0000 (12:09 +0200)]
Adjust some manual pages for dports.
Sascha Wildner [Fri, 7 Jun 2013 16:50:40 +0000 (18:50 +0200)]
Fix some printf()s.
Sascha Wildner [Sat, 8 Jun 2013 08:31:53 +0000 (10:31 +0200)]
bsd-family-tree: Sync with FreeBSD (adds NetBSD 6.1 and FreeBSD 8.4).
Sepherosa Ziehau [Sat, 8 Jun 2013 03:21:04 +0000 (11:21 +0800)]
socket: Prioritize rcvd message
Sepherosa Ziehau [Fri, 7 Jun 2013 09:18:33 +0000 (17:18 +0800)]
altq: Add byte based limit and counter
- This avoids having too many mbufs sitting on the send queue for TSO
capable devices. Even though DragonFly by default already limits a TSO
burst to at most 4 TCP segments, TSO capable devices could still have
4 times as many mbufs sitting on the send queue as non-TSO capable
devices.
- This paves way for the AQMs, which require send queue byte counter,
e.g. CoDel.
For ethernet devices, the byte based limit is (1514 x max_packets).
For other devices, e.g. pseudo devices, the byte based limit is
(MCLBYTES x max_packets).
Matthew Dillon [Fri, 7 Jun 2013 23:00:02 +0000 (16:00 -0700)]
kernel - Enable ncp shared locks by default.
* Change debug.ncp_shared_lock_disable to 0, enabling ncp shared
locks in master by default.
Matthew Dillon [Fri, 7 Jun 2013 22:13:21 +0000 (15:13 -0700)]
kernel - Fix a case in the path lookup that results in high latencies
* When many cpu cores are looking up paths with matching components,
such as when doing a parallel buildworld or a parallel build of
/usr/src/lib/libc, the namecache's shared/exclusive lock mechanic
can break-down and create a chain-reaction of exclusive locks which
destroys performance.
* When attempting to get a shared lock we were previously backing-down to
an exclusive lock if the shared lock could not be obtained non-blocking.
The original code could cause a chain-reaction of unnecessary exclusive
locks.
Instead, we now back-down to an exclusive lock only if the current
thread already holds an exclusive lock on the same namecache entry
or if we detect that another thread is trying to get an exclusive lock.
Otherwise we fall-through and obtain the shared lock in a blocking manner.
Reported-by: ftigeot
Sepherosa Ziehau [Fri, 7 Jun 2013 08:01:06 +0000 (16:01 +0800)]
wlan: 802.11 devices and vap are not ready for ALTQ packet schedulers
Current power saving queue inject and send queue flush code do not
work with ALTQ packet schedulers.
Sepherosa Ziehau [Fri, 7 Jun 2013 07:37:16 +0000 (15:37 +0800)]
altq: Update comment
Sepherosa Ziehau [Fri, 7 Jun 2013 07:18:59 +0000 (15:18 +0800)]
altq: Make sure mbuf contains pkthdr in enqueue method
Sepherosa Ziehau [Fri, 7 Jun 2013 06:40:59 +0000 (14:40 +0800)]
rtchange: Don't migrate CPU when change routes ifaddr; use rt_threads
Sepherosa Ziehau [Fri, 7 Jun 2013 06:18:16 +0000 (14:18 +0800)]
in_ifadown: Don't migrate CPU when delete addr routes; use rt_threads
Sepherosa Ziehau [Fri, 7 Jun 2013 02:56:42 +0000 (10:56 +0800)]
if_detach: Don't migrate CPU when delete interface routes; use rt_threads
Sascha Wildner [Thu, 6 Jun 2013 18:45:04 +0000 (20:45 +0200)]
<sys/cdefs.h>: Bring in a compatibility macro for C11's _Static_assert.
Apparently gcc47 has support outside of C11, but for processing our
<complex.h> with gcc44, this is necessary.
Taken-from: FreeBSD
Sascha Wildner [Thu, 6 Jun 2013 18:27:46 +0000 (20:27 +0200)]
gcc44: Bring back an accidentally removed file.
Sascha Wildner [Thu, 6 Jun 2013 18:20:10 +0000 (20:20 +0200)]
gcc: Stop installing GCC's tgmath.h to /usr/libdata/gcc{44,47}.
Before this commit, the recently imported <tgmath.h> (in
/usr/include when installed) wasn't taken because of GCC's
own copies in /usr/libdata/gcc{44,47}, which comes first
in the compiler's internal include search path.
Since it is a standard header, we want our version to be
taken, so stop installing GCC's version and remove it via
'make upgrade'.
There are no other standard headers in /usr/libdata/gcc{44,47}.
Sascha Wildner [Thu, 6 Jun 2013 18:11:04 +0000 (20:11 +0200)]
<sys/cdefs.h>: Add some macros needed for our <tgmath.h>.
Taken-from: FreeBSD
Matthew Dillon [Thu, 6 Jun 2013 06:52:02 +0000 (23:52 -0700)]
kernel - Fix several bugs in FAIRQ
* Fix several possible overflows due to high-valued machclk_freq constants
and uint's that should have been uint64's. Among other things this fixes
bandwidth calculations that could previously get into weird states.
* Refactor the fairq_selectq() routine to fix numerous cases where the
head of the queue could get advanced multiple times without pulling a
packet off the queue, causing packets in queues to be excessively
delayed.
Both of these were rather serious issues. Operation is far smoother with
the bugs fixed.
Matthew Dillon [Thu, 6 Jun 2013 04:02:18 +0000 (21:02 -0700)]
kernel - Increase IFQ_MAXLEN from 50 to 250
* IFQ_MAXLEN is used as the default for numerous pseudo-network drivers.
The value 50 is just too low. Increase to 250.
John Marino [Wed, 5 Jun 2013 23:51:47 +0000 (01:51 +0200)]
/usr/Makefile: Add pkg-bootstrap target
New snapshots provide a pre-built "pkg" tool so pre-built dports
binaries can be installed easily. However, systems upgraded from older
releases won't have "pkg" available, and their users had to download
the entire dports repository just to build pkg in order to take advantage
of the available pre-built packages.
To fix this situation, and the situation where /usr/local/sbin/pkg is
lost for any reason, a new make target has been added to /usr/Makefile:
pkg-bootstrap
The pkg-bootstrap target will download a pre-built "pkg-static" program
along with pkg.conf and all the man pages. In reality, pkg-static is
only used for one command, and that is to install a full "pkg" program
from the dragonfly repository.
If "pkg.conf" already exists, a message will instruct the user to move
it first. If /usr/local/sbin/pkg already exists, the target won't work
and it won't even show as an option.
After pkg-static is installed, the user will be instructed to type
"rehash; pkg-static install -y pkg; rehash" which should leave the
system with the latest pkg installed, after which packages can be
installed normally.
Matthew Dillon [Tue, 4 Jun 2013 21:42:52 +0000 (14:42 -0700)]
hammer2 - Adjust newfs_hammer2 for recent media changes
* Adjust newfs_hammer2 for recent media changes. The freemap is now based
in the volume header with a blockref set instead of a single blockref.
Matthew Dillon [Tue, 4 Jun 2013 21:41:50 +0000 (14:41 -0700)]
hammer2 - Add 'hammer2 freemap' directive
* Beef up 'show' and add the 'hammer2 freemap' directive to dump the
freemap.
Matthew Dillon [Tue, 4 Jun 2013 21:29:20 +0000 (14:29 -0700)]
hammer2 - freemap part 3 - group by allocation size
* Each freemap leaf represents ~2MB worth of storage. Assign a radix to
each leaf, limiting allocations from that leaf to that radix.
This primarily results in inodes being grouped together, improving
the performance for find, ls or other topological scans. We could
improve this but for now we'll stick with it as-is.
This mechanic also allows us to use cluster_read(). This function is
used for everything except volume-header and freemap elements.
* More formally handle logical sizes vs allocation sizes vs device I/O
sizes. For example, a 1KB inode allocates 1KB using 16KB device I/O's.
* Beef up the sysctl I/O counters.
Sascha Wildner [Tue, 4 Jun 2013 17:27:16 +0000 (19:27 +0200)]
rpc.statd(8): Fix warnings and bump WARNS to 6.
Sepherosa Ziehau [Tue, 4 Jun 2013 09:36:44 +0000 (17:36 +0800)]
ifsubque: Cut ties with ifqueue
Sepherosa Ziehau [Tue, 4 Jun 2013 09:09:07 +0000 (17:09 +0800)]
ifq: Add ifsq_poll_pktlen, which calculates the polled mbuf's length
Calculating the polled mbuf's length w/o ALTQ lock is not MPSAFE.
Sepherosa Ziehau [Tue, 4 Jun 2013 08:55:19 +0000 (16:55 +0800)]
ifq/classic: Add pkthdr assertion
Sepherosa Ziehau [Tue, 4 Jun 2013 08:24:48 +0000 (16:24 +0800)]
altq: Remove the unused parameter 'mpolled' from dequeue method
Sascha Wildner [Sat, 1 Jun 2013 10:36:34 +0000 (12:36 +0200)]
cam(3): Fix a wrong check and bump WARNS to 2.
Sepherosa Ziehau [Tue, 4 Jun 2013 02:00:27 +0000 (10:00 +0800)]
ifq: Remove the unused parameter 'mpolled' from ifq dequeue interface
The ifq_poll() -> ifq_dequeue() model is not MPSAFE, and mpolled has
not been used, i.e. set to NULL, for years; time to let it go.
Sascha Wildner [Mon, 3 Jun 2013 21:32:24 +0000 (23:32 +0200)]
math.3/tgmath.3: Bump dates and fix minor mdoc issues.
Sascha Wildner [Mon, 20 May 2013 13:02:40 +0000 (15:02 +0200)]
rconfig(8): Stop creating /usr/pkg/etc/mk.conf from the scripts.