Matthew Dillon [Sat, 2 Jun 2012 17:22:04 +0000 (10:22 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Matthew Dillon [Sat, 2 Jun 2012 17:21:03 +0000 (10:21 -0700)]
kernel - Add comment on spinlocks_wr
* Document a side effect related to spinlocks_wr in the LWKT scheduler.
Matthew Dillon [Sat, 2 Jun 2012 17:15:51 +0000 (10:15 -0700)]
kernel - Remove kernel-level ccms module (it will be moved into hammer2)
* Remove the CCMS kernel layer. The CCMS module is going to be moved
directly into hammer2 in order to make hammer2 more portable. For
now that means moving the files into vfs/hammer2 in the hammer2 branch.
* CCMS is a logical cache coherency locking layer that has been in the
DragonFly tree for a while but was not enabled by default. Originally
the plan was to not lock vnodes across operations but to instead acquire
the appropriate CCMS lock(s), but rewiring all the filesystems proved to
be too large a task.
* HAMMER2's cluster work is going to need this layer for real, but nothing
else does. What we will do instead (eventually) is add a mount flag to
allow us to avoid locking vnodes across VNOPS calls which HAMMER2 will be
able to specify.
Matthew Dillon [Thu, 31 May 2012 17:26:35 +0000 (10:26 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Sepherosa Ziehau [Sun, 27 May 2012 13:16:46 +0000 (21:16 +0800)]
igb: Optimize TX path
Reduce the number of TX ring status reports to at most 16 for every TX
descriptor count's worth of transmissions. It is unnecessary to report
status for every TX descriptor. This could greatly reduce bus traffic.
Use "Transmit Completions Head Write Back" as mentioned in the datasheet.
In this model, TX descriptors are no longer written by hardware, thus cache
thrashing is avoided. This also greatly reduces the complexity of igb_txeof.
Implementation notes for "Transmit Completions Head Write Back":
- HWBTHRESH is not used, since:
o 82575 does not support it
o Number of status reports is already greatly reduced
- WB_on_EITR is not used, since:
o 82575 does not support it
o It will cause unnecessary head write-back
Performance is almost the same as the previous code:
- 1.48Mpps for 18bytes UDP datagram
- Line rate for 1472bytes UDP datagram and TCP stream
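The head write-back model described above can be sketched roughly as
follows. This is a minimal illustration and not the igb driver code: the
struct, its field names, and tx_reclaim() are hypothetical. It assumes the
hardware writes the index of the next descriptor it will process into host
memory, so the driver can reclaim completed descriptors by reading only
that index.

```c
#include <stdint.h>

/* Hypothetical TX ring state; these names are illustrative, not igb's. */
struct tx_ring {
	uint32_t ndesc;			/* number of descriptors in the ring */
	uint32_t next_to_clean;		/* first descriptor not yet reclaimed */
	volatile uint32_t *hw_head;	/* head index written back by hardware */
};

/*
 * Reclaim every descriptor the hardware has finished with and return
 * how many were reclaimed.  Only the written-back head is read; the
 * descriptors themselves are never touched.  The head is assumed to be
 * a valid index, i.e. smaller than ndesc.
 */
static uint32_t
tx_reclaim(struct tx_ring *r)
{
	uint32_t head = *r->hw_head;
	uint32_t done = 0;

	while (r->next_to_clean != head) {
		/* A real driver would unload the DMA map and free the buffer. */
		r->next_to_clean = (r->next_to_clean + 1) % r->ndesc;
		done++;
	}
	return done;
}
```

Because completion is derived from a single memory word, the reclaim loop
needs no per-descriptor status check, which is presumably where the
simplification of igb_txeof comes from.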
Sepherosa Ziehau [Thu, 31 May 2012 09:32:08 +0000 (17:32 +0800)]
tcp: Adjust tcpcb fields comment about NewReno fast recovery
We have SACK based fast recovery; don't limit the fields to NewReno
Sascha Wildner [Thu, 31 May 2012 05:45:57 +0000 (07:45 +0200)]
kernel/drm: Remove bogus .PATHs.
François Tigeot [Tue, 29 May 2012 21:12:15 +0000 (23:12 +0200)]
drm: Stow drivers for various chip families
putting them into their own subdirectories in sys/dev/drm/
Inspired-by: David Shao's dflygsocdrm work
Aggelos Economopoulos [Wed, 30 May 2012 14:03:21 +0000 (16:03 +0200)]
Fix for password truncation when using crypt(3) with DES
Passwords containing a 0x80 byte (UTF-8 encoded ones, ASCII and
ISO-8859-* not affected) would get truncated as if a '\0' byte
had been encountered. This could result in some very weak passwords.
Reported-by: Rubin Xu, Joseph Bonneau, Donting Yu (CVE-2012-2143)
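The truncation can be illustrated with a small sketch. This is a simplified
model of the key setup loop, not the crypt(3) source: the function name and
the "fixed" switch are hypothetical. It assumes each password byte is
shifted left by one bit into an 8-bit key buffer, and that the buggy
variant tested the shifted byte for zero instead of the original byte.

```c
#include <stddef.h>

#define DES_KEY_BYTES 8

/*
 * Fill keybuf from the password and return how many password bytes
 * were actually consumed.  With fixed == 0 the end-of-string test is
 * done on the shifted byte (the assumed bug); with fixed != 0 it is
 * done on the original byte.
 */
static size_t
des_key_setup(const char *key, unsigned char keybuf[DES_KEY_BYTES], int fixed)
{
	size_t i, used = 0;

	for (i = 0; i < DES_KEY_BYTES; i++) {
		unsigned char c = (unsigned char)key[used];

		/* 0x80 << 1 is 0x100, which truncates to 0 in 8 bits. */
		keybuf[i] = (unsigned char)(c << 1);
		if (fixed ? (c != 0) : (keybuf[i] != 0))
			used++;
	}
	return used;
}
```

In this model a password whose second byte is 0x80 contributes only its
first byte to the key in the buggy variant, while the fixed variant
consumes all eight bytes.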
Sepherosa Ziehau [Wed, 30 May 2012 08:23:48 +0000 (16:23 +0800)]
icmp: Discard ICMP Source Quench per RFC6633
Sepherosa Ziehau [Wed, 30 May 2012 05:16:02 +0000 (13:16 +0800)]
tcp: Only tcpopt.to_flags are needed in tcp_recv_dupack()
While I'm here, change tcpopt.to_flags from u_long to u_int
Sepherosa Ziehau [Wed, 30 May 2012 03:48:03 +0000 (11:48 +0800)]
tcp: Even for PAWS tolerance, no segments should follow segment with FIN
Sepherosa Ziehau [Tue, 29 May 2012 09:12:07 +0000 (17:12 +0800)]
tcp: Don't let fast retransmit disrupt RTO rebasing
While I'm here, add and adjust comment about spurious timeout retransmit
detection.
Sepherosa Ziehau [Wed, 30 May 2012 03:34:22 +0000 (11:34 +0800)]
tcp/reass: Fix the cases where FIN got lost during reassembly
While I'm here, set SACK report's right edge correctly if the current
segment could be merged with its succeeding segment.
Sepherosa Ziehau [Wed, 30 May 2012 01:43:50 +0000 (09:43 +0800)]
tcp/sack: If other side reneged, discard the current SACK scoreboard
Other side reneging is detected using the first SACK record:
If its left edge is less than or equal to the cumulative ACK of the
incoming segment, other side probably reneged.
This fixes the later assertion that the first SACK record's left edge
must be above snd_una in tcp_sack_first_unsacked_len()
Add statistics about other side reneging
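The detection rule above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the tcp_sack.c code: the
struct, the SEQ_LEQ definition, and sack_peer_reneged() are written here
for the example, and sequence numbers are assumed to be compared with the
usual wraparound-safe 32-bit arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t tcp_seq;

/* Wraparound-safe "a <= b" for 32-bit TCP sequence numbers. */
#define SEQ_LEQ(a, b)	((int32_t)((a) - (b)) <= 0)

/* Hypothetical SACK record: [start, end) in sequence space. */
struct sack_block {
	tcp_seq start;	/* left edge */
	tcp_seq end;	/* right edge */
};

/*
 * Return true if the first SACK record's left edge is at or below the
 * cumulative ACK of the incoming segment, which is the condition the
 * commit message treats as "other side probably reneged".
 */
static bool
sack_peer_reneged(const struct sack_block *first, tcp_seq th_ack)
{
	return SEQ_LEQ(first->start, th_ack);
}
```

When the check fires, the caller would drop the whole scoreboard rather
than try to repair it, as the commit title describes.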
Sepherosa Ziehau [Tue, 29 May 2012 08:09:47 +0000 (16:09 +0800)]
socket: Fix wrongly numbered SIOCGIFDATA
While I'm here, add comment about used number in 'i' group
DragonFly-bug: http://bugs.dragonflybsd.org/issues/1897
Sascha Wildner [Mon, 28 May 2012 11:33:17 +0000 (13:33 +0200)]
drm.4: A little clean up.
François Tigeot [Sun, 27 May 2012 20:06:37 +0000 (22:06 +0200)]
kernel: increase watchdog default period to 30s
Reducing idle cpu time and power a bit
Sepherosa Ziehau [Mon, 28 May 2012 06:33:28 +0000 (14:33 +0800)]
tcp/sack: Constify function arguments if possible
Sepherosa Ziehau [Mon, 28 May 2012 02:38:08 +0000 (10:38 +0800)]
man/ktr: Adjust for the recent ether function cleanup
Reminded-by: swildner@
Sepherosa Ziehau [Mon, 28 May 2012 02:32:41 +0000 (10:32 +0800)]
tcp/sack: Only retransmit unSACKed data when fast retransmit
Sepherosa Ziehau [Sun, 27 May 2012 11:32:00 +0000 (19:32 +0800)]
pktgen: Unbreak compile
François Tigeot [Sun, 27 May 2012 06:40:27 +0000 (08:40 +0200)]
kernel: in_cksum2.s is needed by inet6 code
Sascha Wildner [Sat, 26 May 2012 23:00:15 +0000 (01:00 +0200)]
Remove a few more casts of NULL to some pointer type.
Francois Tigeot [Sat, 26 May 2012 14:07:50 +0000 (16:07 +0200)]
kernel: tcp_fasttimo() is dead
* It was actually killed in 1999
* Remove its last two remaining references
Sepherosa Ziehau [Sat, 26 May 2012 15:06:07 +0000 (23:06 +0800)]
pci: Disable PCI express memory mapped access method by default
It seems to hang some systems during boot.
Reported-by: y0netan1@
Sepherosa Ziehau [Sat, 26 May 2012 15:05:01 +0000 (23:05 +0800)]
tools: Add netblast
Obtained-from: FreeBSD
Sascha Wildner [Sat, 26 May 2012 11:40:43 +0000 (13:40 +0200)]
acpi: strupr() isn't used anywhere, so remove it.
Sascha Wildner [Sat, 26 May 2012 08:21:02 +0000 (10:21 +0200)]
ndis.4: Comment out an unneeded sentence.
It is supported on all platforms we have.
Venkatesh Srinivas [Sat, 26 May 2012 03:15:00 +0000 (20:15 -0700)]
Merge branch 'master' of /repository/git/dragonfly
Sascha Wildner [Fri, 25 May 2012 21:28:33 +0000 (23:28 +0200)]
kernel: Remove the inclusion of opt_ddb.h from where it is unnecessary.
None of these files uses DDB, DDB_UNATTENDED or GDB_REMOTE_CHAT (which
is what opt_ddb.h defines).
Venkatesh Srinivas [Fri, 25 May 2012 19:43:58 +0000 (12:43 -0700)]
libc -- dmalloc: Call malloc_init as-needed, rather than via ctor (#2)
This commit is a second revision of
e12d3396c777165504d60d2a1408dcd7cb63660d; for details, see the original
commit message.
That commit was reverted quickly, as it broke pthreads; this revision
does not suffer from that problem, as it preserves the __constructor
logic for malloc_init.
Reverts:
4018c6eddd57f4abf9134690cbfa46c9d7103558 (Revert libc ...)
Reported-by: marino@
Closes-bug: 2305
Sascha Wildner [Fri, 25 May 2012 18:07:33 +0000 (20:07 +0200)]
Remove some useless casts of NULL to another pointer type.
Sepherosa Ziehau [Fri, 25 May 2012 08:28:55 +0000 (16:28 +0800)]
pci: Print PCIe memory mapped accessing information a little bit earlier
Sepherosa Ziehau [Fri, 25 May 2012 07:38:50 +0000 (15:38 +0800)]
tcp: Enable RFC3517bis by default
Sepherosa Ziehau [Fri, 25 May 2012 07:23:59 +0000 (15:23 +0800)]
tcp: Function renaming
tcp_recv_dupack() is probably a better name than tcp_fast_recovery(),
since the function does more than fast recovery.
Sepherosa Ziehau [Fri, 25 May 2012 06:18:18 +0000 (14:18 +0800)]
tcp/sack: Fix off-by-one bug when updating rescue SACK information
Sepherosa Ziehau [Thu, 24 May 2012 08:09:57 +0000 (16:09 +0800)]
tcp/sack: Force out more segments allowed by "pipe" during fast recovery
If some segments are cumulatively acked or SACKed, and HighRxt equals
snd_una, one segment (new or retransmit) will be forced out even if cwnd
and pipe don't allow it. When a large number of segments are lost, i.e.
the computed pipe could be large, this avoids unnecessary retransmit
timeouts and could perform as well as NewReno.
Sysctl node net.inet.tcp.force_sackrxt could be tuned to burst out several
retransmits; the default is 1 (should be good enough). If this sysctl is set
to 0, SACK based fast recovery will obey the computed pipe.
Graphs of several unnecessary retransmit timeouts as described above:
http://leaf.dragonflybsd.org/~sephe/no_force_sack_rexmt2_15.xpl (starts @15s)
http://leaf.dragonflybsd.org/~sephe/no_force_sack_rexmt_54.xpl (starts @54s)
Sepherosa Ziehau [Thu, 24 May 2012 05:35:36 +0000 (13:35 +0800)]
tcp/sack: Use RFC3517bis IsLost(snd_una) as fallback of early retransmit
Since we are less certain about whether the segment is lost or not when
using IsLost(snd_una), we do not send out other unSACKed segments except
the first unSACKed segment under this condition. Sending out other
unSACKed segments could be too aggressive here; just wait for another
ACK to tick out more unSACKed segments.
Sascha Wildner [Thu, 24 May 2012 18:16:10 +0000 (20:16 +0200)]
kernel: Remove some bogus casts to the own type (FINAL).
Sascha Wildner [Thu, 24 May 2012 17:26:08 +0000 (19:26 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Thu, 24 May 2012 17:19:30 +0000 (19:19 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Thu, 24 May 2012 08:35:00 +0000 (10:35 +0200)]
kernel: Remove some bogus casts to the own type.
Sepherosa Ziehau [Wed, 23 May 2012 09:38:30 +0000 (17:38 +0800)]
tcp/sack: Fix the condition that SACK rescue retransmit can't be done
If we have nothing left above the HighRxt, the first unSACKed segment
will be used as the SACK rescue retransmit.
Sepherosa Ziehau [Wed, 23 May 2012 09:37:39 +0000 (17:37 +0800)]
tcp: Indentation
Venkatesh Srinivas [Thu, 24 May 2012 02:15:25 +0000 (19:15 -0700)]
kernel -- CLFLUSH support
* Introduce a kernel variable, 'vmm_guest', signifying whether the
kernel is running in a virtual environment, such as KVM. This is
set based on the CPUID2.VMM flag on kernels and set automatically
on virtual kernels.
* Introduce wrappers for CLFLUSH instructions.
* Provide a tunable, hw.clflush_enable, to auto-enable CLFLUSH on h/w (-1),
disable always (0), or enable always (1).
Closes-bug: 2363
Reviewed-by: ftigeot@
From: David Shao, FreeBSD
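The three tunable settings can be sketched as a small decision function.
This is an illustration only: clflush_should_enable() is a hypothetical
helper, and it assumes that the automatic setting (-1) means "use CLFLUSH
on real hardware but not when vmm_guest is set", and that CLFLUSH is never
used when the CPU does not advertise it.

```c
#include <stdbool.h>

/*
 * Decide whether CLFLUSH should be used.
 *   clflush_enable:  value of the hw.clflush_enable tunable (-1, 0, 1)
 *   cpu_has_clflush: CPU advertises the instruction (assumed input)
 *   vmm_guest:       kernel is running in a virtual environment
 */
static bool
clflush_should_enable(int clflush_enable, bool cpu_has_clflush, bool vmm_guest)
{
	if (!cpu_has_clflush)
		return false;		/* nothing to enable */

	switch (clflush_enable) {
	case 0:
		return false;		/* disable always */
	case 1:
		return true;		/* enable always */
	default:
		return !vmm_guest;	/* -1: assumed auto policy */
	}
}
```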
Sascha Wildner [Wed, 23 May 2012 20:42:46 +0000 (22:42 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Wed, 23 May 2012 19:30:05 +0000 (21:30 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Wed, 23 May 2012 19:28:32 +0000 (21:28 +0200)]
kernel/linux: Fix a wrong cast (introduced in
e54488bb).
Sascha Wildner [Wed, 23 May 2012 16:36:44 +0000 (18:36 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Wed, 23 May 2012 16:01:25 +0000 (18:01 +0200)]
kernel: Remove some bogus casts to the own type.
Sepherosa Ziehau [Wed, 23 May 2012 05:44:52 +0000 (13:44 +0800)]
tcp: Simplify "extended limited transmit" logic a little bit
Don't follow the RFC4653 or RFC3517bis "extended limited transmit"
description verbatim; increase cwnd once and let tcp_output() do
description verbatimly; increase cwnd once and let tcp_output() do
the job.
Sepherosa Ziehau [Wed, 23 May 2012 03:14:02 +0000 (11:14 +0800)]
tcp: Optimize SACK scoreboard records consolidation a little bit
If the SACK block and SACK scoreboard record are matched exactly,
SACK scoreboard records consolidation is not needed at all.
Sascha Wildner [Tue, 22 May 2012 13:02:01 +0000 (15:02 +0200)]
Revert "libc -- dmalloc: Call malloc_init as-needed, rather than via cc constructor."
This reverts commit
e12d3396c777165504d60d2a1408dcd7cb63660d.
Sepherosa Ziehau [Tue, 22 May 2012 08:10:21 +0000 (16:10 +0800)]
acpica: Unbreak LINT/LINT64 building
Sepherosa Ziehau [Tue, 22 May 2012 07:55:45 +0000 (15:55 +0800)]
acpi/timer: Fix return value
Magliano Andrea [Fri, 11 May 2012 13:59:11 +0000 (15:59 +0200)]
acpidb: regenerate osunixxf.c.patch
someone please take care of dfly header, if necessary;
i applied the patch by hand and pulled in a git diff
Magliano Andrea [Fri, 11 May 2012 13:58:52 +0000 (15:58 +0200)]
acpidb: add missing evglock.c to Makefile
Magliano Andrea [Fri, 11 May 2012 08:42:56 +0000 (10:42 +0200)]
Fix iasl compilation
basically sync with svn://svn.freebsd.org/base/head@220663
Magliano Andrea [Fri, 11 May 2012 08:19:52 +0000 (10:19 +0200)]
Some files overlooked on first commit...
Magliano Andrea [Fri, 11 May 2012 07:19:24 +0000 (09:19 +0200)]
Revert previous commit (wrong attempt)
and do like svn://svn.freebsd.org/base/head@220663
it doesn't seem possible with bsd Makefile infrastructure
to set source target specific flags
Magliano Andrea [Fri, 11 May 2012 06:12:10 +0000 (08:12 +0200)]
First import (compiles, seems to run correctly)
Taken from FreeBSD r222544:218590 (patch applied),
not from acpica repository.
One problem shown (no longer reproducible, skewed build?):
in bootverbose mode 'domain0 misses processors, should be 2, got 1'
sysctl shows hw.acpi.cpu0 only, other cpus are missing;
seems an error in evaluating C009 Method in aml code...
TODO:
* iasl compiler Makefile has to be reworked because of specific
YASL flags for new files dtparser.[yl]
* 'EVENTHANDLER_INVOKE(power_suspend)' to be integrated in acpi.c
* atomic_load_acq_64 isn't implemented (used in acpi_hpet.c)
* sc->tc.tc_quality isn't available; to be investigated
* acpi_timer_test() improved implementation not integrated
* ACPI_CAP_SMP_C3_NATIVE and ACPI_CAP_PX_HW_COORD in acpivar.h
left out, as FreeBSD doesn't use them either
Sepherosa Ziehau [Mon, 21 May 2012 08:58:31 +0000 (16:58 +0800)]
igb: Add to x86_64 and i386 GENERIC
Sepherosa Ziehau [Mon, 21 May 2012 08:35:02 +0000 (16:35 +0800)]
LINT: Add igb(4)
Matthew Dillon [Sun, 20 May 2012 18:12:58 +0000 (11:12 -0700)]
hammer2 - Fix lost flush
* hammer2 allows the buffer cache buffers related to MODIFIED but unlocked
chains to be retired by the OS. In this situation hammer2 does not want
to bdwrite() the buffer again unless additional modifications are made,
even though the MODIFIED bit in the chain remains set throughout the
entire sequence.
* Fix a case where these additional modifications were not properly flagging
for the buffer cache buffer to be retired with a bdwrite(), causing data
loss. This is related to the DIRTYBP chain flag.
* Make further adjustments to the DIRTYBP chain flag.
* Also fix a case where the MOVED bit might not get properly set when a
block is resized. The problem was masked by the fact that a resize
only occurs on data blocks and only during a write(), so the related
buffer was being marked MODIFIED anyway. However, the resize code still
needed to be corrected.
* Add some debugging to 'hammer2 stat' to make it easier to poke around
related kernel structures.
Matthew Dillon [Sun, 20 May 2012 18:12:44 +0000 (11:12 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Venkatesh Srinivas [Sun, 20 May 2012 14:10:56 +0000 (07:10 -0700)]
libc -- dmalloc: Call malloc_init as-needed, rather than via cc constructor.
dmalloc requires its own _nmalloc_thr_init be called before it can service
allocations. Applications with preinit arrays were able to call malloc before
constructors ran, which caused them to crash on uninitialized allocator state.
The change uses a flag to test for allocator init state. It is also careful
to not allow _nmalloc_thr_init to be called recursively from within pthread
initialization (slglobal.masked).
Reported-by: marino@
Closes-bug: 2305
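The flag-based approach described above can be sketched as follows. This
is a simplified, single-threaded illustration and not the dmalloc code:
all names are hypothetical, the system malloc() stands in for the real
allocator, and a real implementation would need proper synchronization.

```c
#include <stdlib.h>

static int alloc_ready;		/* set once initialization has completed */
static int alloc_in_init;	/* set while initialization is running */
static int alloc_init_calls;	/* counts initializations, for illustration */

/* Stand-in for the allocator's one-time setup. */
static void
alloc_init(void)
{
	alloc_init_calls++;
}

/*
 * Allocate memory, initializing the allocator on first use instead of
 * relying on a constructor having run.  The in_init flag keeps the
 * initializer from being re-entered if it allocates memory itself.
 */
static void *
lazy_malloc(size_t size)
{
	if (!alloc_ready && !alloc_in_init) {
		alloc_in_init = 1;
		alloc_init();
		alloc_in_init = 0;
		alloc_ready = 1;
	}
	return malloc(size);
}
```

With this shape, a caller that allocates before any constructor has run
still sees an initialized allocator, and later calls only pay for the
flag test.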
Sepherosa Ziehau [Sun, 20 May 2012 13:52:01 +0000 (21:52 +0800)]
netif: Remove no longer used e1000 layout
Sepherosa Ziehau [Wed, 25 Apr 2012 12:42:40 +0000 (20:42 +0800)]
igb: Import Intel igb-2.2.3
Local changes
- Laundry the code
- Rewrite busdma related code
- Rewrite RX path
- Enable hardware TX IP checksum
Integration w/ DragonFly's RSS and TX path optimization will be
conducted in the repository.
Tested-with: 82576 82575EB
Sepherosa Ziehau [Thu, 19 Apr 2012 14:11:00 +0000 (22:11 +0800)]
ig_hal: Merge Intel igb-2.2.3 HAL w/ em-7.2.4 HAL
Sepherosa Ziehau [Thu, 19 Apr 2012 13:57:50 +0000 (21:57 +0800)]
e1000: Unhook from building, prepare for the new igb
Sascha Wildner [Sun, 20 May 2012 02:47:11 +0000 (04:47 +0200)]
kernel/devfs: Remove the unused devfs Makefile.
Matthew Dillon [Sat, 19 May 2012 22:21:01 +0000 (15:21 -0700)]
hammer2 - Add 'hammer2 stat'
* Add the 'hammer2 stat' directive to access inode information not
available from a normal stat.
Currently reports ncopies, data_count, inode_count, data_quota, and
inode_quota.
Matthew Dillon [Sat, 19 May 2012 22:17:03 +0000 (15:17 -0700)]
hammer2 - Get data-usage aggregation working, add INODE_GET
* Cleanup aggregation of data_count and inode_count in the inode.
data_count should now work properly (though it requires a 'sync'
if you want up-to-date information).
This allows data and inode usage for an entire sub-tree to be
retrieved from the parent directory inode. No need to run 'du'
over millions of inodes.
The new 'hammer2 stat' command can be used to access the info.
* Add the HAMMER2IOC_INODE_GET/SET ioctls to access information that
cannot be obtained from a normal stat().
Matthew Dillon [Sat, 19 May 2012 19:07:40 +0000 (12:07 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Venkatesh Srinivas [Sat, 19 May 2012 03:33:56 +0000 (20:33 -0700)]
kernel -- tmpfs: Convert tmpfs inode counter to per-mount field
tmpfs used a global counter under a spinlock to set inode numbers. This
should be a per-mount field, protected by the mount lock.
Matthew Dillon [Sat, 19 May 2012 02:18:14 +0000 (19:18 -0700)]
hammer2 - Flush ordering fixes
* The flush code is required to write out modified chains, not just bdwrite()
them. Otherwise the disk synchronization and volume header write will be
mis-ordered.
* Don't re-write indirect blocks that the OS had already written out. This
check is already being made for data blocks, and inode modifications are
embedded and thus must always be written out.
* This fixes issues where 'hammer2 show <device>' would find corrupt
topology during concurrent filesystem write activity. The disk media
is always supposed to be consistent.
We don't care about block-reuse cases for this debug command but we do
care that, sans block-reuse, a dump will produce a consistent topology.
Matthew Dillon [Sat, 19 May 2012 00:19:17 +0000 (17:19 -0700)]
hammer2 - general stabilization, flusher, mmap, etc
* Revamp the flush logic. Flushes now stage the blockref related to the
data written out to the media. Higher level chains save the staged
blockref instead of the current blockref.
* This allows flushes to occur concurrent with active modification of the
topology without having to restart the flush. Modifications made after
the flush has started running will remain intact and not be committed
to media until the next flush (see note).
NOTE: Currently chain deletions break this, but this is the only issue
currently.
* Fix lost chains during unmount. Deleted chains can still have the MOVED
and/or MODIFIED bits set, which adds additional refs and prevents them
from being freed.
Detect when a chain is being deleted permanently (versus temporarily due
to a rename) and clean out the bits in question.
NOTE: Currently deletions are removed from the in-memory topology, which
is why the previous NOTE above is still a problem, so we will need
to fix this and to retain at least the MOVED for flushes in
progress.
* Fix data corruption related to unflagged chains which wind up not getting
flushed and also due to a bug in the indirect block management code.
* Fix a mmap() access failure for cached direct-data (less than 512 bytes).
nvextendbuf() was not being called for the direct-data case during the
write().
* Buildworld with a HAMMER2 /usr/obj now succeeds.
* 'hammer2 pfs-create <label>' now defaults to a pfstype of MASTER,
instead of requiring that the pfstype always be specified.
Matthew Dillon [Sat, 19 May 2012 00:01:15 +0000 (17:01 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Sascha Wildner [Fri, 18 May 2012 23:57:25 +0000 (01:57 +0200)]
amr(4): Some fixes.
* Bring in some small updates from FreeBSD.
* Add MODULE_VERSION.
* Make the interrupt handler MPSAFE. This was a porting oversight by me.
Sascha Wildner [Fri, 18 May 2012 18:10:50 +0000 (20:10 +0200)]
Fix some typos in manual pages.
Sascha Wildner [Fri, 18 May 2012 16:53:28 +0000 (18:53 +0200)]
bsd-family-tree: Sync with FreeBSD.
Sascha Wildner [Fri, 18 May 2012 11:16:32 +0000 (13:16 +0200)]
builtin.1: Bring in some enhancements from FreeBSD.
It is modeled after what they did but based on what we actually have in
our shells' source.
* Use "No**" to mark commands which exist externally but are implemented
as a script executing the builtin.
* Some further explanations and mdoc fixes.
Sascha Wildner [Fri, 18 May 2012 11:05:09 +0000 (13:05 +0200)]
builtin.1: Add two more built-in commands.
Sepherosa Ziehau [Fri, 18 May 2012 07:29:39 +0000 (15:29 +0800)]
tcp: Implement RFC4653 Non-Congestion Robustness (NCR)
It is enabled by default and can be disabled using sysctl node:
net.inet.tcp.ncr
As far as I have tested on a heavily reordered network path, this
algorithm does avoid most of the spurious fast retransmits, while
on a normal network path fast retransmits can still be
triggered properly.
Sepherosa Ziehau [Fri, 18 May 2012 02:33:21 +0000 (10:33 +0800)]
tcp: Improve RFC3517bis support
- Factor out tcp_fast_recovery()
- Delay fast retransmit or fast recovery for duplicated ACK which
carries data or updates receiving window, so that
o The segments sent by fast retransmit/recovery could carry
proper ack sequence and SACK information.
o Receiving window could get updated, so more new data could be
injected into the network by the fast recovery.
Matthew Dillon [Fri, 18 May 2012 03:00:27 +0000 (20:00 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Matthew Dillon [Fri, 18 May 2012 01:41:51 +0000 (18:41 -0700)]
hammer2 - hardlink stabilization (3), data and inode count propagation.
* Files with cached chains have to be flushed before they can be copied
to the hardlink target, because the original inode will become a
OBJTYPE_HARDLINK pointer which isn't allowed to have any sub-chains
under the inode.
* We also need to flush for the upcoming snapshot function to work properly
or dirty in-memory data will not show up in the snapshot.
* Propagate the inode and byte use count up the chain. Tie the inode count
into df's inode count (per-PFS). The byte count and quota fields are not
yet tied in.
* Adjust stat[v]fs() to return filesystem space usage using the allocation
iterator for now, to aid debugging.
* Adjust the allocation iterator to skip reserved areas at the beginning of
each 2GB storage zone.
Sascha Wildner [Thu, 17 May 2012 23:52:22 +0000 (01:52 +0200)]
kernel: Remove some bogus casts to the own type.
Sascha Wildner [Thu, 17 May 2012 23:03:10 +0000 (01:03 +0200)]
builtin.1: Sync with what we have.
Sascha Wildner [Thu, 17 May 2012 21:48:20 +0000 (23:48 +0200)]
share/man/man1/Makefile: One MLINK per line.
Sascha Wildner [Thu, 17 May 2012 20:17:53 +0000 (22:17 +0200)]
examples/rconfig: Some fixes to our installation scripts.
* Allow the script to be run in a netbooted scenario, too.
* Raise the default size of the root partition to 768M (like the
installer's default).
* While here, add some comments and whitespace.
Submitted-by: Joachim de Groot <jdegroot@web.de>
Matthew Dillon [Thu, 17 May 2012 18:34:18 +0000 (11:34 -0700)]
hammer2 - hardlink stabilization pass
* Fix another edge case where nkeybits could exceed 64, resulting in
an assertion.
Matthew Dillon [Thu, 17 May 2012 18:01:51 +0000 (11:01 -0700)]
hammer2 - hardlink stabilization pass
* Fix infinite loop in hammer2_chain_create_indirect() related to the
case where the key range is the full 64 bits, which can occur when
invisible hardlink entries are mixed in with normal entries.
* Fix the nlinks count in a couple of places.
* Don't iterate invisible directory entries. Lookups of hardlink targets
by inode number are absolute. Normal directory entries have a collision
counter, hardlink targets do not.
Matthew Dillon [Thu, 17 May 2012 18:01:30 +0000 (11:01 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Sascha Wildner [Thu, 17 May 2012 14:17:22 +0000 (16:17 +0200)]
vkernel: Fix compilation with profiling support.
The vkernel is a special userland program in the regard that its Makefile
is generated by config(8), which is kind of tailored to the real kernel.
So first of all, we have to modify config(8) to detect it's a vkernel we
want to build and in this case it should not define GPROF which otherwise
activates the real kernel's profiling bits.
Then, modify libkern's mcount.c to skip kernel specific parts too.
Then, modify the vkernels' Makefiles to take into account ${PROF} (and
while we're here, ${DEBUG} too) which are set by the surrounding Makefile
which is generated by config(8).
The vkernel is now (from profiling point of view) treated like any other
userland program.
Last but not least, add some documentation about building a vkernel with
profiling support to vkernel's manpage.
To build with profiling, simply add CONFIGARGS=-p to the buildkernel
command line. It will need the config(8) program to be in /usr/obj's
btools dir, so either a buildworld with this commit needs to be done,
or config can be installed manually to /usr/sbin and nativekernel can
be used.
Tested-by: tuxillo
Sepherosa Ziehau [Thu, 17 May 2012 09:58:41 +0000 (17:58 +0800)]
tcp: Ignore TCP_NOPUSH socketopt by default
Ill-optimized programs that misuse this sockopt will cause network
stalls of unpredictable length if the total sending size is not
aligned to the TCP sending segment size.
sysctl node net.inet.tcp.disable_nopush controls whether TCP_NOPUSH
will take effect or not
I am not going to fight against the stupid programs in the wild.
DragonFly-bug: http://bugs.dragonflybsd.org/issues/2368
This is actually _not_ a bug on our side.
Matthew Dillon [Thu, 17 May 2012 08:53:11 +0000 (01:53 -0700)]
Merge branches 'hammer2' and 'master' of ssh://crater.dragonflybsd.org/repository/git/dragonfly into hammer2
Matthew Dillon [Thu, 17 May 2012 08:36:51 +0000 (01:36 -0700)]
hammer2 - Complete core hardlink support work
This implements core hardlink support for hammer2. In order to maintain the
strict bottom-up block modification hierarchy for the chains hardlinks must
be implemented with special forwarding inodes.
When a hardlink is created (nlinks 1->2) the file is replaced with a
forwarding entry and then recreated as a special hidden directory entry
indexed by its inode number at a higher directory level which is common
to all hardlinks to that file.
The forwarding entry simply specifies the inode number, thus our ability to
trivially snapshot a PFS is retained.
Since the real inode is indexed at a higher common directory locating the
real inode simply requires iterating parent directories until we find a
match.
* Default vfs.hammer2.hardlink_enable to 1 (enabled).
* Track and adjust nlinks.
* Implement OBJTYPE_HARDLINK forwarding directory entry, hidden inode,
vnode->v_data inode replacement for the nlinks 1->2 case, and hidden
inode deletion for the nlinks 1->0 case.
* The deconsolidation for the nlinks 2->1 case is not yet implemented.
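The lookup of the real inode described above can be sketched roughly as
follows. This is an in-memory toy model and not hammer2 code: struct dir,
its fixed-size array of hidden entries, and find_hardlink_target_dir() are
hypothetical. It only shows the idea of walking up parent directories
until one of them holds a hidden entry indexed by the inode number.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_HIDDEN 4

/* Hypothetical directory node for illustration only. */
struct dir {
	struct dir *parent;		/* NULL at the top of the tree */
	uint64_t hidden[MAX_HIDDEN];	/* inode numbers of hidden targets */
	size_t nhidden;			/* number of valid hidden[] entries */
};

/*
 * Starting at the directory holding the forwarding entry, iterate
 * parent directories until one contains a hidden entry for inum.
 * Returns that directory, or NULL if no ancestor has it.
 */
static struct dir *
find_hardlink_target_dir(struct dir *start, uint64_t inum)
{
	struct dir *d;
	size_t i;

	for (d = start; d != NULL; d = d->parent) {
		for (i = 0; i < d->nhidden; i++) {
			if (d->hidden[i] == inum)
				return d;
		}
	}
	return NULL;
}
```

Since the forwarding entry carries only the inode number, the walk needs
no back-pointers from the target to its links.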
Sascha Wildner [Thu, 17 May 2012 08:41:29 +0000 (10:41 +0200)]
kernel/profiling: Fix a kprintf format.