Sascha Wildner [Tue, 5 Jul 2016 19:32:59 +0000 (21:32 +0200)]
getconf(1): Add some variables for backward compatibility.
The standard requires all of these.
Sascha Wildner [Tue, 5 Jul 2016 19:26:48 +0000 (21:26 +0200)]
getconf(1): Fix typo (_POSIX2_EXPR_NEXT_MAX -> _POSIX2_EXPR_NEST_MAX).
Sascha Wildner [Tue, 5 Jul 2016 19:11:40 +0000 (21:11 +0200)]
getconf(1): Add some missing variables.
_POSIX_ADVISORY_INFO
_POSIX_RAW_SOCKETS
_XOPEN_STREAMS
Sascha Wildner [Tue, 5 Jul 2016 18:40:56 +0000 (20:40 +0200)]
getconf(1): Fix confstr variable names.
All these don't have an underscore.
Sepherosa Ziehau [Tue, 5 Jul 2016 14:59:17 +0000 (22:59 +0800)]
cat: Align output from cat(1) between when invoked with -be & -ne flags
Obtained-from: NetBSD
Submitted-by: <venture37 geeklan co uk>
DragonFly-bug: https://bugs.dragonflybsd.org/issues/2922
zrj [Thu, 30 Jun 2016 14:07:11 +0000 (17:07 +0300)]
Remove <varargs.h> from the system.
Similar to what was done with <malloc.h>.
Not a standard header, just a symlink to machine/varargs.h, and it
seems not to be used by anything in base (<stdarg.h> is preferred).
zrj [Fri, 1 Jul 2016 10:32:46 +0000 (13:32 +0300)]
Fix <machine/varargs.h> use cases.
First, varargs.h depended on namespace pollution to provide the typedef of __va_list
to declare va_list, usually through sys/systm.h including sys/stdarg.h.
So short-circuit directly to compiler builtin in case of __GNUC__
Also remove machine/varargs.h usage from other kernel sources:
sys/kern/kern_dsched.c: Not needed (just 3 dummy functions)
sys/dev/misc/tbridge/tbridge.c: Both use just __va_smth variants
sys/kern/subr_taskqueue.c: and get those through sys/systm.h
This leaves all the kernel code using <stdarg.h> variant consistently.
zrj [Thu, 30 Jun 2016 14:56:03 +0000 (17:56 +0300)]
<stdio.h>: Hide macros that break global :: ns in cxx.
Avoid expanding macros ::(!__isthreaded ?...) to poorly written
ports that assume some specific libc/stdio.h implementation.
Will help with patching efforts by needing fewer +<cstdio> patches
in dports that use C++ code.
zrj [Thu, 30 Jun 2016 14:51:18 +0000 (17:51 +0300)]
Move __va_size() into freestanding block.
Mainly to match varargs.h layout. No users outside these headers.
zrj [Thu, 30 Jun 2016 14:12:24 +0000 (17:12 +0300)]
<wchar.h>: Reduce namespace pollution in <wchar.h>.
zrj [Thu, 30 Jun 2016 10:17:44 +0000 (13:17 +0300)]
sys/sys: Protect len and inout parameters in _IOC definition.
This should reduce the likelihood of _IOC() macro expanding to something
that wasn't intended and would provide a more flexible interface too.
While there, remove hardcoded value for IOC_DIRMASK
Taken-from: FreeBSD
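A minimal sketch of why unparenthesized macro parameters are risky. The
EX_* names, bit layout and mask below are hypothetical and are not the
real _IOC() definition; they only show how an expression argument can
change the result when parameters are not protected.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only: direction in the top
 * bits, a 13-bit length, an 8-bit group and an 8-bit command number. */
#define EX_OUT		0x40000000U
#define EX_IN		0x80000000U
#define EX_LENMASK	0x1fffU

/* Unprotected: parameters are pasted into the expression as-is. */
#define EX_IOC_BAD(inout, group, num, len) \
	((uint32_t)(inout | ((len & EX_LENMASK) << 16) | (group << 8) | num))

/* Protected: every parameter is parenthesized before it is used. */
#define EX_IOC_GOOD(inout, group, num, len) \
	((uint32_t)((inout) | (((len) & EX_LENMASK) << 16) | ((group) << 8) | (num)))

/* A conditional expression as the direction argument.  In the bad macro
 * the ?: swallows the rest of the expression, so for is_write != 0 the
 * result is just EX_IN. */
uint32_t
ex_cmd_bad(int is_write)
{
	return EX_IOC_BAD(is_write ? EX_IN : EX_OUT, 'x', 1, 8);
}

uint32_t
ex_cmd_good(int is_write)
{
	return EX_IOC_GOOD(is_write ? EX_IN : EX_OUT, 'x', 1, 8);
}
```

With the protected variant the caller may pass any expression for each
parameter and still get the intended encoding.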
zrj [Fri, 1 Jul 2016 06:04:56 +0000 (09:04 +0300)]
rpc: Whitespace cleanup.
While there, perform license change as per FreeBSD r258581
zrj [Thu, 30 Jun 2016 09:57:55 +0000 (12:57 +0300)]
rpc: Make a few headers more compatible with gcc.
Previously gcc compilers from dports installed patched versions
of rpc headers that override the system ones.
By applying small changes, the headers are no longer patched and
gcc dports no longer need rebuilding to account for possible
changes in include/rpc headers after installworld.
While there perform some minor cleanup.
No functional change.
Sascha Wildner [Mon, 4 Jul 2016 08:50:47 +0000 (10:50 +0200)]
<pthread.h>: Include <machine/limits.h> instead of <limits.h> for ULONG_MAX.
Also include <limits.h> in a couple of files that were missing it.
This commit will break 4 ports:
devel/clanlib1
games/orbital_eunuchs_sniper
games/zatacka
sysutils/cdargs
These will be fixed shortly.
François Tigeot [Sun, 3 Jul 2016 06:26:40 +0000 (08:26 +0200)]
drm/linux: Improve spin_unlock_irqrestore()'s implementation
Prevents compilation failures in functions not using
spin_lock_irqsave() first.
François Tigeot [Sat, 2 Jul 2016 15:31:25 +0000 (17:31 +0200)]
installer: Do not waste too many inodes on /boot
* A fully populated /boot with kernel, kernel.old, kernel.alt
and associated modules needs approximately 2K inodes
* With a 1GB /boot partition size, default newfs parameters
allocate 128K inodes
* Reduce this amount to 15K inodes, thus making an additional
13MB of disk space available on /boot
Tomohiro Kusumi [Sun, 26 Jun 2016 09:39:42 +0000 (18:39 +0900)]
sbin/hammer: Make global PFS/accounting variables static
Tomohiro Kusumi [Sat, 25 Jun 2016 16:35:00 +0000 (01:35 +0900)]
sys/vfs/hammer: Remove validate_zone()
Code becomes less clear with this function and usage of sum of results.
Just check if the given offsets are data zones or not (unless we plan
to support non-data zones, but we don't).
François Tigeot [Thu, 30 Jun 2016 05:38:34 +0000 (07:38 +0200)]
drm: Restore DRM_DEBUG_VBLANK() calls
François Tigeot [Wed, 29 Jun 2016 19:21:22 +0000 (21:21 +0200)]
drm/i915: Use the spin_lock_irq() family of functions (2/2)
Further reducing differences with Linux 4.3.
François Tigeot [Wed, 29 Jun 2016 06:12:10 +0000 (08:12 +0200)]
drm: Use the spin_lock_irq() family of functions
Reducing differences with Linux 4.3
François Tigeot [Wed, 29 Jun 2016 06:10:13 +0000 (08:10 +0200)]
drm/i915: Use the spin_lock_irq() family of functions
Reducing differences with Linux 4.3
Matthew Dillon [Wed, 29 Jun 2016 02:14:43 +0000 (19:14 -0700)]
kernel - Enhance buffer flush and cluster_write linearity (2)
* Fix bug in last commit. When looping the buffer has to be reset to
the marker or the iteration can wind up on the wrong queue.
* Also count the INVAL case in the loop instead of breaking out.
Matthew Dillon [Wed, 29 Jun 2016 01:58:58 +0000 (18:58 -0700)]
hammer2 - Fix inode destroy panic
* Fix a race in hammer2_inode_xop_destroy() when deleting an inode chain.
The parent can be ripped out from under the code before it gets both
parent and chain locked, resulting in an assertion in hammer2_chain_delete().
Properly test the linkage and retry if the parent changes.
Matthew Dillon [Wed, 29 Jun 2016 01:52:29 +0000 (18:52 -0700)]
kernel - Enhance buffer flush and cluster_write linearity
* flushbufqueues() was iterating between cpus, taking only one buffer off
of each cpu's queue. This forced non-linearity on flush, messing up
sequential performance for HAMMER1 and HAMMER2. For HAMMER2 this also
caused physical blocks to be allocated out of order.
Add sysctl vfs.flushperqueue to specify the number of buffers to flush
per cpu before iterating the pcpu queue. Default 1024.
* cluster_write() no longer requires that a buffer be VOP_BMAP()'d
successfully in order to issue writes. This affects HAMMER2, which does
not assign physical device blocks until the logical buffer is actually
flushed to the backend device.
* Fixes non-linearity problems for buffer daemon flushbufqueues() calls,
and for cluster_write() with or without write_behind.
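A simplified model of the flush ordering described above, not the kernel
code. It assumes each cpu queue holds a run of consecutive logical blocks
(a deliberate simplification), and the function name and EX_MAXCPU limit
are made up for the illustration.

```c
#include <stddef.h>

#define EX_MAXCPU 16	/* hypothetical upper bound for this sketch */

/*
 * Model: ncpu queues, each holding nper buffers; queue c holds logical
 * blocks c*nper .. c*nper+nper-1.  Take up to 'perqueue' buffers from a
 * queue before moving on to the next one and record the resulting flush
 * order in out[].  Returns the number of entries written.
 */
size_t
ex_flush_order(size_t ncpu, size_t nper, size_t perqueue, size_t *out)
{
	size_t next[EX_MAXCPU] = { 0 };	/* next unflushed index per queue */
	size_t total = ncpu * nper;
	size_t n = 0;

	if (ncpu == 0 || ncpu > EX_MAXCPU || perqueue == 0)
		return 0;
	while (n < total) {
		for (size_t c = 0; c < ncpu; c++) {
			for (size_t i = 0; i < perqueue && next[c] < nper; i++)
				out[n++] = c * nper + next[c]++;
		}
	}
	return n;
}
```

With two queues of four buffers, perqueue = 1 yields 0,4,1,5,2,6,3,7
while perqueue = 4 yields 0..7 in order, which is the effect a larger
vfs.flushperqueue is meant to have.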
Matthew Dillon [Tue, 28 Jun 2016 23:12:46 +0000 (16:12 -0700)]
hammer2 - Optimize indirect block algorithm
* Pack indirect blocks for linear files significantly better.
* First level indirect block for directories reduced to 4KB (32 entries).
* For now make the first level indirect block for directories cover the
entire hash range for either inodes or directory entries (63 bits).
Matthew Dillon [Tue, 28 Jun 2016 07:26:06 +0000 (00:26 -0700)]
hammer2 - Stabilization pass
* If the HAMMER2_CHAIN_DEDUP flag is set modified_needs_new_allocation()
must return 1 to force a new allocation. This fixes a number of dirty
buffer rewrite cases that broke dedup.
* Do not try to dedup a chain flagged MODIFIED or INITIAL.
* The indirect-block deletion code in the flusher needed to also count
blockrefs if it hadn't been done yet. This fixes cases of missing
directory entries.
* For now use a transaction in hammer2_strategy_write(). We probably don't
need it due to the way the logical buffer cache is handled, but do it
anyway.
* Clean-up some of the code documentation.
* Implement sysctls for dedup and buffer invalidation enablement. dedup
is turned on by default, invalidation is turned off. Invalidation is
not currently working well.
Matthew Dillon [Mon, 27 Jun 2016 20:08:56 +0000 (13:08 -0700)]
hammer2 - Remove the hidden directory, rework deletions
* Now that inodes are separately indexed we no longer need the hidden
directory abstraction to handle unlinked-but-open files. Get rid of
ALL the hidden directory handling code.
* Rework xop_unlink and hammer2_inode_unlink_finisher(). We cannot safely
reference the inode chain's inode data to get the nlinks count. Instead,
figure it all out on the frontend using the active nlinks in the
hammer2_inode_t structure.
* Fixes hardlink removal and rename issues.
Sepherosa Ziehau [Mon, 27 Jun 2016 14:21:57 +0000 (22:21 +0800)]
ifnet: Add oqdrops statistics
Matthew Dillon [Mon, 27 Jun 2016 08:37:40 +0000 (01:37 -0700)]
hammer2 - Stabilization, fix bulkfree bugs, change 'df' output
* Automatically delete any indirect nodes which become empty. This is done
in the flusher. Verify that a rm -rf cleans everything out.
* Fix three serious bugs in the bulkfree code.
(1) A range-check of cbinfo->sstop was using '>' instead of '>=', causing
a one-element overflow during the scan and potentially corrupting
memory.
(2) The live bitmap pointer must be reloaded after calling
hammer2_chain_modify()! The old pointer points to a buffer
which must remain clean, or worse points to a buffer completely
unrelated to the hammer2 filesystem.
(3) We were zeroing the temporary bmap, but it actually needs to be
initialized properly (particularly its reserved areas). Just
zeroing it led to reserved areas being improperly marked as
available for allocation.
* Validate that the free space counter is recovered properly after a
rm -rf and bulkfree.
* Disable the modify_tid test in the bulkfree code for now and go back to
forcing a flush.
* Change 'df' reporting. I was trying to be fancy by compensating for dedup
to report how big the filesystem would be if nothing were deduped, but it
just caused confusion. We now report an unchanging total volume size and
the actual number of 16KB blocks that are fully free.
* The 'hammer2 freemap' dump now includes all indices, including those
associated with reserved areas.
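The first bulkfree bug above is a classic off-by-one in a range check.
A generic sketch follows; EX_NSLOTS and the function names are
hypothetical and this is not the actual cbinfo->sstop test.

```c
#include <stddef.h>

#define EX_NSLOTS 8	/* hypothetical table size */

/* Buggy check: idx == EX_NSLOTS passes, one element past the end. */
int
ex_slot_ok_buggy(size_t idx)
{
	return !(idx > EX_NSLOTS);
}

/* Fixed check: only indices 0 .. EX_NSLOTS-1 pass. */
int
ex_slot_ok_fixed(size_t idx)
{
	return !(idx >= EX_NSLOTS);
}
```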
Matthew Dillon [Sun, 26 Jun 2016 21:08:21 +0000 (14:08 -0700)]
kernel - Fix panic in error path of nvextendbuf()
* nvextendbuf() was not releasing bp in the error path, leading to
a hanging lock and 'locking against myself' panic later on.
Matthew Dillon [Sun, 26 Jun 2016 05:05:14 +0000 (22:05 -0700)]
hammer2 - Stabilization (data corruption)
* Move the check code errors into hammer2_chain_testcheck() and supply
additional information in the kprintf.
* Reformulate hammer2_io_newq() a bit.
* Fix bugs in the buffer invalidation path. The hammer2_io_newq() path
was improperly setting INVALOK. This path is only used by the freemap
code to pre-validate a buffer to avoid unnecessary reads. Fixed by
not setting INVALOK if IOCB_QUICK is set.
Matthew Dillon [Sun, 26 Jun 2016 04:59:55 +0000 (21:59 -0700)]
hammer2 - Update error message in hammer2_mount
* Update the error message to reflect the current default labels
when the '@LABEL' specification is missing.
Matthew Dillon [Sun, 26 Jun 2016 04:58:29 +0000 (21:58 -0700)]
hammer2 - Enhance freemap output
* Output the base data offset for each freemap line in the freemap
dump.
* Also provide more check data info in the output.
Matthew Dillon [Sun, 26 Jun 2016 03:57:07 +0000 (20:57 -0700)]
nvme - Handle full submission queue
* The submission queue is a ring and can be full even if requests are
available due to out-of-order completion. Update the submission queue's
subq_head from the completion queue status and check for a full condition.
The normal requeue signaling suffices for resume.
* Also note that we allocate maxqe requests, which is actually one more than
we can have on the ring at once. But now that we have the queue-full check,
this becomes a non-issue. Just leave it at maxqe for convenience.
* Tested by temporarily reducing maxqe to 16 and doing stuff to overload it.
Maxqe was returned to 256 for the commit.
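A minimal sketch of a ring with a queue-full check, assuming the usual
convention that one slot always stays empty. The struct and function
names are illustrative and are not the driver's own.

```c
#include <stdint.h>

struct ex_subq {
	uint32_t head;	/* next slot the device will consume */
	uint32_t tail;	/* next slot the driver will fill */
	uint32_t nqe;	/* number of slots in the ring */
};

/* Full when advancing tail would collide with head. */
int
ex_subq_full(const struct ex_subq *q)
{
	return ((q->tail + 1) % q->nqe) == q->head;
}

/* Claim the next slot; returns its index, or -1 if the ring is full. */
int
ex_subq_submit(struct ex_subq *q)
{
	int slot;

	if (ex_subq_full(q))
		return -1;
	slot = (int)q->tail;
	q->tail = (q->tail + 1) % q->nqe;
	return slot;
}

/* A completion reports how far the device has consumed the ring. */
void
ex_subq_update_head(struct ex_subq *q, uint32_t new_head)
{
	q->head = new_head % q->nqe;
}
```

A ring of nqe slots can therefore carry at most nqe - 1 outstanding
entries, which matches the note above about maxqe being one more than
fits on the ring at once.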
Matthew Dillon [Sat, 25 Jun 2016 22:01:44 +0000 (15:01 -0700)]
kernel - Enhance swap allocation failure message
* Output a more appropriate message if the system wants to page to swap
and no swap is configured.
Matthew Dillon [Sat, 25 Jun 2016 21:48:34 +0000 (14:48 -0700)]
kernel - Misc bug fixes and enhancements
* Add atomic_*_64() for 64-bit-explicit calls. This way if a platform
doesn't support 64-bit atomic ops H2 will at least get a compile error.
* Fix bug in sys/mutex2.h. mtx_upgrade_try() was not setting mtx_owner
on success.
* Enhance assertion panic message in lockmgr_kernproc().
Matthew Dillon [Sat, 25 Jun 2016 17:05:24 +0000 (10:05 -0700)]
hammer2 - Stabilization, optimization
* Increase the hammer2_io.refs field to 64 bits so we can add a few more
control bits.
* Track whether invalidation is ok at the DIO level for full-sized (64KB)
data blocks. We continue to use the slightly less-capable CHAIN_DEDUP
flag for smaller data blocks (this flag gets set on frontend->backend
flush whereas the DIO level flag is only cleared when a block is actually
reused for deduplication).
* Separate vfs.hammer2.cluster_enable into cluster_read and cluster_write.
Leave cluster_read enabled with a read-ahead of 4 blocks. Disable
cluster_write for now, but still set B_CLUSTEROK in the bdwrite().
This allows the frontend to 'flush' data to the backend without
initiating disk I/O on the block device, giving us a chance to discard
the data later if it winds up being temporary.
* Remove an improper BUF_KERNPROC(dio->bp) in the case where a different
thread owns the in-progress DIO.
* Defer setting of B_INVAL | B_RELBUF to when the DIO is in lastdrop.
* Add missing brelse() in the hammer2_read_file() error path. Add missing
B_CLUSTEROK in hammer2_write_file().
* The bulkfree code now ensures that the INVALOK bit in any related DIO
for a freed block is cleared, preventing accidental invalidations on
reuse.
Sascha Wildner [Sat, 25 Jun 2016 12:58:48 +0000 (14:58 +0200)]
Stop building/installing groff's soelim(1).
We have a version in usr.bin which we have used for ages, so groff's
version got built/installed just to get overwritten again when
usr.bin was installed afterwards.
François Tigeot [Fri, 24 Jun 2016 14:32:20 +0000 (16:32 +0200)]
drm/linux: Implement some spin_lock_irq* functions
They are not just simple spin_lock/spin_unlock() variants but
disable hardware interrupt processing on the current cpu.
Suggested-by: Matt Macy <mmacy@nextbsd.org>
Tomohiro Kusumi [Fri, 24 Jun 2016 14:07:45 +0000 (23:07 +0900)]
sys/vfs/hammer: Remove DEDUP_CACHE_SIZE and wrong comment
It is a tunable sysctl since
e2ef7a95.
Sepherosa Ziehau [Fri, 24 Jun 2016 03:13:16 +0000 (11:13 +0800)]
nvme: Use high frequency interrupt for CQ processing
Suggested-by: dillon@
Reviewed-by: dillon@
Sepherosa Ziehau [Fri, 24 Jun 2016 02:53:49 +0000 (10:53 +0800)]
intr: Allow drivers to register high frequency interrupt.
Only unshared interrupts will be considered, e.g. MSI, MSI-X or
unshared line interrupt.
Sascha Wildner [Thu, 23 Jun 2016 09:46:10 +0000 (11:46 +0200)]
Fix a couple of logic issues in contributed code (gcc, mpfr, tre).
* The mpfr bug was fixed in mpfr's trunk (r8705).
Report: (https://sympa.inria.fr/sympa/arc/mpfr/2013-11/msg00009.html)
* The GCC bug was fixed too. This commit applies GCC's
51aab39345ae.
* The TRE bug came in with Apple's code. No idea if there is a simple way
to report a bug in their libc to them. I've not found any.
François Tigeot [Thu, 23 Jun 2016 07:30:56 +0000 (09:30 +0200)]
drm/linux: Really implement local_irq_disable/enable
Suggested-by: Matt Macy <mmacy@nextbsd.org>
Sepherosa Ziehau [Wed, 22 Jun 2016 14:40:32 +0000 (22:40 +0800)]
intr: Force unshareable interrupt setting
Imre Vadász [Tue, 21 Jun 2016 19:01:25 +0000 (21:01 +0200)]
kern: Also detect KVM via the Hypervisor vendor ID signature.
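A sketch of a vendor-signature comparison, assuming the register values
come from CPUID leaf 0x40000000 (EBX, ECX, EDX) and that KVM reports
"KVMKVMKVM" padded with NULs. The function name is hypothetical and the
CPUID call itself is left out so the check stays portable.

```c
#include <stdint.h>
#include <string.h>

/*
 * Rebuild the 12-byte signature from the three registers (low byte
 * first) and compare it with the KVM signature.
 */
int
ex_is_kvm_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	static const char kvm_sig[12] = "KVMKVMKVM\0\0";
	uint32_t regs[3];
	char sig[12];
	int i;

	regs[0] = ebx;
	regs[1] = ecx;
	regs[2] = edx;
	for (i = 0; i < 12; i++)
		sig[i] = (char)((regs[i / 4] >> (8 * (i % 4))) & 0xff);
	return memcmp(sig, kvm_sig, sizeof(sig)) == 0;
}
```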
Bill Yuan [Wed, 22 Jun 2016 19:42:20 +0000 (19:42 +0000)]
ipfw3_nat: kmalloc netmsg from M_LWKTMSG
Sascha Wildner [Tue, 21 Jun 2016 19:56:07 +0000 (21:56 +0200)]
poll.2: Adjust NAME section for ppoll().
Sascha Wildner [Tue, 21 Jun 2016 19:48:08 +0000 (21:48 +0200)]
<sys/poll.h>: Some namespace cleanup.
Include <sys/signal.h> and <sys/time.h> only when necessary.
While here, cleanup whitespace a bit.
zrj [Tue, 21 Jun 2016 17:35:14 +0000 (20:35 +0300)]
usr.bin/dirname: Accept multiple arguments as basename(1)
While there, expand examples section.
Bill Yuan [Tue, 21 Jun 2016 18:13:10 +0000 (18:13 +0000)]
ipfw3: lockless in-kernel NAT
libalias is used in kernel space for in-kernel NAT, and its alias_link
entries are stored in a LIST, so every packet that needs NAT scans the
LIST to find the matching alias_link. By separating libalias into
per-CPU contexts, the lock can be removed. Due to the nature of NAT,
outgoing and incoming packets may be handled by different CPUs, so to
ensure the returning packet can be translated properly, a newly created
alias_link must be duplicated and inserted into the contexts of both
CPUs.
e.g.
ipfw3 nat 1 config if em0
ipfw3 nat 1 all via em0
ipfw3 nat 1 show state
Matthew Dillon [Tue, 21 Jun 2016 07:31:53 +0000 (00:31 -0700)]
hammer2 - stabilization pass
* Fix a shared/exclusive deadlock. When adding a ref to a shared lock
that has already been obtained we must make a slightly different call
than the normal one because the normal one will block on a pending
exclusive request, causing a deadlock.
* Add a missing BUF_KERNPROC(). Will hopefully fix a lock ownership
assertion in the kernel that I've been hitting on heavy use.
* Looks like NFS needs to do inode number lookups on softlinks, so
add inode indexing to the softlink (and the mknod) code instead
of embedding the softlink in the directory entry.
Matthew Dillon [Tue, 21 Jun 2016 06:05:35 +0000 (23:05 -0700)]
hammer2 - Update directory mtime
* Update directory mtime on nmkdir, nlink, ncreate, nmknod, nsymlink,
nremove, nrmdir, and nrename.
Matthew Dillon [Tue, 21 Jun 2016 05:25:50 +0000 (22:25 -0700)]
hammer2 - Stabilization pass
* Fix incorrect ip->meta.iparent initializations that were messing up
NFS. These fixes are primarily in the hammer2_inode_create() path.
The 'dip' passed in is not the correct inode to retrieve ip->meta.inum
from for the new inode's iparent. Pass a second inode indicating the
proper parent directory linkage for iparent.
* Remove ip->pip entirely. Since the actual file/directory inodes are
no longer hierarchical this field only creates confusion. The two
places where we really need it can simply use ip->meta.iparent.
Also clean-out a considerable amount of code that previously dealt with
ip->pip linkages and adjustments.
* Do not try to drop data on a 1->0 transition failure, this can race
increments and cause the data to be improperly dropped.
* Do not try to drop data on lockcnt == 0 unless persist_refs is also 0.
Fixes several SMP races where chain->data was being lost improperly.
* Cleanup the APIs for recent changes in how inodes work.
* Now passes buildworld test with /usr/src and /usr/obj mounted with NFS
from a hammer2 volume.
Matthew Dillon [Tue, 21 Jun 2016 05:06:53 +0000 (22:06 -0700)]
kernel - Enhance debug.ncvp_debug debugging
* Enhance a debug kprintf for debugging a specific server-side situation
with NFS.
Matthew Dillon [Tue, 21 Jun 2016 05:01:58 +0000 (22:01 -0700)]
mountd - Turn on SO_REUSEADDR
* Turn on SO_REUSEADDR because it's kinda silly to not have it on.
* Fixes startup errors if mountd is restarted, or initially fails
due to /etc/exports issues and is then restarted a little later.
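For reference, a minimal userland sketch of enabling SO_REUSEADDR before
bind(). This is not the mountd code; the function name is made up and a
TCP socket is assumed for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket with SO_REUSEADDR set, so a restarted daemon can
 * rebind its address without waiting for old connections to drain.
 * Returns the descriptor, or -1 on failure.
 */
int
ex_reuse_socket(void)
{
	int on = 1;
	int fd;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```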
Imre Vadász [Mon, 20 Jun 2016 19:30:17 +0000 (21:30 +0200)]
kernel - Fix typo in ppoll entry in sys/kern/syscalls.
Sepherosa Ziehau [Mon, 20 Jun 2016 14:23:10 +0000 (22:23 +0800)]
intr: Avoid implicit padding
Matthew Dillon [Sun, 19 Jun 2016 23:56:03 +0000 (16:56 -0700)]
procfs - Try to workaround issue to fix truss
* If PIOCWAIT is called and no process stops are present, silently return
0 rather than EINVAL.
* Appears to fix the truss issue.
Reported-by: tkusumi
Matthew Dillon [Sun, 19 Jun 2016 23:31:25 +0000 (16:31 -0700)]
kernel - Implement PIE (position independent executables)
* Implement PIE placement and sysctl. Currently disabled by default.
If the sysctl kern.elf64.pie_base_mmap is set to 1, executable code
will be mapped with a random shift.
* Also support fixed addresses if requested in the ELF header.
Submitted-by: shamaz
Testing-by: shamaz, dillon, with help from marino
Matthew Dillon [Sun, 19 Jun 2016 23:25:16 +0000 (16:25 -0700)]
hammer2 - Implement NFS export support
* Allow a hammer2 mount to be exported. Implement required functionality:
The export structure, fhtovp, vptofh, checkexp, and hammer2_vfs_vget.
Uses the recent inode indexing changes.
* Also had to write some code to reconstruct the ip->pip linkages using
iparent.
Note that if possible I would like to remove the ip->pip stuff now that
we have iparent, at least for files. It won't work for hardlinks so...
but it's still in as of this writing.
Matthew Dillon [Sun, 19 Jun 2016 19:24:32 +0000 (12:24 -0700)]
hammer2 - Implement hammer2_inode_meta.iparent
* Implement the iparent field, which points to the inode number of
the parent directory. Remove the comment as this will be used
by NFS (at least for directory inodes).
Matthew Dillon [Sun, 19 Jun 2016 18:59:30 +0000 (11:59 -0700)]
hammer2 - Change XOP feed/collect locking
* Change the way the backend passes chains back to the frontend. Instead
of requiring that the chain maintain a shared lock and bumping the lock
count we use the new data retention feature to pass the chain back
unlocked.
This fixes a whole slew of deadlock issues related to multi-node
synchronization. Concurrent XOPs could previously obtain and hold
locks on chains related to different nodes in any order, depending on
when their related threads were scheduled. Now that we no longer hold
a lock in the XOP feed, the potential deadlocks should not be possible.
* Add new hammer2_chain_*() API functions for manipulating the data hold
count and remove shims that were previously used to deal with the shared
lock hacks.
* Also fix a chain->flags adjustment that wasn't atomic, fixing a tsleep()
loop that was not breaking out.
John Marino [Sun, 19 Jun 2016 07:53:12 +0000 (09:53 +0200)]
libc/_collate_lookup: Fix segfault seen on ISO-8859-5 locales
The fix for the Russian collation issue seemed to have a bug in it.
Segfault was discovered by Lauri Tirkkonen of Illumos and confirmed by
Bapt@FreeBSD.org. Lauri suggested this as a fix, but as of this writing
Bapt hasn't confirmed that this is the final solution for FreeBSD. Due to
personal reasons, I cannot wait for this confirmation longer. I believe
this fix is better than what is currently in place, even if it is not
the final solution.
John Marino [Sun, 19 Jun 2016 07:39:34 +0000 (09:39 +0200)]
mbsnrtowcs/wcsnrtombs: Fix EILSEQ handling
Originally reported on FreeBSD (PR 209907) by Roel Standaert, RockinRoel
noticed that DragonFly suffered the same bug. When the title functions
encounter a character that cannot be converted, they should change the
src pointer to point to the character positioned immediately after the
failed character, but no such change was performed.
YellowRabbit improved on the FreeBSD patch addressing the bug with a
new version that eliminates possible NULL pointer dereferences.
Dragonfly-bug: <https://bugs.dragonflybsd.org/issues/2920>
Imre Vadasz [Sat, 12 Dec 2015 20:26:09 +0000 (21:26 +0100)]
kernel - Implement ppoll system call with precise microseconds timeout.
* Implement a maximum timeout of 2000s, because systimer(9) just accepts an
int timeout in microseconds.
* Add kern.kv_sleep_threshold sysctl variable for tuning the threshold for
the ppoll sleep duration (in nanoseconds), below which we will
busy-loop with DELAY instead of using tsleep for waiting.
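A sketch of the arithmetic behind the 2000s cap: INT_MAX microseconds is
roughly 2147 seconds, so a timespec has to be clamped before it fits in
an int. The helper name and the round-up of nanoseconds are assumptions
for the illustration, not the kernel's actual conversion.

```c
#include <time.h>

#define EX_MAX_TIMEOUT_SEC 2000	/* cap from the commit message */

/*
 * Convert a non-negative timespec to microseconds, clamped to
 * EX_MAX_TIMEOUT_SEC so that the result always fits in an int.
 * Nanoseconds are rounded up to the next microsecond (an assumption).
 */
int
ex_timeout_usec(const struct timespec *ts)
{
	long long usec;

	if (ts->tv_sec >= EX_MAX_TIMEOUT_SEC)
		return EX_MAX_TIMEOUT_SEC * 1000000;
	usec = (long long)ts->tv_sec * 1000000LL + (ts->tv_nsec + 999) / 1000;
	return (int)usec;
}
```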
Matthew Dillon [Sun, 19 Jun 2016 05:01:00 +0000 (22:01 -0700)]
hammer2 - Start work on inode indexing - MAJOR CHANGE
A major change to how inodes work is required to allow NFS exports to be
supported and also to make mirroring operations optimal. Both needs are
met by indexing most inodes so they can be looked up by inode number. I've
tried to avoid having to do this for well over 2 years now but I finally
came to the conclusion that necessary features and efficiencies are
impossible without it.
+ For NFS exports we have to be able to lookup a file by inode number in
order to be able to translate NFS file handles.
+ Mirroring, Multi-Master, and other Multi-Node operations are extremely
inefficient when a large file or (even worse) some high-level directory
is renamed, because the synchronization code cannot be made aware of the
rename. The synchronizer winds up making a copy of the file or the
directory subhierarchy and that can be disaster if it winds up being
terrabytes.
To solve these problems we treat nearly ALL directory entries as hardlink
targets and place the hardlink target at the root of the PFS. This puts
nearly all inodes in a readily accessible place indexed by inode number.
This means we can now implement inode number lookups for NFS, and it also
means that the synchronizer does not have to copy anything when a file or
directory is renamed.
* Implement these changes by using an abbreviated version of the hardlink
target code that we already have. Get rid of the common-parent-directory
code (we will use the PFS iroot), and force almost everything to be a
hardlink.
* device nodes and softlinks are excepted. These work as they did before,
created in the directory entry itself and not as a hardlinked inode.
I might have to make these hardlink targets in the future too but at
the moment it looks avoidable.
* Unfortunately, this change creates a number of REGRESSIONS:
(1) There are now two inodes per file instead of one. The real inode
indexed in the PFS root, and the hardlink pointer in the directory
entry. This eats another 1KB.
(2) We lose a bunch of sequential layout optimizations (for the moment),
particularly when stat()ing directory entries.
(3) Inode creation and deletion ops will unfortunately cause more SMP
conflicts due to all the inodes being indexed in one place.
Matthew Dillon [Sat, 18 Jun 2016 18:17:24 +0000 (11:17 -0700)]
nvme - Remove debugging
* Remove some debugging output.
Matthew Dillon [Sat, 18 Jun 2016 17:27:43 +0000 (10:27 -0700)]
nvme - Fix composite temperature in nvmectl
* Fix composite temperature reporting in nvmectl. The two bytes make up
a single 16-bit word, not two separate byte fields. The confusion
stemmed from the fact that the 16-bit word is not word-aligned (possibly
the only field in the entire spec that isn't aligned!).
* Add 'errors' directive to dump error logs.
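A sketch of reading an unaligned 16-bit little-endian field from a raw
log page buffer. It assumes the composite temperature sits at byte
offset 1 of the SMART log and is reported in Kelvin; the helper names
are hypothetical and this is not the nvmectl code.

```c
#include <stdint.h>

/* Assemble a little-endian 16-bit value byte by byte, so the field may
 * start at any offset without an unaligned load. */
uint16_t
ex_le16_at(const uint8_t *buf, unsigned off)
{
	return (uint16_t)(buf[off] | ((uint16_t)buf[off + 1] << 8));
}

/* Convert whole Kelvin to whole degrees Celsius. */
int
ex_kelvin_to_celsius(uint16_t kelvin)
{
	return (int)kelvin - 273;
}
```

Treating the two bytes as separate fields would report 0x37 and 0x01
for a buffer holding 0x0137, instead of 311 K.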
Imre Vadász [Sat, 18 Jun 2016 13:38:19 +0000 (15:38 +0200)]
sleep.9: Make tsleep_interlock(9) example a bit more correct.
* After having called tsleep_interlock, we should still pass the flags
into the tsleep call as well (i.e. using "flags | PINTERLOCKED" for the
flags parameter, instead of just PINTERLOCKED).
Sascha Wildner [Sat, 18 Jun 2016 13:07:42 +0000 (15:07 +0200)]
Sync zoneinfo database with tzdata2016e from ftp://ftp.iana.org/tz/releases
* Africa/Cairo observes DST in 2016 from July 7 to the end of October.
Guess October 27 and 24:00 transitions. (Thanks to Steffen Thorsen.)
For future years, guess April's last Thursday to October's last
Thursday except for Ramadan.
* Locations while uninhabited now use '-00', not 'zzz', as a
placeholder time zone abbreviation. This is inspired by Internet
RFC 3339 and is more consistent with numeric time zone
abbreviations already used elsewhere. The change affects several
arctic and antarctic locations, e.g., America/Cambridge_Bay before
1920 and Antarctica/Troll before 2005.
* Asia/Baku's 1992-09-27 transition from +04 (DST) to +04 (non-DST) was
at 03:00, not 23:00 the previous day. (Thanks to Michael Deckers.)
Matthew Dillon [Sat, 18 Jun 2016 06:58:49 +0000 (23:58 -0700)]
nvme - Add nvmectl userland utility
* Add nvmectl, a general userland utility that we will use to retrieve
status and do interesting things to nvme devices.
Nominally feature the command to allow nvmeX devices to be specified
at the end, and to apply the command to all nvme devices if none are
specified.
* Implement 'nvmectl info'. This command retrieves the SMART info from
specified controllers and prints it out.
Matthew Dillon [Sat, 18 Jun 2016 06:57:25 +0000 (23:57 -0700)]
nvme - Implement ioctl support to retrieve log pages
* Implement general ioctl support
* Implement NVMEIOCGETLOG which retrieves a log page.
Matthew Dillon [Sat, 18 Jun 2016 02:55:15 +0000 (19:55 -0700)]
nvme - Fail gracefully if chip cannot be enabled
* Fail gracefully rather than lockup if the chip refuses to enable.
The admin thread is not running yet, so don't wait forever for
it to 'stop'.
Matthew Dillon [Fri, 17 Jun 2016 23:59:28 +0000 (16:59 -0700)]
nvme - Work w/qemu
* Work with qemu nvme emulation. Note that qemu nvme emulation is really
slow, much much slower than its scsi (aka ahci) emulation.
A pci_enable_busmaster() was needed.
* Remove manual PCI config to [re]-enable BIOS PCI interrupt. This was
pulled from another driver and probably is not needed.
* Fix kldunload()ing when bar4/5 is present. The wrong resource pointer
was being specified.
Sascha Wildner [Fri, 17 Jun 2016 15:54:31 +0000 (17:54 +0200)]
pathchk(1): Sync with FreeBSD.
Mainly for -P (POSIX wants -P).
Plus a couple of small bug fixes.
Sascha Wildner [Wed, 15 Jun 2016 17:01:29 +0000 (19:01 +0200)]
kernel/virtio: Some small stylistic cleanup.
Sascha Wildner [Thu, 16 Jun 2016 12:12:46 +0000 (14:12 +0200)]
ps(1): Add -A option, as specified by POSIX.
Still missing: -d, -G and -n.
Sepherosa Ziehau [Thu, 16 Jun 2016 06:16:04 +0000 (14:16 +0800)]
hyperv/vmbus: Factor out vmbus_msg_reset()
Sepherosa Ziehau [Thu, 16 Jun 2016 05:49:10 +0000 (13:49 +0800)]
hyperv/vmbus: Make sure that interrupt cputimer can be enabled.
Obtained-from: FreeBSD
Sepherosa Ziehau [Wed, 15 Jun 2016 11:19:47 +0000 (19:19 +0800)]
acpica: Interrupt resource lookup failure is fine. Add comment about it.
Sepherosa Ziehau [Wed, 15 Jun 2016 11:06:09 +0000 (19:06 +0800)]
mptable: Reduce log verbosity
Sepherosa Ziehau [Mon, 13 Jun 2016 01:58:01 +0000 (09:58 +0800)]
hyperv/vmbus: Complete vmbus initialization; interrupt cputimer is enabled
Most of the bits are obtained from FreeBSD. However, the interrupt bits
are reworked:
- Since the vmbus message/event interrupt works in the same fashion as
MSI-X, we just allocate MSI-X for them, instead of allocating IDT
vector, rolling vmbus own interrupt vector and turning the interrupt
handling inside-out. The standard and generic bus APIs are used to
allocate and setup per-cpu vmbus interrupt.
- Interrupt cputimer reuses the current per-cpu interrupt timer code.
- AutoEOI is not used, since we reuse the per-cpu interrupt timer IDT
vector and MSI IDT vector. After a brief discussion w/ Dexuan Cui,
I concluded that AutoEOI probably does not provide noticeable performance
improvement but will introduce extra code complexity. We leave it off
for now.
Obtained-from: FreeBSD (mostly)
Sepherosa Ziehau [Tue, 14 Jun 2016 09:00:01 +0000 (17:00 +0800)]
cputimer: Add per-cpu handler and private data for interrupt cputimer.
Sascha Wildner [Wed, 15 Jun 2016 09:47:42 +0000 (11:47 +0200)]
Update the pciconf(8) database.
June 7, 2016 snapshot from http://pciids.sourceforge.net/
Sascha Wildner [Wed, 15 Jun 2016 09:45:34 +0000 (11:45 +0200)]
vmbus.4: Fix stupid typo (and installworld).
Sascha Wildner [Tue, 14 Jun 2016 18:13:42 +0000 (20:13 +0200)]
Add a vmbus(4) manual page (based on FreeBSD's).
Sascha Wildner [Tue, 14 Jun 2016 11:51:52 +0000 (13:51 +0200)]
kernel: Add vmbus module to the build.
Imre Vadász [Sun, 12 Jun 2016 13:02:57 +0000 (15:02 +0200)]
if_iwm - Fix channel list iteration in iwm_mvm_config_umac_scan().
Matthew Dillon [Mon, 13 Jun 2016 07:32:32 +0000 (00:32 -0700)]
docs - Update tuning.7
* Do a pass on tuning.7, getting rid of old cruft and putting in newer
information.
Sepherosa Ziehau [Mon, 13 Jun 2016 02:22:13 +0000 (10:22 +0800)]
hyperv: Initial import. It only contains non-intr cputimer.
Obtained-from: FreeBSD
Sepherosa Ziehau [Sun, 12 Jun 2016 23:53:38 +0000 (07:53 +0800)]
x86_64/timer: Xtimer is generic enough for per-cpu timer.
Sepherosa Ziehau [Sun, 12 Jun 2016 17:29:07 +0000 (01:29 +0800)]
tsc: Log the final TSC frequency
Sepherosa Ziehau [Sun, 12 Jun 2016 13:22:07 +0000 (21:22 +0800)]
kern: Update virtual machine detection a bit
Obtained-from: FreeBSD (partial)
Antonio Huete Jimenez [Sat, 11 Jun 2016 22:11:23 +0000 (15:11 -0700)]
Makefile.usr - Fix typo
Matthew Dillon [Sat, 11 Jun 2016 22:12:51 +0000 (15:12 -0700)]
hammer2 - Use B_IOISSUED
* Get rid of the hokey B_IODEBUG use case H2 had before.
* Integrate B_IOISSUED (which used to be B_IODEBUG) into hammer2 properly.
* Remove the dio crc_good_mask hack. Now that hammer2_chain is more
persistent we just use the flag in the cached hammer2_chain structure,
clearing it if we determine that the kernel had to re-issue read I/O
(at least in the full-block case).
* Has approximately the same performance as the dio crc_good_mask hack had
and is a bit safer w/regards to chain aliasing to the same physical block.
Matthew Dillon [Sat, 11 Jun 2016 22:10:22 +0000 (15:10 -0700)]
kernel - B_IODEBUG -> B_IOISSUED
* Rename this flag. It still operates the same way.
This flag is set by the kernel upon an actual I/O read into a buffer
cache buffer and may be cleared by the filesystem code to allow the
filesystem code to detect when re-reads of the block cause another I/O
or not. This allows HAMMER1 and HAMMER2 to avoid calculating the check
code over and over again if it has already been calculated.
Antonio Huete Jimenez [Sat, 11 Jun 2016 21:47:47 +0000 (14:47 -0700)]
Makefile.usr - A bit of cleanup
- Use targets instead of .if in a few checks.
- Exit on error for better scripting
Imre Vadász [Sat, 11 Jun 2016 15:54:44 +0000 (17:54 +0200)]
if_iwm - Add and use iwm_phy_db_free(), to plug phy_db memory leak.
* Memory leakage in M_DEVBUF is now at ca. 2KB for each iwm(4) module
load/unload cycle.
Taken-From: Linux iwlwifi