Sascha Wildner [Sat, 23 Jul 2022 00:54:35 +0000 (02:54 +0200)]
Update the pciconf(8) database.
July 17, 2022 snapshot from https://pci-ids.ucw.cz
Matthew Dillon [Wed, 13 Jul 2022 16:07:12 +0000 (09:07 -0700)]
libc - Correct handling of non-hex sequences in strtol*() and related
* The standard clarifies that non-hex sequences such as "0xy"
should return 0 and a pointer to "xy". Ours was returning
0 and a pointer to "y". Fixed.
* Unlikely to be any net effect on code other than improved
standards conformity.
Reported-by: Herbert J. Skuhra
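The behavior described above can be checked with a small sketch. The helper name parse_prefix() is hypothetical and not part of libc; it only wraps strtol() and reports how many characters were consumed, which should be 1 (just the "0") for an input such as "0xy" on a conforming libc.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse s with strtol() in the given base and report how many
 * characters were consumed. Hypothetical helper for illustration.
 */
long
parse_prefix(const char *s, int base, size_t *consumed)
{
	char *end;
	long val;

	val = strtol(s, &end, base);
	*consumed = (size_t)(end - s);
	return val;
}
```

With "0xy" in base 16 a conforming implementation yields value 0 and leaves the end pointer at "xy".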
Matthew Dillon [Mon, 11 Jul 2022 03:40:46 +0000 (20:40 -0700)]
fetch - Fix -T timeout operation for additional cases
* The -T timeout flag does not always timeout the program. The
implementation only tested it during the initial connection and
header fetch, not during data transfers.
* Make the -T timeout apply to data transfers. However, any progress
on the transfer resets the timer.
* Volatilize several variables that need it, and add an interlock
to deal with alarm() races. Also, when the SIGALRM occurs, if
still enabled it will be re-armed and fire off every second thereafter
until processed, since the alarm might not catch a blocked system call
in progress.
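A minimal sketch of the "progress resets the timer" rule, assuming explicit timestamps instead of the alarm()/SIGALRM machinery that fetch actually uses. The struct and function names are hypothetical and only model the rule, not the interlock or the re-arming behavior.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stall timer: expires only after `limit` idle seconds. */
struct stall_timer {
	time_t	last_progress;	/* time of the most recent progress */
	time_t	limit;		/* allowed idle seconds, 0 disables */
};

void
stall_timer_init(struct stall_timer *st, time_t now, time_t limit)
{
	st->last_progress = now;
	st->limit = limit;
}

/* Call whenever bytes arrive; this restarts the idle window. */
void
stall_timer_progress(struct stall_timer *st, time_t now)
{
	st->last_progress = now;
}

/* True once no progress has been seen for at least `limit` seconds. */
bool
stall_timer_expired(const struct stall_timer *st, time_t now)
{
	return st->limit > 0 && now - st->last_progress >= st->limit;
}
```

A slow but steady transfer never expires under this rule, while a stalled one does after `limit` seconds.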
Tomohiro Kusumi [Sun, 10 Jul 2022 08:38:07 +0000 (01:38 -0700)]
sys/vfs/hammer2: Remove unused local pmp variable
Warned on other platform.
error: variable 'pmp' set but not used [-Werror,-Wunused-but-set-variable]
Sergey Zigachev [Fri, 8 Jul 2022 17:31:27 +0000 (22:31 +0500)]
drm: retry page fault handler on buffer data in transit
Fixes an Xorg crash on monitor connect/disconnect when using amdgpu with the
modesetting driver. The crash occurred because the buffer object was in a
transit state. Added a retry loop of up to 100 iterations that allows the
buffer object to "catch up". During testing, around 30-40 iterations were observed.
Co-authored-by: Matthew Dillon <dillon@apollo.backplane.com>
Sergey Zigachev [Fri, 8 Jul 2022 17:19:51 +0000 (22:19 +0500)]
Test commit
Tomohiro Kusumi [Wed, 6 Jul 2022 19:43:08 +0000 (12:43 -0700)]
usr.sbin/makefs: Cast daddr_t to off_t before multiplication
Apparently some large-file systems out there, such as my powerpc64le
Linux box, define daddr_t as a 32-bit type, which is sad and stymies
cross-building disk images. Cast daddr_t to off_t before doing
arithmetic that overflows.
taken-from FreeBSD
7ef082733bf8989797b71025ba6d597a7d17d92b
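The overflow can be illustrated with a short sketch, assuming a platform where the block number type is 32 bits wide. The typedefs and function names are hypothetical stand-ins, not the makefs(8) code, and unsigned types are used so that the wraparound is well defined in C.

```c
#include <stdint.h>

/* Stand-ins for a platform with a 32-bit block number type (assumption). */
typedef uint32_t narrow_daddr_t;
typedef int64_t  wide_off_t;

/* Multiplies in 32 bits, so the product wraps before it is widened. */
wide_off_t
offset_wrapping(narrow_daddr_t blkno, uint32_t bsize)
{
	return (wide_off_t)(blkno * bsize);
}

/* Widens first, so the multiplication is done in 64 bits. */
wide_off_t
offset_widened(narrow_daddr_t blkno, uint32_t bsize)
{
	return (wide_off_t)blkno * bsize;
}
```

For block 1048576 with a 4096-byte block size, the first form wraps to 0 while the second yields the intended 4 GiB offset.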
Tomohiro Kusumi [Wed, 6 Jul 2022 05:31:01 +0000 (22:31 -0700)]
usr.sbin/makefs: Allocate extra inodes in makefs when leaving free space in UFS images
By default, makefs(8) has very few spare inodes in its output images,
which is fine for static filesystems, but not so great for VM images
where many more files will be added. Make makefs(8) use the same
default settings as newfs(8) when creating images with free space --
there isn't much point to leaving free space on the image if you
can't put files there. If no free space is requested, use current
behavior of a minimal number of available inodes.
taken-from FreeBSD
afb6a168f8ee08ac74769464726c396fbef83d0b
Tomohiro Kusumi [Wed, 6 Jul 2022 04:54:34 +0000 (21:54 -0700)]
usr.sbin/makefs: Fix calculation of file sizes
When a new FS image is created we need to calculate how much space each
file is going to consume.
Fix two bugs in that logic:
1) Count the space needed for indirect blocks for large files.
2) Normally the trailing data of a file is written to a block of frag
size, 4 kB by default.
However for files that use indirect blocks a full block is allocated,
32kB by default. Take that into account.
Adjust size calculations to match what is done in ffs_mkfs routine:
* Depending on the UFS version the superblock is stored at a different
offset. Take that into account.
* Add the cylinder group block size.
* All of the above has to be aligned to the block size.
Finally, remove "ncg" variable. It's always 1 and it was used to
multiply stuff.
taken-from FreeBSD
ecdc04d006de93eb343ce3b77208abd937d4f8ac
Matthew Dillon [Mon, 4 Jul 2022 07:40:55 +0000 (00:40 -0700)]
debug - more ncptrace enhancements
* Accumulate counts for unres, leafs, fache, and negative entries,
and print at the end.
Matthew Dillon [Mon, 4 Jul 2022 07:39:32 +0000 (00:39 -0700)]
kernel - Attempt to fix broken vfs.cache.numunres tracker (2)
* The main culprit appears to be cache_allocroot() accounting
for new root ncps differently than the rest of the module.
So anything which mounts and umounts continuously, like
dsynth, can seriously make the numbers whacky.
* Fix that and run an overnight test.
Matthew Dillon [Mon, 4 Jul 2022 05:28:00 +0000 (22:28 -0700)]
debug - add nc_generation output from ncptrace
* Adjust ncptrace to print the value of nc_generation
Matthew Dillon [Mon, 4 Jul 2022 03:54:42 +0000 (20:54 -0700)]
sh - Support writes to non-blocking descriptors
* Instead of reporting "write error on stdout", support writes
to non-blocking sockets by having xwrite() use poll() to block
when EAGAIN is returned.
* This is possibly related to such errors appearing in the dsynth
logs. Presumably (unverified), /bin/sh can wind up being executed
with descriptor 1 set to non-blocking. This works fine only as long
as the other end of the pipe is able to drain it quickly enough.
But under heavy loads, this might not happen.
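A rough sketch of the approach, not the actual sh xwrite() code: a write loop that falls back to poll() when a non-blocking descriptor returns EAGAIN. The function name write_all_polling() is hypothetical; write(), poll(), and the errno values are standard POSIX.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Write all of buf to fd. If fd is non-blocking and the write would
 * block, wait with poll() until it is writable and try again.
 * Returns len on success, -1 on error. Hypothetical helper name.
 */
ssize_t
write_all_polling(int fd, const char *buf, size_t len)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n >= 0) {
			done += (size_t)n;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			struct pollfd pfd = { .fd = fd, .events = POLLOUT };

			/* Block until writable; an interrupted poll() is retried. */
			if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
				return -1;
			continue;
		}
		return -1;
	}
	return (ssize_t)len;
}
```

The caller sees blocking semantics either way, so a slow reader on the other end of the pipe delays the writer instead of producing a write error.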
Matthew Dillon [Mon, 4 Jul 2022 03:47:32 +0000 (20:47 -0700)]
kernel - check nc_generation in nlookup path
* With nc_generation now operating in a more usable manner, we can
use it in nlookup() to check for changes. When a change is detected,
the related lock will be cycled and the entire nlookup() will retry up
to debug.nlookup_max_retries, which currently defaults to 4.
* Add debugging via debug.nlookup_debug. Set to 3 for nc_generation
debugging.
* Move "Parent directory lost" kprintfs into a debugging conditional,
reported via (debug.nlookup_debug & 4).
* This fixes lookup/remove races which could sometimes cause open()
and other system calls to return EINVAL or ENOTCONN. Basically
what happened was that nlookup() wound up on a NCF_DESTROYED entry.
* A few minutes worth of a dsynth bulk does not report any random
generation number mismatches or retries, so the code in this commit
is probably very close to correct.
Matthew Dillon [Mon, 4 Jul 2022 00:10:51 +0000 (17:10 -0700)]
kernel - Change ncp->nc_generation operation
* Change nc_generation operation. Bit 0 is reserved. The field is
incremented by 2 whenever major changes are being made to the ncp
(linking, unlinking, destruction, resolve, unresolve, vnode adjustment),
and then incremented by 2 again when the operation is complete.
The caller can test for a major gen change using:
curr_gen = ncp->nc_generation & ~3;
if ((orig_gen - curr_gen) & ~1)
(retry needed)
* Allows unlocked/relocked code to determine whether the ncp has possibly
changed or not (will be used in upcoming commits).
* Adjust the kern_rename() code to use the generation numbers.
* Bit 0 will be used to check for a combination of major changes and
lock cycling in the future.
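The generation test quoted above can be sketched in isolation. This is only an illustration under stated assumptions: struct gen_entry and the helper names are hypothetical, the field is a plain 32-bit counter, and the atomic operations and locking that the kernel would need are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the generation field of a namecache entry. */
struct gen_entry {
	uint32_t generation;	/* bit 0 reserved, as in the commit text */
};

/* Snapshot taken before the lock is dropped. */
uint32_t
gen_snapshot(const struct gen_entry *e)
{
	return e->generation & ~3U;
}

/* A major change bumps the counter by 2 when it starts ... */
void
gen_change_begin(struct gen_entry *e)
{
	e->generation += 2;
}

/* ... and by 2 again when it completes. */
void
gen_change_end(struct gen_entry *e)
{
	e->generation += 2;
}

/* Same comparison as quoted in the commit message. */
bool
gen_retry_needed(const struct gen_entry *e, uint32_t orig_gen)
{
	uint32_t curr_gen = e->generation & ~3U;

	return ((orig_gen - curr_gen) & ~1U) != 0;
}
```

A caller would take a snapshot, drop and reacquire the lock, and retry its operation if gen_retry_needed() reports a completed major change in between.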
Matthew Dillon [Sun, 3 Jul 2022 23:31:00 +0000 (16:31 -0700)]
kernel - Attempt to fix broken vfs.cache.numunres tracker
* Try to fix a mis-count that can accumulate under heavy loads.
* In cache_setvp() and cache_setunresolved(), only adjust the
unres count for namecache entries that are linked into the
topology.
Matthew Dillon [Fri, 1 Jul 2022 02:41:15 +0000 (19:41 -0700)]
manual - Silently accept any value for IPV6_V6ONLY sockopt
* Update the ip6(4) manual page with the new behavior of IPV6_V6ONLY.
Matthew Dillon [Fri, 1 Jul 2022 02:11:11 +0000 (19:11 -0700)]
kernel - Silently accept any value for IPV6_V6ONLY sockopt
* Numerous utilities including named behave horribly if the IPV6_V6ONLY
socketopt returns a failure. We were returning failure if anyone
attempted to set it to 0.
* Just silently ignore the value entirely, always return success. Fixes
named, libuv (which named uses), and numerous other ports that use
libuv or otherwise stupidly mess with this socketopt.
Reported-by: tuxillo
Sascha Wildner [Sat, 18 Jun 2022 19:36:22 +0000 (21:36 +0200)]
Add a small amdgpu.4 manual page.
Submitted-by: Sergey Zigachev <s.zi@outlook.com>
Sascha Wildner [Sat, 11 Jun 2022 07:59:09 +0000 (09:59 +0200)]
<stdlib.h>: Don't expose malloc_usable_size() in POSIX environments.
Sascha Wildner [Sat, 11 Jun 2022 07:58:32 +0000 (09:58 +0200)]
Unbreak buildworld.
csh uses its own malloc_usable_size(), which, now that its prototype comes
in with <stdlib.h>, has to take a const argument, too. :-(
Matthew Dillon [Sat, 11 Jun 2022 04:38:37 +0000 (21:38 -0700)]
kernel - namecache eviction fixes
* Fix several namecache eviction issues which interfere with nlookup*()
functions.
There is an optimization where nlookup*() avoids locking intermediate
ncp's in a path whenever possible on the assumption that the ref on
the ncp will prevent eviction. This assumption fails when the machine
is under a heavy namecache load.
Errors included spurious ENOTCONN and EINVAL error codes from file
operations.
* Refactor the namecache code to not evict resolved namecache entries
which have extra refs under normal operation. This allows nlookup*()
and other functions to operate semi-lockless for intermediate elements
in a path. However, they still obtain a ref which is a cache-unfriendly
atomic operation.
This fixes numerous weird errors that occur during heavy dsynth bulk
builds.
* Also fix a bug which evicted too many resolved namecache entries when
attempting to evict unresolved entries. This should improve performance
under heavy namecache loads a bit.
Matthew Dillon [Sat, 11 Jun 2022 04:34:27 +0000 (21:34 -0700)]
world - include malloc_np.h from stdlib.h
* Include malloc_np.h from stdlib.h as per FreeBSD manual page.
Note that linux includes it from <malloc.h>, but BSD's do not
have a malloc.h so this is the easier path for now.
Matthew Dillon [Thu, 9 Jun 2022 06:25:13 +0000 (23:25 -0700)]
libc - Fix bug in recent malloc_usable_size() support
* Add missing unlock in the bigalloc check path
* Fixes miniruby deadlock and other threaded uses of malloc_usable_size()
on large memory blocks.
Sascha Wildner [Wed, 8 Jun 2022 15:06:49 +0000 (17:06 +0200)]
malloc.3: Unify RETURN VALUES. Add info about malloc_usable_size().
Sascha Wildner [Wed, 8 Jun 2022 14:41:43 +0000 (16:41 +0200)]
kernel: Add some __printflike() to satisfy -Wsuggest-attribute=format.
Sascha Wildner [Mon, 6 Jun 2022 19:20:58 +0000 (21:20 +0200)]
newfs_msdos(8): Remove a duplicate fstat() check.
An identical check is done right after it in compute_geometry_from_file().
Spotted-by: n00b659
Tomohiro Kusumi [Mon, 6 Jun 2022 10:13:33 +0000 (19:13 +0900)]
usr.sbin/makefs: Rename vnode::logical,vflushed,malloced to start with v_
Almost all vnode fields in sys/sys/vnode.h start with v_,
so follow that naming rule in makefs(8) vnode as well.
These fields are currently only used by HAMMER2.
Antonio Huete Jimenez [Sun, 5 Jun 2022 22:34:05 +0000 (00:34 +0200)]
libc: Add malloc_usable_size(3) support.
Submitted-by: @dillon
Tomohiro Kusumi [Sun, 5 Jun 2022 04:56:58 +0000 (13:56 +0900)]
usr.sbin/makefs: Fix warnings (in FreeBSD)
taken from FreeBSD
cc1a53bc1aea0675d64e9547cdca241612906592
Tomohiro Kusumi [Sat, 4 Jun 2022 18:44:48 +0000 (03:44 +0900)]
Tomohiro Kusumi [Sat, 4 Jun 2022 11:54:35 +0000 (20:54 +0900)]
usr.sbin/makefs: Add HAMMER2 support
This commit adds HAMMER2 image creation support for makefs(8).
It runs newfs_hammer2(8) and then sys/vfs/hammer2 logic in userspace
to create HAMMER2 image from a given directory.
This commit splits newfs_hammer2(8) into newfs and mkfs part similarly
to newfs_msdos(8), so that makefs(8) can use newfs functionality.
The entire sys/vfs/hammer2 (with exception of unneeded
hammer2_{bulkfree,ccms,iocom,ioctl,msgops,synchro}.[hc] and reusable
hammer2_disk.h) is copied to usr.sbin/makefs with below modification.
It intends to have minimum amount of diff against sys/vfs/hammer2.
* Header includes are modified so that it compiles in userspace.
* VFS and other kernel functions are usually implemented as simple
stub functions in hammer2_compat.h and hammer2_buf.c, but some are
commented out.
* Kernel functions such as kprintf, kmalloc, kstrdup, etc
are implemented using corresponding libc functions.
* Lock primitives are basically NOP, and they (should) never block
as makefs(8) is a single thread program.
* struct vnode and struct buf (the ones defined locally in makefs(8),
not sys/sys/*) have new struct members only used by HAMMER2 to
emulate VFS behavior required by HAMMER2.
* Since makefs(8) is write-only, VOP_{NRESOLVE,NCREATE,NMKDIR,NLINK,
NSYMLINK,WRITE,STRATEGY} are implemented, but other VOPs just
return EOPNOTSUPP.
* VOP_{INACTIVE,RECLAIM} may be implemented and used in future to
better emulate VFS behavior to address current limitation.
* VOP_WRITE is modified to directly call VOP_STRATEGY function.
* The XOP kernel thread is modified to act as a regular function
called from VOPs, along with simplified admin code.
It currently has following limitations.
* multi-volumes is unsupported, simply due to makefs(8) only taking 1
image file path.
* Not necessarily a limitation, but it only supports populating 1 PFS,
which is "DATA" by default. Other PFSes if any won't have anything
under the root PFS inode.
* makefs(8) process gets killed by OOM for a directory with *extremely*
large number of files, depending on available memory. This is due to
the way it currently tries to flush all chains in a single VFS_SYNC.
Supporting multiple VFS_SYNC calls by checking available memory along
the way gives chance to free unused vnodes/inodes and chains. This
may be implemented in future. This limitation is specific to HAMMER2,
as all other makefs(8) filesystems are not CoW, meaning they allow
in-place write based objects creation from a top directory to bottom
whereas HAMMER2 flushes chains in bottom-up direction.
Tomohiro Kusumi [Fri, 3 Jun 2022 11:12:25 +0000 (20:12 +0900)]
sbin/newfs_hammer2: Fix `-V 1' option
It's been broken since
0b7381572b131c74051832dc251604e7f77b5a54
which introduced multi-volumes. No one probably needed to create
version 1 after that.
Remove sanity check which isn't true when using sbin/hammer2/ondisk.c
from newfs_hammer2(8).
--
$ newfs_hammer2 -V 1 /dev/ad3
Volume /dev/ad3 size 5.00GB
checkvolu header 0
0000000140000000/
0000000140000000
newfs_hammer2: Volume count 1 must be 0
Sascha Wildner [Tue, 31 May 2022 18:59:37 +0000 (20:59 +0200)]
DRIVER_MODULE.9: Fix prototype of DRIVER_MODULE_ORDERED().
Sascha Wildner [Tue, 31 May 2022 18:41:16 +0000 (20:41 +0200)]
kernel/uaudio: Change a pointer argument 0 -> NULL.
Sascha Wildner [Tue, 31 May 2022 18:32:31 +0000 (20:32 +0200)]
kern/gtaskqueue: Fix an error message typo.
Matthew Dillon [Tue, 31 May 2022 01:18:08 +0000 (18:18 -0700)]
kernel - Add kern/subr_gtaskqueue.c
* Add the gtaskqueue API
Taken-from: FreeBSD
Matthew Dillon [Tue, 31 May 2022 00:15:40 +0000 (17:15 -0700)]
kernel - Add two more DEVMETHODs (quiesce and register)
* Add DEVMETHODs quiesce and register to help with future FreeBSD
porting work.
Matthew Dillon [Tue, 31 May 2022 00:14:03 +0000 (17:14 -0700)]
build - Change '@' symlink to 'dragonfly'
* Change the '@' symlink to 'dragonfly', making it easier for
#include overlays to chain to dragonfly headers in the future.
Matthew Dillon [Tue, 31 May 2022 00:11:29 +0000 (17:11 -0700)]
kernel - Adjust devclass arg for DRIVER_MODULE_ORDERED() macro
* Adjust the argument to pass in &devclass instead of having the macro
add the '&'. This allows NULL to be passed in, for better FreeBSD
compatibility.
Matthew Dillon [Fri, 27 May 2022 05:34:15 +0000 (22:34 -0700)]
kernel - Fix lock order reversal in cache_resolve_mp()
* This function is a helper when path lookups cross mount
boundaries.
* Locking order between namecache records and vnodes must
be { ncp, vnode }.
* Fix a lock order reversal in cache_resolve_mp() which
was doing { vnode, ncp }. This deadlock is very rare
because mount points are almost never evicted from the
namecache. However, dsynth can trigger this bug due
to its heavy use of null mounts and high concurrent path
lookup loads.
Antonio Huete Jimenez [Sun, 29 May 2022 15:03:22 +0000 (17:03 +0200)]
kernel/if: Allow setting a description for network interfaces (closes #3306)
Taken-from: FreeBSD
Improvements-by: @aly
Sascha Wildner [Mon, 23 May 2022 04:01:48 +0000 (06:01 +0200)]
<resolv.h>: Include <netinet/in.h> for in_addr and sockaddr_in.
Sascha Wildner [Sat, 21 May 2022 19:33:35 +0000 (21:33 +0200)]
bsd-family-tree: Sync with FreeBSD.
Sascha Wildner [Sat, 21 May 2022 19:33:08 +0000 (21:33 +0200)]
Update the pciconf(8) database.
May 18, 2022 snapshot from https://pci-ids.ucw.cz
Matthew Dillon [Thu, 19 May 2022 17:43:29 +0000 (10:43 -0700)]
stress - Add t_memlock.c, t_memlockall.c
* Add simple stress test for mlock()/mlockall().
Matthew Dillon [Thu, 19 May 2022 17:30:29 +0000 (10:30 -0700)]
kernel - Implement mlockall() properly
* Implement mlockall()'s MCL_CURRENT, and generally reimplement mlockall()
using linux-like expectations. This generally means that the system
will do a best-effort to allocate and lock the memory associated with
the process's address space.
* Prior semantics which disallowed protection changes on locked memory have
been removed. Modern applications assume that protection changes will
work on locked memory, even if it would force a fault.
* As with linux, some license is taken and mlockall() will only force fault
any copy-on-write flagged anonymous pages at the time of the call. It
will not force a copy-on-write operation on unmodified file-backed pages
that have been mapped MAP_PRIVATE, but not yet modified (still represent
the file's actual content). Nor will it force-fault the parent process's
pages when the parent issues a fork() (which forces all anonymous pages in
both the parent and child to become copy-on-write).
Such pages can still take a write-fault and be COWd. The resulting newly
allocated page will be wired as expected.
Submitted-by: tuxillo
Testing-by: tuxillo, dillon
Antonio Huete Jimenez [Mon, 16 May 2022 00:15:16 +0000 (02:15 +0200)]
mlockall.2: Point out MCL_CURRENT is not yet implemented (refs #1921).
Antonio Huete Jimenez [Sun, 15 May 2022 17:04:10 +0000 (19:04 +0200)]
ifconfig.4, bridge.4: Fix typo
Sascha Wildner [Wed, 4 May 2022 14:11:55 +0000 (16:11 +0200)]
dsynth.1: Mark up hooks with .Pa (as they are file names).
Also, mark them up in the environment variable descriptions.
Sascha Wildner [Wed, 4 May 2022 14:03:56 +0000 (16:03 +0200)]
dsynth.1: Remove duplicated 'the'.
Sascha Wildner [Tue, 3 May 2022 01:31:11 +0000 (03:31 +0200)]
libc/stdtime: Set errno to EOVERFLOW when there is an overflow.
This fixes various issues such as gmtime() returning NULL on an
out-of-bounds time_t but not setting errno, as POSIX requires.
Also, in ctime() and ctime_r(), check the result of localtime()
before passing it to asctime{,_r}().
See:
https://github.com/eggert/tz/commit/
4d306b3a17ce5ce0b33a73a90dc713d3601ea89a
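A caller-side sketch of why the errno fix matters, assuming a libc that sets errno when gmtime() fails. The helper utc_year() is hypothetical; it checks the gmtime() result before using it, in the same spirit as the ctime() change above.

```c
#include <errno.h>
#include <time.h>

/*
 * Store the UTC calendar year for t in *year.
 * Returns 0 on success, or a non-zero value (the errno left by
 * gmtime() when it set one, else -1) on failure.
 */
int
utc_year(time_t t, long *year)
{
	struct tm *tm;

	errno = 0;
	tm = gmtime(&t);
	if (tm == NULL)
		return errno != 0 ? errno : -1;
	*year = (long)tm->tm_year + 1900;
	return 0;
}
```

With errno set on overflow, the caller can tell an out-of-range time_t apart from other failures instead of dereferencing a NULL pointer.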
Sascha Wildner [Tue, 3 May 2022 00:46:03 +0000 (02:46 +0200)]
last(1): Fix a crash when the time_t is out of range.
Taken-from: NetBSD
Aaron LI [Sat, 30 Apr 2022 14:22:20 +0000 (22:22 +0800)]
Add base64(3) man page for b64_ntop() and b64_pton()
Reviewed-and-improved-by: swildner
Aaron LI [Sat, 30 Apr 2022 13:06:40 +0000 (21:06 +0800)]
libc/net: Fix b64_pton() for some exact-sized buffer
When decoding a base64 string with padding, b64_pton() can fail when the
output buffer is exactly the needed size. For example, decoding the
following base64 string to buffer[32] would fail:
% dd if=/dev/random bs=32 count=1 | base64
FCiWkKuhdRq3tMmtAt9CpchTTYMlIW3U3gJsavDNxZI=
This commit fixes the above issue.
Reported-by: dczheng
Obtained-from: OpenBSD
See also: FreeBSD revision 275060, https://reviews.freebsd.org/D1218
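To see why buffer[32] is exactly the needed size for the string above, the decoded length of a padded base64 string can be computed directly. This is a small sketch with a hypothetical helper name; it does not call b64_pton() and does not validate the alphabet.

```c
#include <stddef.h>
#include <string.h>

/*
 * Number of bytes a padded base64 string decodes to, or 0 if the
 * length is not a non-zero multiple of 4. Hypothetical helper.
 */
size_t
b64_decoded_len(const char *s)
{
	size_t len = strlen(s);
	size_t pad = 0;

	if (len == 0 || len % 4 != 0)
		return 0;
	if (s[len - 1] == '=')
		pad++;
	if (s[len - 2] == '=')
		pad++;
	return len / 4 * 3 - pad;
}
```

The example string is 44 characters with one '=' of padding, so it decodes to 44 / 4 * 3 - 1 = 32 bytes.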
Aaron LI [Sat, 30 Apr 2022 13:01:02 +0000 (21:01 +0800)]
libc/net: Multiple minor cleanups to base64.c
* Remove unused but included headers.
* Use 'unsigned char' instead of 'u_char'.
* Properly cast 'char' to 'unsigned char'.
* Remove the debug Assert()s.
Partially obtained from OpenBSD.
Sascha Wildner [Sun, 1 May 2022 14:32:55 +0000 (16:32 +0200)]
ifnet.9: Add missing whitespace.
Aaron LI [Sun, 1 May 2022 04:59:57 +0000 (12:59 +0800)]
pf: Make ":0" (noalias) also ignore link-local IPv6 addresses
When users mark an interface to not use aliases they likely also don't
want to use the link-local IPv6 address there.
For example, with the following rule to NAT IPv6:
nat on $ext_if inet6 from $int_if:network to !$int_if:network -> ($ext_if:0)
PF was selecting the link-local address (which comes first) for the
NAT'ed IPv6 address, which was wrong and broke the NAT setup.
This commit makes PF ignore the link-local IPv6 addresses so that the
above NAT setup would work.
Obtained-from: FreeBSD (revision 339835, review D17633)
See also: https://lists.freebsd.org/pipermail/freebsd-pf/2014-September/007441.html
Aaron LI [Sun, 1 May 2022 03:31:41 +0000 (11:31 +0800)]
pf: Fix 'set skip on' for interface groups
Previously if an interface type (without number), e.g. "set skip on vlan"
or "set skip on em" was used, it would have the *undocumented* behavior
of matching *any* interface of that type.
Now it will only match an interface which is a member of the named group.
And thus it works with interface groups of arbitrary names, e.g., one
can "set skip on home" with "home" being the group containing interfaces
"em1" and "tun0".
This results in some changed behavior:
If you currently use "set skip" with a physical interface type (e.g.
"set skip on ix") you will need to add the interface to a group of that
name: 'ifconfig ix0 group ix' or add 'group ix' to hostname.ix0.
Interfaces cloned at runtime (e.g. lo, tap, tun, vlan) default to being
in a group named after the interface type, so for these interfaces there
will be no change in the behavior unless you have deliberately changed
groups.
Obtained-from: FreeBSD (revision 337643)
Obtained-from: OpenBSD (pf_if.c,v 1.62, 1.63)
Matthew Dillon [Fri, 29 Apr 2022 23:46:09 +0000 (16:46 -0700)]
kernel - vnode recycling, intermediate fix
* Fix a condition where vnlru (the vnode recycler) can live-
lock on unsuitable vnodes in the inactive list and stop
making progress, causing the system to block.
First, don't deactivate vnodes which the inactive scan won't
recycle. Vnodes which are in the namecache topology but not
at a leaf won't be recycled by the vnlru thread. Leave these
vnodes on the active queue. This prevents the inactive queue
from filling up with vnodes that it can't recycle.
Second, the active scan in vnlru() will now call
cache_inval_vp_quick() to attempt to make a vnode presentable
so it can be deactivated. The inactive scan also does the same
thing, because some leakage can happen anyway.
* The active scan should be able to make continuous progress
as successful cache_inval_vp_quick() calls make more and more
vnodes presentable that might have previously been internal nodes
in the namecache topology. So the active scan should be able to
achieve the desired balance between the active and inactive queue.
* This should also improve performance when constant recycling
is happening by moving more of the work to the active->inactive
transition and doing less work in the inactive->free
transition
* Add cache_inval_vp_quick(), a function which attempts to trivially
disassociate a vnode from the namecache topology and will handle
any direct children if the vnode is not at a leaf (but not recursively
on its own). The definition of 'trivially' for the children are
namecache records that can be locked non-blocking, have no additional
refs, and do not record a vnode.
* Cleanup cache_unlink_parent(). Have cache_zap() use this
function instead of rerolling the same code. The cache_rename()
code winds up being slightly more complex. And now
cache_inval_vp_quick() can use the function too.
Matthew Dillon [Fri, 29 Apr 2022 05:19:28 +0000 (22:19 -0700)]
kernel - Temporary work-around for vnode recyclement problems
* vnlru deadlocks were encountered on grok while indexing ~20 million
files in deep directory trees.
* Add vfscache_unres accounting to keep track of unresolved ncp's
at the leaves of the namecache tree. Start trimming the namecache
when the unres leaf count exceeds 1/16 maxvnodes, in addition to
the other algorithms.
* Add code in vnlru to decommission vnodes with children in the namecache
when those children are trivial (e.g. unresolved, dead, or negative
entries that can be easily locked).
Matthew Dillon [Fri, 29 Apr 2022 04:59:37 +0000 (21:59 -0700)]
hammer2 - Fix issue where deleted files sometimes linger until umount (3)
* Add missing conditional on last commit
Matthew Dillon [Thu, 28 Apr 2022 19:52:46 +0000 (12:52 -0700)]
hammer2 - Fix issue where deleted files sometimes linger until umount (2)
This is related to the issue of having to retain the inodes for deleted
files that still have live references. Even though their nlinks has
dropped to 0, such inodes must be retained and be fully operational
until the last live reference goes away. When that reference DOES
go away, we need to dispose of the inode as quickly as possible.
* The last fix wasn't good enough. Some vnodes still linger for
indefinite periods of time after a rm -rf. In addition, the last
fix attempted to clean-out inodes that might have still had dirty
buffers associated with the vnode.
* Fall-back to the method that UFS and HAMMER1 use, which is to obtain a
full ref on ip->vp using vget() (or similar) that we can cycle to force
the vnode to be inactivated.
This also entails using the inode lock in the inactive/reclaim path
to interlock the ip->vp access, unfortunately.
The vnode buffers and inode are now cleaned up in the inactivation
path (when nlinks is 0) instead of the reclaim path.
* Validated against a (roughly) 20 million inode distfile unpack and
another few million inodes created via grok processing.
* Add a vfs support function in the kernel called vfinalize() which
operates on a referenced vnode. This function flags the vnode
for immediate deactivation when the last ref is released.
Matthew Dillon [Tue, 26 Apr 2022 03:04:01 +0000 (20:04 -0700)]
hammer2 - Fix issue where deleted files sometimes linger until umount
* Hammer2 was using a namecache heuristic to determine if a file being
deleted was still open or not, but had not coded any sort of fall-back
if the heuristic failed.
This created an issue where the inodes for deleted files would sometimes
linger until the filesystem is unmounted (typically at system shutdown).
If the system were to crash, these inodes would remain in the media
topology forever.
This case primarily occurs when a large number of files are being
deleted.
* Replace the heuristic with a proper interlock so H2 knows with certainty
whether a file being removed still has system refs on it or not.
Matthew Dillon [Tue, 26 Apr 2022 02:21:38 +0000 (19:21 -0700)]
hammer2 - Fix bulkfree bug when multiple PFSs are mounted
* If multiple PFSs from the same block device are mounted, the bulkfree
directive can sometimes free blocks that are actually not free.
This situation can only occur:
(1) When 2 or more PFS's are mounted from the same block device.
(2) When heavy file ops occur near the start of a bulkfree.
* The problem was due to the bulkfree code only flushing the
passed-in PFS before starting the scan on the device (which might
house multiple PFSs). This can cause both scan stages to occur
without a full synchronization of all modified PFSs on the device
in between them.
* Fixed by ensuring that all PFSs associated with the block device
are flushed with the blockfree lock held in order to get the guarantee
back.
Matthew Dillon [Mon, 25 Apr 2022 17:45:57 +0000 (10:45 -0700)]
hammer2 - report critical bulkfree transitions
* Report critical bulkfree transitions that are not supposed to happen,
such as bulkfree detecting a fully-free block (00) that is not actually
free.
Matthew Dillon [Fri, 22 Apr 2022 05:37:21 +0000 (22:37 -0700)]
hammer2 - Fix CHECK fail path that might mangle an inode in-memory
* Allowing the wrong inode block to be entered in-memory can result in
massive filesystem confusion.
* Add a sanity check in hammer2_chain_inode_find() to validate that
the inode field in an on-media inode matches the inode number we are
trying to look up. This case is not supposed to happen, but it did on
grok's /build5 partition (likely due to a bulkfree early-termination
issue which is now being looked at).
If this test fails, we simulate a CRC CHECK failure.
* Remove hammer2_cluster_resolve() (it was replaced by hammer2_cluster_check()).
Sascha Wildner [Wed, 20 Apr 2022 20:37:05 +0000 (22:37 +0200)]
<string.h>: Don't declare timingsafe_bcmp() twice when libkern.h is included.
This unbreaks the vkernel build.
Matthew Dillon [Wed, 20 Apr 2022 19:04:43 +0000 (12:04 -0700)]
dsynth - Increase ncurses rate and other fields from 5 to 6 digits
* Increase displayed rate and other numerical fields from 5 to
6 digits. Certain dsynth actions (e.g. fetch-only) can now cause
some of these numbers to exceed 5 digits.
* And for the rest, just future-proof it.
Matthew Dillon [Wed, 20 Apr 2022 18:57:40 +0000 (11:57 -0700)]
libc - Protect dbm_*() API with a mutex
* The dbm_*() API is not thread-safe. Generally speaking, libc is
expected to be thread-safe these days.
* Protect the dbm_*() API with a mutex in the DB (aka DBM) structure.
We use available pthread_mutex_*() stubs for the locking, so they
are basically NOPs for applications not linked against pthreads.
* Also protect dbm_delete() and dbm_store() with our new sigblockall() /
sigunblockall() mechanism to prevent corruption due to ^C and
other regular signals. This mechanism implements a simple kernel/user
shared-memory counter and flag, and imposes no additional system calls
in DBM's critical paths.
Matthew Dillon [Wed, 20 Apr 2022 15:47:45 +0000 (08:47 -0700)]
kernel - Add sysctl debugging
* Add debug.sysctl to log sysctl nodes as they are requested, to help
with debugging.
Matthew Dillon [Wed, 20 Apr 2022 15:45:28 +0000 (08:45 -0700)]
dsynth - Add 'fetch-only {list/everything}*' directive (3)
* dsynth still follows dependency chains when fetching, which is
kinda what we want. However, failed fetches were blocking fetches
of packages dependent on the failed one. Since we aren't building
in fetch-only mode, fake a 'success' so everything gets fetched that
can be fetched.
Matthew Dillon [Wed, 20 Apr 2022 04:27:17 +0000 (21:27 -0700)]
dsynth - Protect threaded dbm_store() calls
* dsynth makes multi-threaded calls to dbm_store(). Default
to throwing a mutex around the call to avoid dbm corruption.
(libc will also be fixed to make dbm_*() calls thread-safe in
another commit, this is a catch-all).
Matthew Dillon [Wed, 20 Apr 2022 02:49:45 +0000 (19:49 -0700)]
dsynth - Add 'fetch-only {list/everything}*' directive (2)
* Add to basic help output
Matthew Dillon [Tue, 19 Apr 2022 22:55:55 +0000 (15:55 -0700)]
dsynth - Add 'fetch-only {list/everything}*' directive
* Implements a fetch-only feature which tells dsynth to fetch all
source distributions required to build the specified ports.
If 'everything' is specified, the source distributions needed
to build the whole of dports will be fetched.
* Any source distributions already fetched are tested against their
checksum and re-fetched if necessary.
Sascha Wildner [Sun, 17 Apr 2022 10:08:37 +0000 (12:08 +0200)]
Update the pciconf(8) database.
April 16, 2022 snapshot from https://pci-ids.ucw.cz
Aaron LI [Fri, 15 Apr 2022 14:35:27 +0000 (22:35 +0800)]
libkern: Import timingsafe_bcmp() from FreeBSD
Will be used by WireGuard.
Obtained-from: FreeBSD
Aaron LI [Fri, 15 Apr 2022 13:52:09 +0000 (21:52 +0800)]
callout.9: Multiple minor fixes
* Fix prototype of callout_init(); it has only one argument.
(reported-by: dczheng)
* Fix '.Dt' to 'CALLOUT'
* Use 'function' instead of 'macro'; those are really functions and the
'function' reads more generic.
* Mention callout_cancel() in 'RETURN VALUES' section.
* Other style and wording adjustments.
Sascha Wildner [Tue, 5 Apr 2022 12:13:18 +0000 (14:13 +0200)]
bsd-family-tree: Sync with FreeBSD.
Sascha Wildner [Fri, 1 Apr 2022 21:14:41 +0000 (23:14 +0200)]
sys/conf/files: Remove duplicate line.
Sascha Wildner [Mon, 28 Mar 2022 14:18:44 +0000 (16:18 +0200)]
Mark up defined values better in some manual pages.
Sascha Wildner [Mon, 28 Mar 2022 14:12:04 +0000 (16:12 +0200)]
kvm_open.3: Improve markup a bit.
Sascha Wildner [Wed, 23 Mar 2022 17:21:19 +0000 (18:21 +0100)]
libc: Adjust comment in Versions.def.
Sascha Wildner [Mon, 21 Mar 2022 08:29:17 +0000 (09:29 +0100)]
printf(3)/scanf(3): Make ll and L length modifiers equivalent.
POSIX defines L only for floating point numbers (long double) and ll only
for integer numbers (long long), leaving for example %Ld or %llg
undefined. GCC allows these non-portable combinations as an extension,
i.e. -Wformat will not warn about them unless -pedantic is passed.
To avoid surprises with Linux code, go with glibc and make ll and L
equivalent. For base code, we still prefer going with the standard, of
course.
Interestingly, the alternate printf(3) implementation in libc (enabled
by defining USE_XPRINTF in the environment) already treats them equally.
In-discussion-with: zrj
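As a reminder of the portable forms that base code should keep using, here is a small sketch: ll with an integer conversion and L with a floating-point conversion. format_portable() is a hypothetical helper used only for illustration.

```c
#include <stdio.h>

/*
 * Format a long long and a long double with the standard length
 * modifiers: "ll" for the integer, "L" for the floating-point value.
 * Returns a pointer to a static buffer (not thread-safe; sketch only).
 */
static const char *
format_portable(long long i, long double d)
{
    static char buf[64];

    snprintf(buf, sizeof(buf), "%lld %Lg", i, d);
    return buf;
}
```

Swapping the modifiers (%Ld, %llg) is the non-portable combination discussed above; it is accepted after this change but remains undefined by the standard.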
Sascha Wildner [Sun, 20 Mar 2022 00:46:52 +0000 (01:46 +0100)]
libc_rtld: Reduce the amount of libc code that we compile into it.
Sascha Wildner [Sat, 19 Mar 2022 08:58:42 +0000 (09:58 +0100)]
Fix some 'any more' vs. 'anymore' confusion in manpages and messages.
Matthew Dillon [Fri, 18 Mar 2022 18:34:40 +0000 (11:34 -0700)]
hammer2 - Fix panic related to usb stick pull on mounted H2 filesystem
* Fixes at least one panic related to unexpected USB stick pulls.
There may be others.
Reported-by: Donald Allen
Sascha Wildner [Fri, 18 Mar 2022 17:56:26 +0000 (18:56 +0100)]
Sync zoneinfo database with tzdata2022a from ftp://ftp.iana.org/tz/releases
* Palestine will spring forward on 2022-03-27, not 2022-03-26.
For a detailed list of changes, see share/zoneinfo/NEWS.
Sascha Wildner [Wed, 16 Mar 2022 19:24:14 +0000 (20:24 +0100)]
Update the pciconf(8) database.
March 14, 2022 snapshot from https://pci-ids.ucw.cz
Sascha Wildner [Sun, 13 Mar 2022 10:36:54 +0000 (11:36 +0100)]
Local changes for file-5.41.
Sascha Wildner [Sun, 13 Mar 2022 10:33:13 +0000 (11:33 +0100)]
Merge branch 'vendor/FILE'
Sascha Wildner [Sun, 13 Mar 2022 10:32:44 +0000 (11:32 +0100)]
vendor/file: upgrade from 5.40 to 5.41
For details, see ChangeLog.
Matthew Dillon [Sat, 12 Mar 2022 01:18:38 +0000 (17:18 -0800)]
hammer2 - Fix excess chain structure allocations during bulkfree (3)
* Fix excess chain structure allocations during bulkfree, AGAIN.
This time for real. The algorithmic change I made was not sufficient,
because I was not backing-out the recursion all the way after hitting
a deferral. Instead, the code was re-recursing down another branch
while still really deep into the tree.
The problem was mostly triggered on the inode radix tree for filesystems
containing many inodes (like a hundred million inodes), and would lead
to a kmalloc panic due to memory exhaustion.
* Fixed for real this time. When we hit the recursion limit, the code
fully backs out of the traversal (recording its placemarkers on the way
back up), then starts again at the deepest placemarker rather than
the shallowest placemarker.
Matthew Dillon [Sat, 12 Mar 2022 01:17:06 +0000 (17:17 -0800)]
kernel - provide more information on critical section count mismatch panics
* Provide more information on critical section count mismatch panics.
Include which system call number was involved, as this data might
be lost in crash dumps due to compiler optimizations.
Matthew Dillon [Sat, 12 Mar 2022 01:09:48 +0000 (17:09 -0800)]
ipfw - Fix broken mixed network and host IP specifications in ip tables
* ipfw improperly assumes that the netmask sin_len is pre-zero'd, but
prior table entries on the same command line which specify a network
mask will leave the field non-zero for later host entries also specified
on the command line:
ipfw table 1 add 10.0.0.0/8 192.0.2.1 # 2^24 + 1 addresses
# ipfw table 1 print
10.0.0.0/8
192.0.0.0/8 <--- wrong
* Fix the issue by properly initializing netmask.sin_len to 0 when a
host IP is specified.
Submitted-by: Martin Neitzel <neitzel@marshlabs.gaertner.de>
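The class of bug can be sketched with a simplified parser that reuses one scratch structure for every entry on the command line. The struct, the scratch variable and parse_entry() are hypothetical and much simpler than ipfw's real sockaddr handling; masklen plays the role of the netmask's sin_len field.

```c
#include <stdio.h>

/* Hypothetical, simplified table entry; not ipfw's real structures. */
struct entry {
    unsigned addr[4];
    int      masklen;   /* 0 means host entry, no netmask given */
};

/* Reused for each entry on the command line, like the original parser. */
static struct entry scratch;

/*
 * Parse "a.b.c.d" or "a.b.c.d/len" into *e, which may still hold the
 * previous entry. Returns 0 on success, -1 on a malformed address.
 */
static int
parse_entry(const char *s, struct entry *e)
{
    int n;

    /*
     * The fix: reset the mask state for every entry. Without this
     * line a host entry inherits the mask of the preceding network.
     */
    e->masklen = 0;

    n = sscanf(s, "%u.%u.%u.%u/%d", &e->addr[0], &e->addr[1],
        &e->addr[2], &e->addr[3], &e->masklen);
    return (n >= 4) ? 0 : -1;
}
```

Parsing "10.0.0.0/8" followed by "192.0.2.1" into the same scratch structure leaves masklen at 0 for the host entry; removing the reset reproduces the stale /8.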
Sascha Wildner [Wed, 2 Mar 2022 16:30:59 +0000 (17:30 +0100)]
edk2: Sync our TianoCore EDK II headers with the edk2-stable202202 tag.
Matthew Dillon [Wed, 23 Feb 2022 21:50:12 +0000 (13:50 -0800)]
mount_msdos - Add /dev prefix if necessary
* Automatically add the /dev prefix if a relative device path is
specified. For example 'vn0' turns into '/dev/vn0'.
* Fixes gpt init sequences that short-hand the device.
Reported-by: ferz
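The prefixing rule can be sketched as below. devpath() is a hypothetical helper, and treating every name that does not start with '/' as relative is an assumption; the exact test used by mount_msdos is not shown in this log.

```c
#include <stdio.h>

/*
 * Return the device path to open. Absolute paths are copied unchanged;
 * anything else gets "/dev/" prepended. Uses a static buffer, so the
 * result is overwritten by the next call (sketch only).
 */
static const char *
devpath(const char *dev)
{
    static char buf[1024];

    if (dev[0] == '/')
        snprintf(buf, sizeof(buf), "%s", dev);
    else
        snprintf(buf, sizeof(buf), "/dev/%s", dev);
    return buf;
}
```

With this rule, 'vn0' becomes '/dev/vn0' while '/dev/vn0' is passed through untouched.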
Tomohiro Kusumi [Mon, 21 Feb 2022 16:10:05 +0000 (01:10 +0900)]
sys/vfs/hammer2: Mostly trailing whitespace cleanups
Aaron LI [Sat, 19 Feb 2022 09:53:05 +0000 (17:53 +0800)]
lpr.1: Tweak a bit about '-i' option
I made a mistake so that I lost this change in a rebase...
Reviewed-by: swildner
Aaron LI [Sat, 19 Feb 2022 09:53:05 +0000 (17:53 +0800)]
lpr(1): Fix '-i' option with optional argument
lpr(1)'s '-i' option accepts an optional argument, but the
implementation was incomplete. For example, 'lpr -i -#3' errors with:
'Bad argument to -i, number expected'.
However, because the argument to '-i' option can have a leading white
space (i.e., '-i 4'), we can't use getopt(3)'s new 'option::' feature
here. Fix the code in another way inspired by dma(8) (see
'libexec/dma/dma.c').
In addition, update the usage text as well as the man page.
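One way to handle an option whose argument is optional and may be either attached ('-i4') or detached ('-i 4') is to peek at the following word and consume it only if it is numeric. The sketch below illustrates that idea; parse_indent() and all_digits() are hypothetical names, the default of 8 columns is taken from the lpr(1) description of '-i', and this is not the code actually committed.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFAULT_INDENT 8    /* assumed default when no number is given */

/* Return 1 if s is a non-empty string of decimal digits, else 0. */
static int
all_digits(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/*
 * opt is the "-i..." word, next is the following argv word or NULL.
 * Returns the indent, or -1 for a non-numeric attached argument.
 * *used_next (if not NULL) is set to 1 when next was consumed.
 */
static int
parse_indent(const char *opt, const char *next, int *used_next)
{
    if (used_next != NULL)
        *used_next = 0;
    if (opt[2] != '\0')     /* attached form: "-i4" */
        return all_digits(opt + 2) ? atoi(opt + 2) : -1;
    if (all_digits(next)) { /* detached form: "-i 4" */
        if (used_next != NULL)
            *used_next = 1;
        return atoi(next);
    }
    return DEFAULT_INDENT;  /* bare "-i", e.g. "lpr -i -#3" */
}
```

A caller walking argv would skip one extra word whenever *used_next is set, so that 'lpr -i -#3' leaves '-#3' to be parsed as its own option.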