François Tigeot [Sun, 23 Feb 2020 13:53:14 +0000 (14:53 +0100)]
drm/linux: Add prefetchw()
François Tigeot [Sun, 23 Feb 2020 13:52:33 +0000 (14:52 +0100)]
drm/linux: Add pagefault_disabled
Sascha Wildner [Sat, 22 Feb 2020 07:52:20 +0000 (08:52 +0100)]
realpath.3: Sort SEE ALSO.
Sascha Wildner [Sat, 22 Feb 2020 07:39:45 +0000 (08:39 +0100)]
gcore(1): Don't include <sys/user.h>. Only <errno.h> is needed.
Sascha Wildner [Sat, 22 Feb 2020 06:50:26 +0000 (07:50 +0100)]
Remove <sys/user.h> inclusion from a few files that don't need it.
Sascha Wildner [Sat, 22 Feb 2020 06:50:06 +0000 (07:50 +0100)]
<sys/user.h>: Restrict inclusion to userland and fix a comment typo.
Sascha Wildner [Sat, 22 Feb 2020 04:26:39 +0000 (05:26 +0100)]
boot: Small indent fix.
Sascha Wildner [Fri, 21 Feb 2020 17:18:13 +0000 (18:18 +0100)]
nfssvc.2/quotactl.2: Fix includes.
François Tigeot [Thu, 20 Feb 2020 13:08:10 +0000 (14:08 +0100)]
drm/i915: Use stop_machine()
François Tigeot [Thu, 20 Feb 2020 13:06:06 +0000 (14:06 +0100)]
drm/linux: Add stop_machine()
François Tigeot [Thu, 20 Feb 2020 13:00:10 +0000 (14:00 +0100)]
drm: Add linux/mempolicy.h
Matthew Dillon [Mon, 17 Feb 2020 08:07:56 +0000 (00:07 -0800)]
dsynth - Fix generic flavor scan for the Nth time
* The commit in 4986398e0bafb7 fixed one problem and created another
because it didn't quite skip past enough of the checks on the dummy
node. Skip past the last check as well so we can descend into the
true dependency.
Sascha Wildner [Sun, 16 Feb 2020 21:05:47 +0000 (22:05 +0100)]
Remove old binutils225 manual pages via 'make upgrade'.
Matthew Dillon [Sun, 16 Feb 2020 19:57:12 +0000 (11:57 -0800)]
dsynth - pkg install -U and improve debug support
* pkg install now passes the -U option to avoid trying to access remote
repos.
* Multiple -d's will turn off ncurses and log to stdout, but what we
really want to do is to log to a file. Fix the 'single -d option'
feature to now log to 07_debug.log.
* Package dependencies on generic flavored ports (i.e. a dependency which
does not specify the flavor) were sometimes improperly flagged for
building even when the default flavor had failed.
Reported-by: zrj
Sascha Wildner [Sun, 16 Feb 2020 09:30:16 +0000 (10:30 +0100)]
Update the pciconf(8) database.
February 12, 2020 snapshot from https://pci-ids.ucw.cz
Justin C. Sherrill [Sat, 15 Feb 2020 22:01:30 +0000 (17:01 -0500)]
5.9 setup
Matthew Dillon [Sat, 15 Feb 2020 19:46:32 +0000 (11:46 -0800)]
kernel - Microoptimization, avoid dirtying vm_page_hash entry
* Avoid dirtying the vm_page_hash entry unnecessarily with a
ticks update if the existing field already has the correct value.
The VM page hash has an extreme level of SMP concurrency, so
avoiding cache coherency contention is important.
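A minimal sketch of the check-before-store idea described above. The structure and function names are hypothetical stand-ins, not the kernel's actual vm_page_hash code.

```c
/* Hypothetical stand-in for a hash entry; not the real kernel structure. */
struct page_hash_entry {
    void *page;
    int   ticks;    /* last-use timestamp */
};

/*
 * Refresh the timestamp only when it actually changes.  A plain store
 * would dirty the cache line even when the value is identical, forcing
 * other CPUs to re-fetch it.
 * Returns 1 if a store was performed, 0 if it was skipped.
 */
static int
hash_entry_touch(struct page_hash_entry *e, int now)
{
    if (e->ticks == now)
        return 0;
    e->ticks = now;
    return 1;
}
```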
Matthew Dillon [Sat, 15 Feb 2020 19:44:57 +0000 (11:44 -0800)]
kernel - Micro-optimization, only set v_lastwrite_ts for regular files
* When mmap()ing a file SHARED/RW, only update v_lastwrite_ts
for regular files. This avoids an unnecessary exclusive lock
and related SMP contention on devices (such as /dev/lpmap).
Matthew Dillon [Sat, 15 Feb 2020 19:43:10 +0000 (11:43 -0800)]
kernel - Micro optimization for vnode exclusive lock
* Micro-optimize open(... O_RDWR) by allowing a shared vnode lock for
this case when opening a file which is not an executable.
We used to unconditionally get an exclusive lock to deal with VTEXT vs
O_RDWR races against executables, but this can cause unnecessary SMP
contention on normal files and devices opened O_RDWR which are not
executables.
zrj [Mon, 3 Feb 2020 11:11:12 +0000 (13:11 +0200)]
objformat(1): Remove incremental-dump handling.
The utility is an incremental linking test/debug tool and is no longer
installed in binutils-2.34. Remove handling for consistency.
zrj [Sat, 15 Feb 2020 16:22:13 +0000 (18:22 +0200)]
Retire the binutils-2.25.
zrj [Mon, 3 Feb 2020 10:22:06 +0000 (12:22 +0200)]
binutils225: Unhook from the build.
Remove makefiles and installed parts.
zrj [Mon, 3 Feb 2020 10:19:37 +0000 (12:19 +0200)]
binutils234: Hook to the buildworld as alternative set.
Requires changes in kernel and sys/boot/efi.
zrj [Mon, 3 Feb 2020 09:29:13 +0000 (11:29 +0200)]
binutils234: Add pregenerated manpages.
Only ld.1 and gprof.1 are now provided in tarball. Add all locally.
zrj [Mon, 3 Feb 2020 09:25:41 +0000 (11:25 +0200)]
binutils234: Add configs and pregenerated headers.
zrj [Mon, 3 Feb 2020 09:20:45 +0000 (11:20 +0200)]
binutils234: Add build makefiles.
zrj [Mon, 3 Feb 2020 08:55:39 +0000 (10:55 +0200)]
binutils234: Add DF READMEs and local modifications.
zrj [Sat, 15 Feb 2020 16:10:26 +0000 (18:10 +0200)]
Merge remote-tracking branch 'origin/vendor/BINUTILS234'
zrj [Mon, 3 Feb 2020 08:54:31 +0000 (10:54 +0200)]
Initial import of binutils 2.34 on vendor branch
François Tigeot [Sat, 15 Feb 2020 13:28:39 +0000 (14:28 +0100)]
drm/i915: Update base driver to 20160725
François Tigeot [Sat, 15 Feb 2020 13:27:35 +0000 (14:27 +0100)]
drm/linux: Add drain_workqueue()
François Tigeot [Sat, 15 Feb 2020 13:23:33 +0000 (14:23 +0100)]
drm/linux: Add boot_cpu_data
François Tigeot [Sat, 15 Feb 2020 12:52:46 +0000 (13:52 +0100)]
drm/linux: Improve linux/bug.h
François Tigeot [Sat, 15 Feb 2020 12:52:05 +0000 (13:52 +0100)]
drm/linux: Improve linux/fence.h
François Tigeot [Sat, 15 Feb 2020 12:51:31 +0000 (13:51 +0100)]
drm/linux: Add jiffies_to_nsecs()
François Tigeot [Sat, 15 Feb 2020 12:49:49 +0000 (13:49 +0100)]
drm/linux: Fix DEFINE_SPINLOCK()
Sascha Wildner [Sat, 15 Feb 2020 10:45:24 +0000 (11:45 +0100)]
Stop passing "-s labels" to vnconfig(4).
The option is deprecated and has no effect.
Aaron LI [Sat, 15 Feb 2020 07:56:21 +0000 (15:56 +0800)]
rcorder(8): Import rcorder-visualize.sh from FreeBSD
This script draws the dependency graph for rc scripts using dot(1).
This script was imported to FreeBSD from NetBSD.
Aaron LI [Sat, 15 Feb 2020 07:44:17 +0000 (15:44 +0800)]
mtree.8: Add examples to create /var and /usr/include hierarchies
These two examples are actually used in the Makefiles and installer.
While there, fix a markup in the COMPATIBILITY section.
Suggested-by: noob237 (Gonzalo Nemmi)
Matthew Dillon [Sat, 15 Feb 2020 05:46:15 +0000 (21:46 -0800)]
dsynth - Adjust 'Limit' display in upper right-hand corner
* Increase the Limit display from 2 to 3 digits, fixing minor
display corruption when dsynth is run with greater than 99
worker slots.
Matthew Dillon [Sat, 15 Feb 2020 05:39:32 +0000 (21:39 -0800)]
kernel - Reduce SMP contention during low-memory stress
* When memory gets low vm_page_alloc() is forced to stray into
adjacent VM page queues to find free pages. This search can
expand to the whole queue and cause massive SMP contention on
systems with many cores.
For example, if PQ_FREE has almost no pages but PQ_CACHE has
plenty of pages, the previous scan code widened its search
to the entire PQ_FREE queue (causing a ton of SMP contention)
before beginning a search of PQ_CACHE.
* The new scan code starts in PQ_FREE but once the search widens
sufficiently it will also simultaneously begin searching PQ_CACHE.
This allows the system to continue to allocate memory with minimal
contention as long as PQ_FREE or PQ_CACHE have pages.
* The new mechanism integrates a whole lot better with pageout
daemon behavior. The pageout daemon generally triggers off
the FREE+CACHE total and not (generally) off of low levels
for one or the other.
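The widening search can be illustrated roughly as follows. This is a simplified sketch, not the vm_page_alloc() code: the queue count, the `WIDEN_CACHE` threshold, and all names are assumptions, and each queue is reduced to a page count.

```c
#define NQUEUES      8   /* assumed queue count; must be a power of 2 */
#define WIDEN_CACHE  2   /* assumed radius at which PQ_CACHE joins in */

struct pick {
    int from_cache;   /* 0 = PQ_FREE, 1 = PQ_CACHE */
    int index;        /* queue index, or -1 if nothing was found */
};

static struct pick
find_page(const int *free_cnt, const int *cache_cnt, unsigned start)
{
    struct pick p = { 0, -1 };
    unsigned mask = NQUEUES - 1;

    for (unsigned r = 0; r <= NQUEUES / 2 + WIDEN_CACHE; r++) {
        /* PQ_FREE: queues at distance r on either side of start. */
        if (r <= NQUEUES / 2) {
            unsigned lo = (start - r) & mask, hi = (start + r) & mask;
            if (free_cnt[lo] > 0) { p.index = (int)lo; return p; }
            if (free_cnt[hi] > 0) { p.index = (int)hi; return p; }
        }
        /* PQ_CACHE: the same walk, started WIDEN_CACHE steps later. */
        if (r >= WIDEN_CACHE) {
            unsigned c = r - WIDEN_CACHE;
            unsigned lo = (start - c) & mask, hi = (start + c) & mask;
            if (cache_cnt[lo] > 0) { p.from_cache = 1; p.index = (int)lo; return p; }
            if (cache_cnt[hi] > 0) { p.from_cache = 1; p.index = (int)hi; return p; }
        }
    }
    return p;
}
```

With PQ_FREE empty and a cached page in the starting queue, the sketch finds that page after only two widening steps instead of after a sweep of every free queue.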
Matthew Dillon [Sat, 15 Feb 2020 05:37:32 +0000 (21:37 -0800)]
kernel - Fix rare wait*() deadlock
* It is possible for the kernel to deadlock two processes or process
threads attempting to wait*() on the same pid.
* Fix by adding a bit of magic to give ownership of the reaping
operation to one of the waiters, and causing the other waiters
to skip/reject that pid.
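The commit does not spell out the "magic", so the following is only a sketch of the general idea of handing the reap to exactly one waiter. It uses a C11 atomic flag; the structure and function names are hypothetical and the kernel's actual mechanism may differ.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-process state, for illustration only. */
struct zombie {
    int         pid;
    atomic_flag reap_claimed;   /* set once a waiter owns the reap */
};

/*
 * Returns true for exactly one caller, which then performs the reap.
 * Every other waiter gets false and should skip this pid instead of
 * blocking on it.
 */
static bool
try_claim_reap(struct zombie *z)
{
    return !atomic_flag_test_and_set(&z->reap_claimed);
}
```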
Matthew Dillon [Sat, 15 Feb 2020 00:04:14 +0000 (16:04 -0800)]
kernel - Recode the namecache mount transition cache
* DragonFlyBSD uses (mp, ncp/vp) tuples to track mount recursions,
which allows nullfs to not have to create shadow vnodes. However,
this means that mount points cannot be stored in the vnode structure.
Instead, DFly relies on a (mp, ncp) -> targetmp translation cache
to avoid having to scan the mountlist.
* Increase the size of this cache and convert it from a straight
single-entry prime-number mod hash to a 4-way set-associative
power-of-2 hash, and improve the hash algorithm.
* This SIGNIFICANTLY reduces lock stalls during heavily concurrent
filesystem operations (aka bulk builds) when in the presence of
a large number of mounts (again bulk builds with [d]synth).
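For readers unfamiliar with the layout, a 4-way set-associative cache indexed by a power-of-2 mask might look like the sketch below. The table size, hash function, replacement policy, and all names are assumptions made for illustration; none of them are taken from the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

#define NSETS 256   /* assumed size; must be a power of 2 */
#define NWAYS 4

struct xlate { const void *mp, *ncp, *target; };

static struct xlate cache[NSETS][NWAYS];

/* Illustrative pointer hash, not the kernel's. */
static unsigned
set_index(const void *mp, const void *ncp)
{
    uintptr_t h = (uintptr_t)mp ^ ((uintptr_t)ncp >> 3);

    h ^= h >> 16;
    return (unsigned)(h & (NSETS - 1));   /* mask instead of a modulo */
}

static const void *
cache_lookup(const void *mp, const void *ncp)
{
    struct xlate *set = cache[set_index(mp, ncp)];

    for (int i = 0; i < NWAYS; i++)
        if (set[i].mp == mp && set[i].ncp == ncp)
            return set[i].target;
    return NULL;
}

static void
cache_insert(const void *mp, const void *ncp, const void *target)
{
    struct xlate *set = cache[set_index(mp, ncp)];
    int victim = 0;   /* fall back to evicting way 0 when the set is full */

    for (int i = 0; i < NWAYS; i++) {
        /* Reuse a matching entry or the first empty way. */
        if ((set[i].mp == mp && set[i].ncp == ncp) || set[i].target == NULL) {
            victim = i;
            break;
        }
    }
    set[victim] = (struct xlate){ mp, ncp, target };
}
```

Compared with a single-entry bucket, four ways let up to four colliding (mp, ncp) pairs coexist before one has to be evicted.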
Matthew Dillon [Sat, 15 Feb 2020 00:01:09 +0000 (16:01 -0800)]
devfs - Clean up some SMP inefficiencies
* We don't need the devfs master lock around a setattr call. This
fixes the open(O_CREAT|O_TRUNC) path for redirects to e.g. /dev/null,
which is used all over the place in ports builds.
* The devfs spec open (open() again) path can obtain the devfs master
lock shared instead of exclusive, except in the cloning case.
This significantly reduces stalls during heavily concurrent bulk
builds.
Matthew Dillon [Fri, 14 Feb 2020 23:58:22 +0000 (15:58 -0800)]
kernel - Rejigger mount code to add vfs_flags in struct vfsops
* Rejigger the mount code so we can add a vfs_flags field to vfsops,
which mount_init() has visibility to.
* Allows nullfs to flag that its mounts do not need a syncer thread.
Previously nullfs would destroy the syncer thread after the
fact.
* Improves dsynth performance (it does lots of nullfs mounts).
Matthew Dillon [Fri, 14 Feb 2020 23:50:48 +0000 (15:50 -0800)]
kernel - Increase size of the vm_page hash table
* Increase the size of the vm_page hash table used to shortcut
page lookups during a fault. Improves the hit rate on machines
with large amounts of memory.
* Adjust the ticks overflow test from < 0 to < -1 in order to avoid
getting tripped up by SMP races on the global 'ticks' variable
(which is not accessed atomically). One cpu can conceivably
update a hash ticks value while another cpu is doing a calculation
based on a stale copy of ticks.
Avoids premature vm_page_hash cache evictions due to this race.
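One plausible reading of the adjusted test, written as a standalone sketch. The function name and the exact form of the delta calculation are assumptions; only the change of threshold from 0 to -1 comes from the commit.

```c
/*
 * 'now' is this CPU's (possibly stale) copy of the global tick counter;
 * 'stamp' is the tick value stored in the hash entry.
 * A delta of -1 just means another CPU stamped the entry with a newer
 * tick than our copy, so only deltas below -1 count as overflow.
 */
static int
stamp_overflowed(int now, int stamp)
{
    int delta = (int)((unsigned)now - (unsigned)stamp);

    return delta < -1;
}
```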
Matthew Dillon [Fri, 14 Feb 2020 23:48:10 +0000 (15:48 -0800)]
dsynth - return max_load to 5.0x, reduce PkgDepMemoryTarget
* Restore the max_load cap to 5.0x. Instead, reduce the estimated
package dependency storage cap from PhysMem / 2 to PhysMem / 3.
* Document the effects of adding t_pw to the load calculation.
Matthew Dillon [Fri, 14 Feb 2020 06:43:21 +0000 (22:43 -0800)]
kernel - Start work on a better burst page-fault mechanic
* The vm.fault_quick sysctl is now a burst count. It still
defaults to 1 which is the same operation as before.
Performance is roughly the same with it set to 1 to 8 as
more work needs to be done to optimize pmap_enter().
Sascha Wildner [Fri, 14 Feb 2020 22:32:25 +0000 (23:32 +0100)]
Use <fcntl.h> instead of <sys/file.h> for open()'s prototype and flags.
Sascha Wildner [Fri, 14 Feb 2020 22:29:00 +0000 (23:29 +0100)]
Use our new partition id (0x6c) in several more places.
Mainly, adjust the USB img's own ID and use it in the installer's
legacy BIOS install.
While here, adjust DragonFly BSD's name in a few places (written with
a space).
Reported-by: zrj
Sascha Wildner [Fri, 14 Feb 2020 22:27:33 +0000 (23:27 +0100)]
Sync ACPICA with Intel's version 20200214.
Not much to see:
* Some improvements to sleep button handling when resuming from sleep
(which we don't support).
* New AcpiAnyGpeStatusSet() function.
* Improvements to iASL.
For detailed list, please see sys/contrib/dev/acpica/changes.txt.
François Tigeot [Fri, 14 Feb 2020 06:57:14 +0000 (07:57 +0100)]
drm: Use the Linux version of DRM_WAIT_ON()
Avoid sleeping for vblank events when not required.
François Tigeot [Fri, 14 Feb 2020 06:50:45 +0000 (07:50 +0100)]
drm/linux: Rework wait_event_xxx and finish_wait functions
* Add required task state change in finish_wait()
* Move some formerly inline code to linux_wait.c in order to avoid
problematic header interactions
zrj [Thu, 13 Feb 2020 11:59:17 +0000 (13:59 +0200)]
sys/boot: Unbreak efi build with WORLD_LDVER=ld.bfd.
Adjust 64-bit libstand.a location to use full path if available as it is
done for 32-bit loaders already.
The ld.bfd when invoked with -ffreestanding will not look for /usr/lib.
Match ldscript.x86_64 to sys/platform/pc64/conf/ldscript.x86_64.
Our ld.bfd does not support freebsd emulation target. Generic target is
enough to convert boot1.sym and loader.sym intermediates to PE32+.
While there, use something sane for section padding in ldscript.x86_64.
zrj [Thu, 13 Feb 2020 11:52:39 +0000 (13:52 +0200)]
gcc80: Unbreak ctools on OpenBSD.
For some reason cpp(1) has --traditional-cpp enforced in clang-cpp, that
does not expand macros with spaces like "FOO (blah);".
Use plain cc -E -P for compat in NXCC for now (both headers do not
depend on arch specific code or external headers, so it should be safe).
Matthew Dillon [Fri, 14 Feb 2020 06:13:57 +0000 (22:13 -0800)]
rdrand - Document massive improvement in performance
* Document the huge difference going from 512 to 16 bytes. General
system performance is improved by 9.3% on a TR3990X.
This is not entirely the fault of rdrand. It is also in a large
part due to the overhead of add_buffer_randomness().
Matthew Dillon [Fri, 14 Feb 2020 05:53:04 +0000 (21:53 -0800)]
kernel - Add more dtypes to sys/dtype.h
* Add a few from FreeBSD, and also add some placeholder types for
"encrypted" and "unspecified" partition types.
Matthew Dillon [Fri, 14 Feb 2020 05:52:29 +0000 (21:52 -0800)]
pctrack - Fix symbol table scan
* Fix the symbol table scan to properly parse kernel symbols and not
get confused by non-code symbols.
Matthew Dillon [Fri, 14 Feb 2020 05:48:23 +0000 (21:48 -0800)]
kernel - Offset the stathz systimer by 50% of the hz timer
* Offset the initial starting point of the stathz systimer by
50% of the hz timer, so they do not interfere with each other
if they happen to be set to the same frequency.
* Change the default stathz frequency to hz + 1 (101hz) so it
slides across the tick interval window.
Matthew Dillon [Fri, 14 Feb 2020 05:39:17 +0000 (21:39 -0800)]
kernel - Reduce excessive rdrand harvesting
* Our rdrand driver harvests 512 bytes on each cpu thread at a rate
of 10hz. Ryzen CPUs appear to burn about 0.73uS per word, creating
an overhead of about 460uS/sec on EACH cpu thread in the system.
When added to the even higher overhead of the add_buffer_randomness()
call, the result was a roughly 3% loss of performance across the board.
* Reduce the harvest size to 16 bytes, which honestly is still plenty
of entropy to inject.
* Change some symbolic branch targets to local branch targets in the
rdrand and padlock code to avoid generating symbols that can cause
weird output in our PC sampler (I was getting 'loop+N' and 'out+N'
while testing the above).
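The overhead figure can be reproduced with a little arithmetic. The sketch below assumes a "word" is one 64-bit rdrand result (8 bytes), which the commit does not state explicitly; the function name is illustrative.

```c
/* Estimated rdrand cost per CPU thread, in microseconds per second. */
static double
harvest_cost_us_per_sec(int bytes, int rate_hz, double us_per_word)
{
    int words = bytes / 8;   /* assumes 64-bit rdrand words */

    return words * us_per_word * rate_hz;
}
```

Under that assumption, 512 bytes at 10hz and 0.73uS per word comes to roughly 467uS/sec, in line with the "about 460uS/sec" quoted above, while 16 bytes comes to roughly 15uS/sec.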
Matthew Dillon [Thu, 13 Feb 2020 03:39:12 +0000 (19:39 -0800)]
kernel - Improve tmpfs support
* When a file in tmpfs is truncated to a size that is not on a block
boundary, or extended (but not written) to a size that is not on a
block boundary, the nvextendbuf() and nvtruncbuf() functions must
modify the contents of the straddling buffer and bdwrite().
However, a bdwrite() for a tmpfs buffer will result in a dirty buffer
cache buffer and likely force it to be cycled out to swap relatively
soon under a modest load. This is not desirable if there is no memory
pressure present to force it out.
Tmpfs almost always uses buwrite() in order to leave the buffer 'clean'
(the underlying VM pages are dirtied instead), to prevent unnecessary
paging of tmpfs data to swap when the buffer gets recycled or the vnode
cycles out.
* Add support for calling buwrite() in these functions by changing the
'trivial' boolean into a flags variable.
* Tmpfs now passes the appropriate flag, preventing the undesirable
behavior.
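Turning a boolean argument into a flags word usually looks something like the sketch below. The flag names and the helper are hypothetical; the commit does not list the actual identifiers used by nvextendbuf() and nvtruncbuf().

```c
/* Hypothetical flag names; the commit does not list the real ones. */
#define NVEXTF_TRIVIAL  0x01   /* what the old boolean expressed */
#define NVEXTF_BUWRITE  0x02   /* finish with buwrite() instead of bdwrite() */

enum write_kind { WRITE_BDWRITE, WRITE_BUWRITE };

/* Decide how the straddling buffer should be written back. */
static enum write_kind
straddle_write_kind(int flags)
{
    return (flags & NVEXTF_BUWRITE) ? WRITE_BUWRITE : WRITE_BDWRITE;
}
```

Existing callers keep their behavior by passing only the bit that replaces the old boolean, while tmpfs would additionally set the buwrite bit.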
François Tigeot [Thu, 13 Feb 2020 19:04:51 +0000 (20:04 +0100)]
drm/linux: Add framebuffer_alloc() and framebuffer_release()
Sascha Wildner [Thu, 13 Feb 2020 18:01:09 +0000 (19:01 +0100)]
wait.2: Fix markup and comment out some references.
François Tigeot [Thu, 13 Feb 2020 06:58:07 +0000 (07:58 +0100)]
drm/linux: Implement outb()
zrj [Sun, 2 Feb 2020 11:35:20 +0000 (13:35 +0200)]
Makefile.inc1: Pass _SHLIBDIRPREFIX for btools too.
Allow linking with libraries built for the crossworld target if available.
These are not needed for the DragonFly crossworld target, but they help with
bootstrapping on OpenBSD or glibc based systems where static versions
are not available or mismatch lex(1) too much.
zrj [Sun, 2 Feb 2020 11:14:22 +0000 (13:14 +0200)]
grep(1): Disable use of unlocked IO in btools.
Our use of grep during bootstrapping process does not require it and
solves issues when bootstrapping utility on OpenBSD.
zrj [Sun, 2 Feb 2020 11:05:16 +0000 (13:05 +0200)]
find(1): Disable use of getvfsbyname(3) in btools.
Not needed by our buildworld/buildkernel infrastructure.
This allows utility to be used on OpenBSD host.
zrj [Sun, 2 Feb 2020 11:03:09 +0000 (13:03 +0200)]
find(1): Reduce bootstrap locale specific requirements.
The F_TIME2_T functionality is not available on all OSes.
While there, check for D_MD_ORDER availability and fallback to month
first if not available.
zrj [Sun, 2 Feb 2020 10:55:34 +0000 (12:55 +0200)]
find(1): Check for _ST_FLAGS_PRESENT_ availability.
zrj [Sun, 2 Feb 2020 10:53:58 +0000 (12:53 +0200)]
find(1): Check for FTS_CONST just like mtree(8).
For compatibility with OpenBSD.
zrj [Sun, 2 Feb 2020 10:36:09 +0000 (12:36 +0200)]
cp(1): Check for FTS_CONST just like mtree(8).
For compatibility with OpenBSD.
zrj [Sun, 2 Feb 2020 10:33:39 +0000 (12:33 +0200)]
cp(1): Check for _ST_FLAGS_PRESENT_ availability.
zrj [Sun, 2 Feb 2020 10:32:59 +0000 (12:32 +0200)]
cp(1): Install signal handler only if SIGINFO is available.
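The portability pattern behind this and the matching mv(1) and rm(1) commits is a plain preprocessor guard. A minimal sketch, with illustrative handler and variable names that are not taken from the cp(1) source:

```c
#include <signal.h>

static volatile sig_atomic_t info_requested;

#ifdef SIGINFO
static void
siginfo_handler(int sig)
{
    (void)sig;
    info_requested = 1;   /* main loop would print progress and clear this */
}
#endif

/* Returns 1 if a handler was installed, 0 if the host lacks SIGINFO. */
static int
install_info_handler(void)
{
#ifdef SIGINFO
    signal(SIGINFO, siginfo_handler);
    return 1;
#else
    return 0;
#endif
}
```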
zrj [Sun, 2 Feb 2020 10:29:06 +0000 (12:29 +0200)]
mv(1): Check for _ST_FLAGS_PRESENT_ availability.
While there, check for MNAMELEN availability that indicates presence of
f_mntonname field in struct statfs.
zrj [Sun, 2 Feb 2020 10:28:47 +0000 (12:28 +0200)]
mv(1): Install signal handler only if SIGINFO is available.
zrj [Sun, 2 Feb 2020 10:26:14 +0000 (12:26 +0200)]
rm(1): Check for _ST_FLAGS_PRESENT_ availability.
This allows utility to be used on glibc based systems.
zrj [Sun, 2 Feb 2020 10:24:31 +0000 (12:24 +0200)]
rm(1): Check for whiteout support.
The S_IFWHT is not available on all OSes.
zrj [Sun, 2 Feb 2020 10:22:14 +0000 (12:22 +0200)]
rm(1): Install signal handler only if SIGINFO is available.
zrj [Sun, 2 Feb 2020 10:12:38 +0000 (12:12 +0200)]
gzip(1): Check for _ST_FLAGS_PRESENT_ availability.
zrj [Sun, 2 Feb 2020 10:11:05 +0000 (12:11 +0200)]
m4(1): Add const attribute to __progname.
zrj [Sun, 2 Feb 2020 10:07:45 +0000 (12:07 +0200)]
stat(1): Disable -H support in btools.
While there, limit features in btools too (used only in sys/boot/).
zrj [Sun, 2 Feb 2020 09:52:15 +0000 (11:52 +0200)]
install(1): Check for _ST_FLAGS_PRESENT_ availability.
Our <sys/stat.h> provides the definition. Unbreaks build on Linux.
zrj [Sun, 2 Feb 2020 09:50:09 +0000 (11:50 +0200)]
install(1): Use at least 6 characters for template.
Glibc mkstemp(3) requires at least 6 characters in template.
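As an illustration of the constraint, the sketch below calls mkstemp(3) with a template ending in six X characters. The path and function name are made up for the example and are not the ones install(1) uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/*
 * Create and immediately remove a temporary file.
 * The template must end in at least six 'X' characters; glibc rejects
 * shorter runs with EINVAL.
 * Returns 0 on success, -1 on failure.
 */
static int
make_temp_ok(void)
{
    char tmpl[] = "/tmp/INS@XXXXXX";   /* illustrative path */
    int fd = mkstemp(tmpl);

    if (fd < 0)
        return -1;
    close(fd);
    unlink(tmpl);
    return 0;
}
```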
zrj [Sun, 2 Feb 2020 09:44:01 +0000 (11:44 +0200)]
install(1): Disable -N option in btools.
Our installworld has installcheck and preupgrade and we do not make use
of mtree -N option. This removes hard dependency on pwcache(3).
zrj [Sun, 2 Feb 2020 09:38:54 +0000 (11:38 +0200)]
mtree(8): Add missing checks for HAVE_STRUCT_STAT_ST_FLAGS.
This makes mtree(8) usable when building on Linux host.
While there, adjust few included headers too.
zrj [Sun, 2 Feb 2020 09:34:00 +0000 (11:34 +0200)]
mtree(8): Disable -N option in btools.
Our installworld has installcheck and preupgrade and we do not make use
of mtree -N option. This removes hard dependency on pwcache(3).
Matthew Dillon [Thu, 13 Feb 2020 01:12:29 +0000 (17:12 -0800)]
vmstat - Change re, pi, po and fr from counts to bytes
* Change re, pi, po, and fr to bytes. re, pi, and po were previously
paging related transaction counts (not even page counts) which really
tells us absolutely nothing.
The values are now displayed in terms of bytes reclaimed, paged in,
paged out, and freed, with appropriate suffixes (nothing, K, M, or G).
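A rough sketch of suffix scaling of this kind follows. It assumes powers of 1024 and truncating division; the function name and the rounding are illustrative and may not match what vmstat actually prints.

```c
#include <stdio.h>
#include <stdint.h>

/* Format a byte count with no suffix, K, M, or G (powers of 1024 assumed). */
static void
fmt_bytes(uint64_t v, char *buf, size_t len)
{
    static const char suffix[] = { '\0', 'K', 'M', 'G' };
    int i = 0;

    while (v >= 1024 && i < 3) {
        v /= 1024;
        i++;
    }
    if (suffix[i])
        snprintf(buf, len, "%llu%c", (unsigned long long)v, suffix[i]);
    else
        snprintf(buf, len, "%llu", (unsigned long long)v);
}
```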
Matthew Dillon [Thu, 13 Feb 2020 00:41:01 +0000 (16:41 -0800)]
tmpfs - Improve write clustering
* Setup bmap and max iosize parameters so the kernel's clustering
code can actually cluster 16KB tmpfs blocks together into 64KB
blocks.
* In low-memory situations the pageout daemon will flush tmpfs
pages via the VM page queues. This ultimately runs through
the tmpfs_vop_write() UIO_NOCOPY path which was previously using
cluster_awrite(). However, because other nearby buffers are
probably not present (buwrite()'s can allow buffers to be
dismissed early), there is nothing for cluster_awrite() to
latch onto to improve write granularity beyond 16KB.
Go back to using cluster_write() when SYNC and DIRECT are not
specified. This allows the clustering code to collect buffers
and flush them in larger chunks.
* Reduces low-memory tmpfs paging I/O overheads by 4x and
generally increases paging throughput to SSD-based swap by
2x-4x. Tmpfs is now able to issue a lot more 64KB I/Os when under
memory pressure.
Matthew Dillon [Thu, 13 Feb 2020 00:37:56 +0000 (16:37 -0800)]
kernel - Add vm.pageout_allow_active sysctl
* Add vm.pageout_allow_active sysctl and default to 1. The pageout
daemon scans inactive pages for work. This sysctl allows the pageout
daemon to cluster nearby active OR inactive pages with the inactive
page it found.
Default to enabled.
Matthew Dillon [Wed, 12 Feb 2020 20:47:07 +0000 (12:47 -0800)]
tmpfs - Flush and recycle pages quickly during heavy paging activity
* When the pagedaemon is operating any write()s made via tmpfs will
be forced to operate through the buffer cache via cluster_write()
or bdwrite() instead of using buwrite().
This will cause the pages to be pipelined to backing store (swap)
under these conditions, making them clean immediately to avoid
having tmpfs cause further paging pressure on the system when it
is already under paging pressure.
* In addition, the B_TTC flag is set on these buffers to attempt to
recycle the pages directly into PQ_CACHE ASAP after they are flushed.
* Implement cluster_write() operation by default to try to improve
block sizes for physical I/O.
* TMPFS currently must move pages between two VM objects when
reclaiming a vnode, and back again upon re-use. The current
VM mechanism for renaming VM pages dirties them and this can
potentially cause the paging system to thrash on the same page
under heavy vnode recycling loads.
Instead of allowing this to happen, TMPFS now frees any clean
pages that have backing store assigned when moving from the backing
object, and any clean pages that were instantiated from backing
store when moving to the backing object.
Matthew Dillon [Wed, 12 Feb 2020 20:46:06 +0000 (12:46 -0800)]
kernel - Add B_TTC flag (buffer cache try-to-cache flag)
* This allows filesystems to set a flag, B_TTC, that causes the
kernel to attempt to cycle the underlying pages into PQ_CACHE
when the buffer is disposed of.
Matthew Dillon [Wed, 12 Feb 2020 20:42:36 +0000 (12:42 -0800)]
kernel - Improve vm_page_try_to_cache()
* In situations where this function is not able to cache the
page due to the page being dirtied, instead of just returning
at least ensure that it is moved to the inactive queue if it
is currently on the active queue.
Matthew Dillon [Wed, 12 Feb 2020 20:25:32 +0000 (12:25 -0800)]
kernel - Adjust vm.pageout_memuse_mode
* Generally speaking pages in the INACTIVE queue are cycled through
the queue once if they are clean, and twice if they are dirty
(cycle, clean, cycle again).
This could lead to an excessive lack of progress when paging heavily
and a lot of dirty data is present in INACTIVE, such as when a great
deal of just-written tmpfs related data is present.
* Change the default to cycle dirty pages through the queue just
once, same as clean pages.
* Only scan 1/10 of each of the 1024 ACTIVE / INACTIVE queues in
each pass. This is particularly important for the INACTIVE queue
because we want to give pages in this queue a chance to reactivate
and they might not be given that chance when the full queue is
scanned.
This became an issue when we greatly increased the number of queues
in previous SMP partitioning work (the CPU topology is mapped onto
the queue space for initial allocation attempts in order to reduce
contention).
Matthew Dillon [Wed, 12 Feb 2020 20:20:47 +0000 (12:20 -0800)]
kernel - Require pages to be PQ_ACTIVE for quick vm_page soft refs
* Require that a VM page be ACTIVE when allowing a quick soft ref
on it for a vm_fault(). If the page is not ACTIVE, the fault will
access it normally and activate it.
Previously the page could be ACTIVE or INACTIVE.
* This ensures that pages often-referenced by vm_fault do not accidentally
flow through states and get freed prematurely, causing light paging
to unnecessarily stall active processes.
Matthew Dillon [Wed, 12 Feb 2020 20:16:36 +0000 (12:16 -0800)]
dsynth - Add one more column for 'Lines'
* Increase Lines column width from 6 digits to 7.
Matthew Dillon [Wed, 12 Feb 2020 02:43:25 +0000 (18:43 -0800)]
vmstat - Adjust format slightly
* Adjust the format for the "r b w" columns slightly to keep things
aligned on modern systems.
Matthew Dillon [Tue, 11 Feb 2020 19:20:56 +0000 (11:20 -0800)]
dsynth - Adjust load calculations
* Change the load cap from 5.0 x ncpus to 4.0 x ncpus.
* Add vmtotal.t_pw to the 1-minute load. This adds in any processes
waiting on a page-fault in an attempt to not improperly increase the
job cap due to low swap-thrashing-caused load averages.
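In code form, the adjusted load test might look like the sketch below. The function names are hypothetical, and the 4.0 multiplier is the value from this particular commit.

```c
/*
 * Effective load: the 1-minute load average plus the number of
 * processes waiting on page faults (vmtotal.t_pw).
 */
static double
adjusted_load(double loadavg_1min, int t_pw)
{
    return loadavg_1min + (double)t_pw;
}

/* Load cap from this commit: 4.0 x ncpus. */
static int
over_load_cap(double load, int ncpus)
{
    return load > 4.0 * ncpus;
}
```

On a 4-cpu machine with a load average of 10.0, eight processes stuck in page faults push the effective load to 18.0, past the cap of 16.0, even though the raw load average looks low.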
Sascha Wildner [Tue, 11 Feb 2020 15:20:02 +0000 (16:20 +0100)]
flock.2: Document the correct header (<fcntl.h>) for flock().
While here, remove some pointless quotations from the header.
Matthew Dillon [Tue, 11 Feb 2020 06:16:57 +0000 (22:16 -0800)]
dsynth - Improve auto SlowStart a bit
* When MaxWorkers is >= 16, set the SlowStart at MaxWorkers / 4 instead
of at 1 so we do not have to wait forever for it to inch up to a
stable value.
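The rule above reduces to a one-line function. The function name is illustrative and not taken from the dsynth source.

```c
/* Initial SlowStart value: MaxWorkers / 4 for large pools, otherwise 1. */
static int
initial_slow_start(int max_workers)
{
    return (max_workers >= 16) ? max_workers / 4 : 1;
}
```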
Matthew Dillon [Tue, 11 Feb 2020 06:13:13 +0000 (22:13 -0800)]
kernel - Improve pageout daemon pipelining.
* Improve the pageout daemon's ability to pipeline writes to the
swap pager. This deals with a number of low-memory situations
where the pageout daemon was stopping too early (at the minimum
free page mark).
* We don't want the pageout daemon to enforce the paging targets
after a successful pass (as this makes it impossible to actually
use the memory in question), but we DO want it to continue pipelining
if the page stats are still below the hysteresis point governed by
vm_paging_needed().