kernel - Add per-process capability-based restrictions * This new system allows userland to set capability restrictions which turn off numerous kernel features and root accesses. These restrictions are inherited by sub-processes recursively. Once set, restrictions cannot be removed. Basic restrictions that mimic an unadorned jail can be enabled without creating a jail, but generally speaking real security also requires creating a chrooted filesystem topology, and a jail is still needed to really segregate processes from each other. If you do so, however, you can (for example) disable mount/umount and most global root-only features. * Add new system calls and a manual page for syscap_get(2) and syscap_set(2) * Add sys/caps.h * Add the "setcaps" userland utility and manual page. * Remove priv.9 and the priv_check infrastructure, replacing it with a newly designed caps infrastructure. * The intention is to add path restriction lists and similar features to improve jailless security in the near future, and to optimize the priv_check code.
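The three properties described above (restrictions turn features off, are inherited by sub-processes, and cannot be removed once set) can be illustrated with a small userland model. This is only a sketch: it does not call syscap_get(2) or syscap_set(2), and every name in it (`sketch_proc`, `SKETCH_RESTRICT_*`, `sketch_restrict()`, `sketch_fork()`, `sketch_allowed()`) is hypothetical, not taken from sys/caps.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical restriction bits; NOT the real sys/caps.h constants. */
#define SKETCH_RESTRICT_MOUNT	(1u << 0)
#define SKETCH_RESTRICT_ROOT	(1u << 1)

/* Hypothetical per-process state: a set bit means "feature turned off". */
struct sketch_proc {
	uint32_t restricted;
};

/*
 * Add restrictions. Bits are only ever OR'ed in, so nothing in this
 * interface can clear a restriction once it has been set.
 * Returns the resulting mask.
 */
uint32_t
sketch_restrict(struct sketch_proc *p, uint32_t bits)
{
	assert(p != NULL);
	p->restricted |= bits;
	return (p->restricted);
}

/* A child starts with a copy of its parent's restrictions. */
struct sketch_proc
sketch_fork(const struct sketch_proc *parent)
{
	struct sketch_proc child = { parent->restricted };

	return (child);
}

/* Returns non-zero if the feature guarded by 'bit' is still available. */
int
sketch_allowed(const struct sketch_proc *p, uint32_t bit)
{
	return ((p->restricted & bit) == 0);
}
```

In this model a parent that restricts mounting and then forks produces a child for which `sketch_allowed(&child, SKETCH_RESTRICT_MOUNT)` is false, and the child can only add further bits.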
kernel: Clean up <sys/uio.h> issues. The iovec_free() inline greatly complicates inclusion of this header. The NULL check is not always seen from <sys/_null.h>. Luckily only three kernel sources need it: kern_subr.c, sys_generic.c and uipc_syscalls.c. Also, just a single dev/drm source makes use of 'struct uio'. * Include <sys/uio.h> explicitly first in drm_fops.c to avoid the kfree() macro override in the drm compat layer. * Use <sys/_uio.h> where only the enums and struct uio are needed, but ensure that userland will not include it for possible later <sys/user.h> use. * Stop using <sys/vnode.h> as a shortcut for uiomove*() prototypes. The uiomove*() family of functions possibly transfers data across the kernel/user space boundary. The presence of this header explicitly marks sources as such. * Prefer to add <sys/uio.h> after <sys/systm.h>, but before <sys/proc.h> and definitely before <sys/malloc.h> (except for the 3 mentioned sources). This will allow removing <sys/malloc.h> from <sys/uio.h> later on. * Adjust <sys/user.h> to use component headers instead of <sys/uio.h>. While there, use the opportunity for a minimal whitespace cleanup. No functional differences observed in compiler intermediates.
kernel: Remove numerous #include <sys/thread2.h>. Most of them were added when we converted spl*() calls to crit_enter()/crit_exit(), almost 14 years ago. We can now remove a good chunk of them again where crit_*() is no longer used. I had to adjust some files that were relying on thread2.h, or on headers that it includes, coming in via other headers that it was removed from.
kernel - Refactor tty clist code * Remove all the old cruft and completely rewrite the clist code to use a single linear buffer and a FIFO mechanism. * The linear buffer just uses 16-bit elements in order to record TTY_QUOTE along with the character. * Fixes bug in last commit (lack of global locks around global clist caches) by removing the cache entirely.
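The idea of a single linear buffer of 16-bit elements, with a quote flag stored next to each character, can be sketched in userland as follows. This is not the kernel's clist code: the names (`sketch_fifo`, `sketch_put()`, `sketch_get()`), the flag value `SKETCH_QUOTE`, the tiny capacity, and the use of wrapping indices are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit stored above the 8-bit character. */
#define SKETCH_QUOTE	0x0100u
/* Tiny capacity, chosen only to keep the example small. */
#define SKETCH_SIZE	8

struct sketch_fifo {
	uint16_t buf[SKETCH_SIZE];	/* one linear buffer, 16 bits per element */
	size_t	 rindex;		/* index of the oldest element */
	size_t	 count;			/* number of queued elements */
};

/* Append a character, optionally marked as quoted. Returns -1 if full. */
int
sketch_put(struct sketch_fifo *q, unsigned char c, int quoted)
{
	assert(q != NULL);
	if (q->count == SKETCH_SIZE)
		return (-1);
	q->buf[(q->rindex + q->count) % SKETCH_SIZE] =
	    (uint16_t)(c | (quoted ? SKETCH_QUOTE : 0u));
	q->count++;
	return (0);
}

/*
 * Remove the oldest element. Returns the character in the low 8 bits
 * with SKETCH_QUOTE possibly set, or -1 if the FIFO is empty.
 */
int
sketch_get(struct sketch_fifo *q)
{
	int v;

	assert(q != NULL);
	if (q->count == 0)
		return (-1);
	v = q->buf[q->rindex];
	q->rindex = (q->rindex + 1) % SKETCH_SIZE;
	q->count--;
	return (v);
}
```

Because the flag travels in the same element as the character, no side table is needed to remember which characters were quoted.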
kernel - Refactor tty_token, fix SMP performance issues * Remove most uses of tty_token in favor of per-tty tp->t_token. This is particularly important for removing bottlenecks related to PTYs, which are used all over the place. tty_token remains in a few places managing overall registration and global list manipulation. * tty structures are now required to be persistent. Implement a separate ttyinit() function. Continue to allow ttyregister() and ttyunregister() calls, but these no longer presume destruction of the structure. * Refactor ttymalloc() to take a **tty pointer and interlock allocations. Allocations are intended to be one-time. ttymalloc() only requires the tty_token for initial allocations. * Remove all critical section use that was combined with tty_token and tp->t_token. Leave only the tokens. The critical sections were hold-overs going all the way back to pre-SMP days. * syscons now gets its own token, vga_token. The ISA VGA code and the framebuffer code also now use this token instead of tty_token. * The keyboard subsystem now uses kbd_token instead of tty_token. * A few remaining serial-like devices (snp, nmdm) also get their own tokens, as well as use the now required tp->t_token. * Remove use of tty_token in the session management code. This fixes a niggling performance path since sessions almost universally go hand-in-hand with fork/exec/exit sequences. Instead we use the already-existing per-hash session token.
kernel: Remove the COMPAT_43 kernel option along with all related code. It has been commented out in our default kernel config files for almost five years now, since 9466f37df5258f3bc3d99ae43627a71c1c085e7d. Approved-by: dillon Dragonfly-bug: <https://bugs.dragonflybsd.org/issues/2946>
devfs(9): Rename DEVFS_DECLARE_CLONE_BITMAP to DEVFS_DEFINE_CLONE_BITMAP. Also, add DEVFS_DECLARE_CLONE_BITMAP() for extern declarations, analogous to MALLOC_DEFINE() and MALLOC_DECLARE(). In the sound code, replace some externs with DEVFS_DECLARE_CLONE_BITMAP() and remove one unneeded extern.
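The DEFINE/DECLARE split mentioned above follows a common C pattern: one macro creates the storage and is used in exactly one source file, while the other only emits an extern declaration for headers and other source files. The sketch below shows that pattern with hypothetical names (`SKETCH_DEFINE_BITMAP`, `SKETCH_DECLARE_BITMAP`, `struct sketch_bitmap`, `sketch_mark()`); it is not the actual devfs macro expansion.

```c
#include <assert.h>

/* Hypothetical stand-in for a clone bitmap. */
struct sketch_bitmap {
	unsigned long bits;
};

/* DEFINE creates the storage; use it in exactly one .c file. */
#define SKETCH_DEFINE_BITMAP(name) \
	struct sketch_bitmap sketch_bitmap_##name = { 0 }

/* DECLARE only announces the object; suitable for headers. */
#define SKETCH_DECLARE_BITMAP(name) \
	extern struct sketch_bitmap sketch_bitmap_##name

/* What a shared header would contain. */
SKETCH_DECLARE_BITMAP(dsp);

/* What the single owning source file would contain. */
SKETCH_DEFINE_BITMAP(dsp);

/* Mark a unit as in use and return the updated bit mask. */
unsigned long
sketch_mark(int unit)
{
	assert(unit >= 0 && unit < 32);
	sketch_bitmap_dsp.bits |= 1ul << unit;
	return (sketch_bitmap_dsp.bits);
}
```

With this split, files that only need to reference the bitmap use the DECLARE form instead of writing the extern by hand.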
kernel: Move semicolon from the definition of SYSINIT() to its invocations. This affected around 70 of our (more or less) 270 SYSINIT() calls. style(9) also advocates that the terminating semicolon be supplied by the invocation, because it can make life easier for editors and other source code parsing programs.
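A minimal illustration of the convention, using a made-up registration macro rather than the real SYSINIT(): the macro body ends without a semicolon, so each invocation reads like an ordinary declaration terminated by the caller. All names here (`SKETCH_INIT`, `sketch_init_fn`, `sketch_hello()`, `sketch_run()`) are hypothetical.

```c
#include <assert.h>

typedef void (*sketch_init_fn)(void);

/*
 * Hypothetical registration macro. Note that the definition does NOT
 * end in a semicolon; the invocation supplies it.
 */
#define SKETCH_INIT(name, fn) \
	static const sketch_init_fn sketch_init_##name = (fn)

static int sketch_ran;

static void
sketch_hello(void)
{
	sketch_ran++;
}

/* The invocation looks like a normal statement to editors and parsers. */
SKETCH_INIT(hello, sketch_hello);

/* Call the registered function and report how often it has run. */
int
sketch_run(void)
{
	assert(sketch_init_hello != 0);
	sketch_init_hello();
	return (sketch_ran);
}
```

Had the macro definition ended in its own semicolon, an invocation written as `SKETCH_INIT(hello, sketch_hello);` would leave a stray empty declaration at file scope.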