freebsd.git
3 years agolibthr: work around an ASAN false-positive
Alex Richardson [Mon, 2 Aug 2021 08:49:21 +0000 (09:49 +0100)]
libthr: work around an ASAN false-positive

I got the following error with an ASAN-instrumented libthr:

==803==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fffffffcdb0 at pc 0x000801863396 bp 0x7ff8
READ of size 4 at 0x7fffffffcdb0 thread T0
    #0 0x801863395 in handle_signal /local/scratch/alr48/cheri/freebsd/lib/libthr/thread/thr_sig.c:262:2
    #1 0x801860da2 in thr_sighandler /local/scratch/alr48/cheri/freebsd/lib/libthr/thread/thr_sig.c:246:2

Address 0x7fffffffcdb0 is located in stack of thread T0 at offset 208 in frame
    #0 0x80186080f in thr_sighandler /local/scratch/alr48/cheri/freebsd/lib/libthr/thread/thr_sig.c:213

  This frame has 1 object(s):
    [32, 64) 'act' (line 216) <== Memory access at offset 208 overflows this variable
HINT: this may be a false positive if your program uses some custom stack

This seems like a false-positive since the line in question is
`SIGSETOR(actp->sa_mask, ucp->uc_sigmask);` and it complains about a read
operation (from the ucontext_t argument) so this indicates to me that ASAN
does not understand that thr_sighandler() is a signal handler.

Differential Revision: https://reviews.freebsd.org/D31074

3 years agoAdd build system support for ASAN+UBSAN instrumentation
Alex Richardson [Mon, 2 Aug 2021 08:48:21 +0000 (09:48 +0100)]
Add build system support for ASAN+UBSAN instrumentation

This adds two new options WITH_ASAN/WITH_UBSAN that can be set to
enable instrumentation of all binaries with AddressSanitizer and/or
UndefinedBehaviourSanitizer. This current patch is almost sufficient
to get a complete buildworld with sanitizer instrumentation but in
order to actually build and boot a system it depends on a few more
follow-up commits.

Reviewed By: brooks, kib, markj
Differential Revision: https://reviews.freebsd.org/D31043

3 years agotools/build: Don't redefine open() for the linux bootstrap
Alex Richardson [Mon, 2 Aug 2021 08:45:05 +0000 (09:45 +0100)]
tools/build: Don't redefine open() for the linux bootstrap

This is needed to bootstrap llvm-tblgen on Linux since LLVM calls
`::open(...)` which does not work if open is a statement macro.
Also stop defining O_SHLOCK/O_EXLOCK and update the only bootstrap tools
user of those flags to deal with missing definitions.

Reviewed By: jrtc27
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31226

3 years agoloader: tftp client should use server address from rootip
Toomas Soome [Mon, 2 Aug 2021 12:27:38 +0000 (15:27 +0300)]
loader: tftp client should use server address from rootip

servip is set from bootp bp_siaddr (if present) and rootip is
set immediately from servip in the same bootp code.

However, common/dev_net.c only sets rootip (based on
URL processing etc.). Therefore, we should also use rootip in the tftp
reader.

Fixes hung tftp based boot when bp_siaddr is not provided.

MFC after: 1 week

3 years agoIgnore ResourceProducer flag for:
Aleksandr Rybalko [Mon, 2 Aug 2021 10:41:14 +0000 (13:41 +0300)]
Ignore ResourceProducer flag for:
o Arm CoreLink TM CMN-600 Coherent Mesh Network controller,
o Arm CoreLink DMC-620 Dynamic Memory Controller.

Sponsored by: Ampere Computing LLC
Submitted by: Klara Inc.

3 years agovmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1
Ka Ho Ng [Mon, 2 Aug 2021 09:54:40 +0000 (17:54 +0800)]
vmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1

In hw.vmm.create sysctl handler the maximum length of vm name is
VM_MAX_NAMELEN. However in vm_create() the maximum length allowed is
only VM_MAX_NAMELEN - 1 chars. Bump the length of the internal buffer to
allow the length of VM_MAX_NAMELEN for vm name.
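
A minimal sketch of the off-by-one described above, not the vmm code itself.
The struct, the helper name and the value 32 are assumptions made for
illustration; the point is only that a name of VM_MAX_NAMELEN characters
needs VM_MAX_NAMELEN + 1 bytes once the terminating NUL is counted.

```c
#include <assert.h>
#include <string.h>

#define VM_MAX_NAMELEN 32	/* assumed value, for illustration only */

/* Hypothetical stand-in for the name field of struct vm. */
struct vm_sketch {
	char name[VM_MAX_NAMELEN + 1];	/* +1 leaves room for the NUL */
};

/* Returns 0 on success, -1 if the name is longer than VM_MAX_NAMELEN. */
static int
vm_sketch_set_name(struct vm_sketch *vm, const char *name)
{
	size_t len;

	assert(vm != NULL && name != NULL);
	len = strlen(name);
	if (len > VM_MAX_NAMELEN)
		return (-1);
	/* len + 1 <= sizeof(vm->name), so the copy cannot overflow. */
	memcpy(vm->name, name, len + 1);
	return (0);
}
```

With a buffer of only VM_MAX_NAMELEN bytes the same length check would have
to reject names of exactly VM_MAX_NAMELEN characters.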

MFC after: 3 days
Reviewed by: grehan
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31372

3 years agoxen/timer: fix amd64 LINT kernel build
Roger Pau Monné [Mon, 2 Aug 2021 08:22:22 +0000 (10:22 +0200)]
xen/timer: fix amd64 LINT kernel build

On amd64 XENHVM depends on the xentimer device for PVH early startup,
so both should be added or removed together (like the current
dependency with xenpci). Fix this by adding xentimer to NOTES and
updating the comments on the config files. Note that on i386 there's
no such dependency between xentimer and XENHVM, since there's no PVH
support.

While there also fix the MINIMAL i386 build to include the xentimer,
so it keeps the same functionality as before xentimer was split from
XENHVM.

Reported by: lwhsu
PR: 257549
Fixes: ae5981274815 ('xen/timer: make xen timer optional')

3 years agoAdd missing file to sys/conf/files after 469884cf04a9b92677c7c83e229ca6b8814f8b0a .
Hans Petter Selasky [Mon, 2 Aug 2021 06:24:22 +0000 (08:24 +0200)]
Add missing file to sys/conf/files after 469884cf04a9b92677c7c83e229ca6b8814f8b0a .

Found by: vishwin@
Differential Revision: https://reviews.freebsd.org/D29921
MFC after: 1 week
Sponsored by: NVIDIA Networking

3 years agosched_ule(4): Use trylock when stealing load.
Alexander Motin [Mon, 2 Aug 2021 02:42:01 +0000 (22:42 -0400)]
sched_ule(4): Use trylock when stealing load.

On some load patterns it is possible for several CPUs to try to steal
a thread from the same CPU despite the randomization introduced.  This
may cause significant lock contention when an idle thread, holding one
queue lock, tries to acquire another one.  Using trylock on the remote
queue both reduces the contention and makes lock ordering easier to
handle.  If we can't get the lock inside tdq_trysteal() we just return,
allowing tdq_idled() to handle it.  If it happens in tdq_idled(), then
we repeat the search for load, skipping this CPU.
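
The trylock pattern can be sketched in userland as follows.  This is an
illustration under stated assumptions, not the sched_ule code: the queue
structure, the atomic_flag lock and all function names are hypothetical
stand-ins for the per-CPU queue and its lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU run queue; not the kernel's struct tdq. */
struct queue_sketch {
	atomic_flag	locked;	/* set while some CPU owns the queue */
	int		load;	/* number of runnable threads */
};

/* Non-blocking acquire: returns true only if the lock was free. */
static bool
queue_trylock(struct queue_sketch *q)
{
	return (!atomic_flag_test_and_set_explicit(&q->locked,
	    memory_order_acquire));
}

static void
queue_unlock(struct queue_sketch *q)
{
	atomic_flag_clear_explicit(&q->locked, memory_order_release);
}

/*
 * Move one unit of load from victim to self.  The caller owns self.
 * Returns false without waiting if victim is contended or has no load.
 */
static bool
try_steal(struct queue_sketch *self, struct queue_sketch *victim)
{
	bool stolen = false;

	assert(self != victim);
	if (!queue_trylock(victim))
		return (false);	/* contended: caller searches elsewhere */
	if (victim->load > 0) {
		victim->load--;
		self->load++;
		stolen = true;
	}
	queue_unlock(victim);
	return (stolen);
}
```

Because the second lock is never waited for, the stealing side cannot
deadlock against another CPU acquiring the two locks in the opposite order.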

On 2-socket 80-thread Xeon system I am observing dramatic reduction
of the lock spinning time when doing random uncached 4KB reads from
12 ZVOLs, while IOPS increase from 327K to 403K.

MFC after: 1 month

3 years agosched_ule(4): Reduce duplicate search for load.
Alexander Motin [Mon, 2 Aug 2021 02:07:51 +0000 (22:07 -0400)]
sched_ule(4): Reduce duplicate search for load.

When sched_highest() called for some CPU group returns nothing, idle
thread calls it for the parent CPU group.  But the parent CPU group
also includes the CPU group we've just searched, and unless there is
a race going on, it is unlikely we find anything new this time.

Avoid the double search when the parent group has only two subgroups
(the most prominent case).  Instead of escalating to the parent group,
run the next search over the sibling subgroup and escalate two levels
up if that fails too.  With more than two siblings the difference is
less significant, while searching the parent group can result in a
better decision if we find several candidate CPUs.

On 2-socket 40-core Xeon system I am measuring ~25% reduction of CPU
time spent inside cpu_search_highest() in both SMT (2x20x2) and non-
SMT (2x20) cases.

MFC after: 1 month

3 years agoamd64 pmap_vm_page_alloc_check(): loosen the assert
Konstantin Belousov [Sun, 1 Aug 2021 21:58:21 +0000 (00:58 +0300)]
amd64 pmap_vm_page_alloc_check(): loosen the assert

The current expression checks that vm_page_alloc(9) never returns a page
belonging to the preload area.  This is not true if something was freed
from there, for instance when a preloaded module was unloaded or a ucode
update was freed.

Only check that we never allocate a page belonging to the kernel
proper, by checking against _end.

Reported and tested by: dhw
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

3 years agobhyve: net_backends, automatically IFF_UP tap devices
Bjoern A. Zeeb [Wed, 28 Jul 2021 22:53:25 +0000 (22:53 +0000)]
bhyve: net_backends, automatically IFF_UP tap devices

If you want communication with the outside world and tell bhyve to
create an interface, then it should be usable as well.
Rather than relying on the sysctl net.link.tap.up_on_open, automatically
try to IFF_UP the opened tap device.

MFC after: 10 days
Reviewed by: markj, grehan
Differential Revision: https://reviews.freebsd.org/D31342

3 years agoawk: document updating
Warner Losh [Sun, 1 Aug 2021 17:31:50 +0000 (11:31 -0600)]
awk: document updating

Fill in all the details of the standard process so they are handy in one
place and don't need to be re-remembered or rediscovered for the next
import.

Sponsored by: Netflix

3 years agoRELNOTES: update the running entry on awk.
Warner Losh [Sun, 1 Aug 2021 17:07:29 +0000 (11:07 -0600)]
RELNOTES: update the running entry on awk.

Dig up the major commits and document the coming -Ft change.

Sponsored by: Netflix

3 years agoawk: Merge 20210729 from One True Awk upstream (0592de4a)
Warner Losh [Sun, 1 Aug 2021 16:22:39 +0000 (10:22 -0600)]
awk: Merge 20210729 from One True Awk upstream (0592de4a)

July 27, 2021:
As per IEEE Std 1003.1-2008, -F "str" is now consistent with
-v FS="str" when str is null. Thanks to Warner Losh.

July 24, 2021:
Fix readrec's definition of a record. This fixes an issue
with NetBSD's RS regular expression support that can cause
an infinite read loop. Thanks to Miguel Pineiro Jr.

Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
know if it is reading the first record of a file. This change
restores a missing line that was overlooked when porting NetBSD's
RS regex functionality. Thanks to Miguel Pineiro Jr.

Fix size computation in replace_repeat() for special case
REPEAT_WITH_Q. Thanks to Todd C. Miller.

Also, included the tests from upstream, though they aren't yet connected
to the tree.

Sponsored by: Netflix

3 years agoawk: bring in vendor branch from upstream 20210727
Warner Losh [Sun, 1 Aug 2021 16:02:22 +0000 (10:02 -0600)]
awk: bring in vendor branch from upstream 20210727

Changes since the last import:

July 27, 2021:
As per IEEE Std 1003.1-2008, -F "str" is now consistent with
-v FS="str" when str is null. Thanks to Warner Losh.

July 24, 2021:
Fix readrec's definition of a record. This fixes an issue
with NetBSD's RS regular expression support that can cause
an infinite read loop. Thanks to Miguel Pineiro Jr.

Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
know if it is reading the first record of a file. This change
restores a missing line that was overlooked when porting NetBSD's
RS regex functionality. Thanks to Miguel Pineiro Jr.

Fix size computation in replace_repeat() for special case
REPEAT_WITH_Q. Thanks to Todd C. Miller.

Also, for the first time, import all the tests.

Sponsored by: Netflix

3 years agoudp: Fix soroverflow SOCKBUF unlocking
Konstantin Kukushkin [Sun, 1 Aug 2021 14:41:38 +0000 (07:41 -0700)]
udp: Fix soroverflow SOCKBUF unlocking

We hold the SOCKBUF_LOCK so use soroverflow_locked here.
This bug may manifest as a non-killable process stuck in [*so_rcv].

Approved by: scottl
Reviewed by: Roy Marples <roy@marples.name>
Fixes: 7045b1603bdf
MFC after:  10 days
Differential Revision: https://reviews.freebsd.org/D31374

3 years agoamd64 pmap_vm_page_alloc_check(): print more data for failed assert
Konstantin Belousov [Sun, 1 Aug 2021 13:38:17 +0000 (16:38 +0300)]
amd64 pmap_vm_page_alloc_check(): print more data for failed assert

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

3 years agoFix typo in rib_unsibscribe<_locked>().
Alexander V. Chernikov [Sun, 1 Aug 2021 13:28:41 +0000 (13:28 +0000)]
Fix typo in rib_unsibscribe<_locked>().

Submitted by: Zhenlei Huang<zlei.huang at gmail.com>
Differential Revision: https://reviews.freebsd.org/D31356

3 years agoadd the time(1) command to the list of install tools
Wolfram Schneider [Sun, 1 Aug 2021 13:25:00 +0000 (13:25 +0000)]
add the time(1) command to the list of install tools

Reported by: dhw
Approved by: dhw
Differential Revision: https://reviews.freebsd.org/D31373

3 years agoCorrect section reference for examples in RFC3542
Tom Jones [Sun, 1 Aug 2021 12:52:07 +0000 (13:52 +0100)]
Correct section reference for examples in RFC3542

Reviewed by: bz, network
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26272

3 years ago[multipath][nhops] Fix random crashes with high route churn rate.
Alexander V. Chernikov [Sun, 1 Aug 2021 09:46:05 +0000 (09:46 +0000)]
[multipath][nhops] Fix random crashes with high route churn rate.

When a certain multipath route begins flapping really fast, it may
result in creating multiple identical nexthop groups. The code
responsible for unlinking unused nexthop groups had an implicit
assumption that there could be only one nexthop group for the
same combination of nexthops with weights. This assumption resulted
in always unlinking the first "identical" group, instead of the
desired one. Such action, in turn, produced a used-but-unlinked
nhg along with a freed-and-linked nhg, ending up in random crashes.

Similarly, it is possible that multiple identical nexthops get
created in the case of high route churn, resulting in the same
problem when deleting one of such nexthops.

Fix by matching the nexthop/nexthop group pointer when deleting the item.
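
The difference between matching by contents and matching by pointer can be
sketched with a plain singly linked list.  The node type and function name
below are hypothetical and much simpler than the real nexthop group
structures; the sketch only shows that two entries with identical contents
stay distinguishable by address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list node; `key` stands in for the nexthops and weights. */
struct nhg_sketch {
	int			key;
	struct nhg_sketch	*next;
};

/* Unlink exactly `target`, even if other nodes carry the same key. */
static bool
unlink_by_pointer(struct nhg_sketch **head, struct nhg_sketch *target)
{
	struct nhg_sketch **pp;

	assert(head != NULL && target != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == target) {	/* compare addresses, not keys */
			*pp = target->next;
			target->next = NULL;
			return (true);
		}
	}
	return (false);
}
```

A lookup that compared only `key` would stop at the first duplicate and
could unlink a node that is still in use.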

Reported by: avg
MFC after: 1 week

3 years agocam: enable kern.cam.da.enable_uma_ccbs by default
Edward Tomasz Napierala [Sun, 1 Aug 2021 09:40:38 +0000 (09:40 +0000)]
cam: enable kern.cam.da.enable_uma_ccbs by default

This makes the da(4) driver use UMA for its CCBs by default,
like ada(4) already does.  Please let me know via email
if you notice any suspicious kernel messages.

Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D31257

3 years ago`make buildworld' with time logging for each stage
Wolfram Schneider [Sun, 1 Aug 2021 09:10:49 +0000 (09:10 +0000)]
`make buildworld' with time logging for each stage

PR:   257141
Reviewed by: sjg,emaste
Approved by: emaste
Differential Revision: https://reviews.freebsd.org/D31154

3 years agoloader: cstyle cleanup of libsa/lseek.c
Toomas Soome [Sun, 1 Aug 2021 07:07:32 +0000 (10:07 +0300)]
loader: cstyle cleanup of libsa/lseek.c

Clean up lseek.c, no functional changes intended. This is pre-patch
for open file list rewrite.

MFC after: 1 week

3 years agoloader.conf(5): mention "efi" option for "console" parameter
Li-Wen Hsu [Sat, 31 Jul 2021 22:41:49 +0000 (06:41 +0800)]
loader.conf(5): mention "efi" option for "console" parameter

PR: 213467
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D31368

3 years agoawk: use awkgram.tab.h consistently
Warner Losh [Sat, 31 Jul 2021 22:17:44 +0000 (16:17 -0600)]
awk: use awkgram.tab.h consistently

yacc makes awkgram.h. However, one true awk includes awkgram.tab.h, so
we link to it for the builds. Make sure that we consistently link to it.
Also, restore the awkgram.tab.h dependency to maketab. It should not
have been deleted, despite apparently making meta build on stable/12
work. The important missing arc was proctab.c's dependence on
awkgram.tab.h.

MFC After: 1 day (build breakage)
Fixes: c50c8502cb629571f35089690d6e9a9bc4d60813
Sponsored by: Netflix

3 years agoLinuxKPI: fix bug in le32p_replace_bits()
Bjoern A. Zeeb [Thu, 29 Jul 2021 21:27:21 +0000 (21:27 +0000)]
LinuxKPI: fix bug in le32p_replace_bits()

Fix a bug that slipped in in 90707c4e44de03ea36be183ef2226601c66169cb
using the correct field in le32p_replace_bits().
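
For readers unfamiliar with this family of helpers, the following sketch
shows what replacing the bits of a masked field means.  It works on
host-endian values only and is not the LinuxKPI implementation; the function
name is hypothetical and the commit does not spell out the exact bug beyond
the wrong field being used.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return `old` with the bits selected by `field` replaced by `val`.
 * `field` is a contiguous, non-zero mask such as 0x0000ff00.
 */
static uint32_t
replace_bits_sketch(uint32_t old, uint32_t val, uint32_t field)
{
	uint32_t lsb;

	assert(field != 0);
	/* Lowest set bit of the mask; multiplying by it shifts val in place. */
	lsb = field & (~field + 1);
	return ((old & ~field) | ((val * lsb) & field));
}
```

Bits outside `field` are preserved, which is the property that breaks when
the wrong mask is applied.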

MFC after: 3 days
Reviewed by: hselasky
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31352

3 years agoawk: Fix dependencies
Warner Losh [Sat, 31 Jul 2021 21:41:29 +0000 (15:41 -0600)]
awk: Fix dependencies

proctab.c is generated from awktab.h, so needs to depend on it.
maketab does not depend on awktab.h, and gets the maketab.c dependency
automatically, so remove them both.

Normally, these don't matter. However, for a meta build, they can cause
us to build maketab twice (once host, once for target) resulting in a
binary that can't run on the host due to proctab.c racing maketab in
parallel legs. In stable/12, this was a reliably lost race, while in
main I've been unable to trigger the race at all (maybe due to dirdep
changes making main more robust).

MFC After: 1 day (build breakage)
Reported by: kp
Sponsored by: Netflix

3 years agoigb: clean up igb_txrx comments
Kevin Bowling [Sat, 31 Jul 2021 15:04:25 +0000 (08:04 -0700)]
igb: clean up igb_txrx comments

Reviewed by: grehan
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D31227

3 years agoigc: sync igc_txrx with igb(4)
Kevin Bowling [Sat, 31 Jul 2021 15:00:16 +0000 (08:00 -0700)]
igc: sync igc_txrx with igb(4)

Reviewed by: grehan
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31227

3 years agoAdd pmap_vm_page_alloc_check()
Konstantin Belousov [Sat, 10 Jul 2021 19:53:41 +0000 (22:53 +0300)]
Add pmap_vm_page_alloc_check()

which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121

3 years agoamd64: do not assume that kernel is loaded at 2M physical
Konstantin Belousov [Sat, 10 Jul 2021 19:48:02 +0000 (22:48 +0300)]
amd64: do not assume that kernel is loaded at 2M physical

Allow any 2M aligned contiguous location below 4G for the staging
area location.  It should still be mapped by loader at KERNBASE.

The assumptions the kernel makes about the loader->kernel handoff with
regard to MMU programming are explicitly listed at the beginning of
hammer_time(), where kernphys is calculated.  Now kernphys is a variable
instead of a symbol designating the physical address.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121

3 years agoBump the FreeBSD version after making FPU sections thread-safe in the LinuxKPI.
Hans Petter Selasky [Sat, 31 Jul 2021 13:40:35 +0000 (15:40 +0200)]
Bump the FreeBSD version after making FPU sections thread-safe in the LinuxKPI.

Differential Revision: https://reviews.freebsd.org/D29921
MFC after: 1 week
Sponsored by: NVIDIA Networking

3 years agoLinuxKPI: Make FPU sections thread-safe and use the NOCTX flag.
Hans Petter Selasky [Sat, 31 Jul 2021 13:32:52 +0000 (15:32 +0200)]
LinuxKPI: Make FPU sections thread-safe and use the NOCTX flag.

Reviewed by: kib
Submitted by: greg@unrelenting.technology
Differential Revision: https://reviews.freebsd.org/D29921
MFC after: 1 week
Sponsored by: NVIDIA Networking

3 years agoawk: Document deprecated behavior of hex constants and locales.
Warner Losh [Sat, 31 Jul 2021 05:31:00 +0000 (23:31 -0600)]
awk: Document deprecated behavior of hex constants and locales.

FreeBSD will convert "0x12" from hex and print it as 18. Other awks will
convert it to 0. This extension has been removed upstream, and will be
removed in FreeBSD 14.0.
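
The two results can be reproduced in C, as an illustration only.  The sketch
does not reproduce awk's parser; it contrasts C99 strtod(), which accepts
hexadecimal input, with a hypothetical decimal-only parse that stops at the
first non-digit.

```c
#include <assert.h>
#include <stdlib.h>

/* C99 strtod() understands a leading "0x" and parses hexadecimal. */
static double
parse_with_strtod(const char *s)
{
	return (strtod(s, NULL));
}

/* Decimal-only parse of leading digits; for "0x12" it stops at 'x'. */
static double
parse_decimal_only(const char *s)
{
	double v = 0.0;

	assert(s != NULL);
	while (*s >= '0' && *s <= '9')
		v = v * 10.0 + (double)(*s++ - '0');
	return (v);
}
```

For the string "0x12" the first function yields 18 and the second yields 0,
matching the two behaviors described above.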

FreeBSD used to set the locale on startup, and make the ranges use that
locale. This led to weird results like "[A-Z]" matching lower case
characters in some locales. This bug has been fixed.

MFC After: 3 days
Sponsored by: Netflix

3 years agoawk: Flag -Ft as deprecated behavior
Warner Losh [Sat, 31 Jul 2021 05:19:58 +0000 (23:19 -0600)]
awk: Flag -Ft as deprecated behavior

Upstream is poised to deprecate the -Ft wart in one true awk. None of
the other awks do this, and the gawk maintainer says that he's had no
requests for it in gawk in 30 years maintaining it. github can find a
few instances of it in the wild. As such, warn that it's deprecated and
will go away in the future.

MFC After: 3 days
Sponsored by: Netflix

3 years agoacpica: Import ACPICA 20210730
Jung-uk Kim [Sat, 31 Jul 2021 00:05:50 +0000 (20:05 -0400)]
acpica: Import ACPICA 20210730

(cherry picked from commit 34cfdff1f386b2d7bf0a8ea873acf604753991e6)

3 years agoclock_gettime: Add Linux aliases for CLOCK_*
Warner Losh [Fri, 30 Jul 2021 23:11:43 +0000 (17:11 -0600)]
clock_gettime: Add Linux aliases for CLOCK_*

Linux standardized what we call CLOCK_{REALTIME,MONOTONIC}_FAST as
CLOCK_{REALTIME,MONOTONIC}_COARSE. In addition, Linux spells
CLOCK_UPTIME as CLOCK_BOOTTIME.

Add aliases to time.h and document these new aliases in
clock_gettime(2).

Reviewed by: vangyzen, kib (prior), dchagin (prior)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30988

3 years agotime.h: reduce CLOCK_ namespace pollution, move to _clock_id.h
Warner Losh [Fri, 30 Jul 2021 23:10:56 +0000 (17:10 -0600)]
time.h: reduce CLOCK_ namespace pollution, move to _clock_id.h

Attempt to comply with the strict namespace pollution requirements of
_POSIX_C_SOURCE. Add guards to limit visibility of CLOCK_ and TIMER_
defines as appropriate. Only define the CLOCK_ variables relevant to the
specific standards. Move all the sharing to sys/_clock_id.h and make
time.h and sys/time.h both include that rather than copy due to the
now large number of clocks and compat defines.

Please note: The old time.h previously used these newer dates:
CLOCK_REALTIME 199506
CLOCK_MONOTONIC 200112
CLOCK_THREAD_CPUTIME_ID 200112
CLOCK_PROCESS_CPUTIME_ID 200112

but glibc defines all of these for 199309. glibc uses this date for all
these values, however, only CLOCK_REALTIME was in IEEE 1003.1b. Add a
comment about this to document it. A large number of programs and
libraries assume that these will be defined for _POSIX_C_SOURCE =
199309.

In addition, leak CLOCK_UPTIME_FAST for the pocl package until it can be
updated to use a simple CLOCK_MONOTONIC.

Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31056

3 years agoRELNOTES: Put the old description back
Warner Losh [Fri, 30 Jul 2021 23:07:52 +0000 (17:07 -0600)]
RELNOTES: Put the old description back

After looking at the RELNOTES files on the stable branches, restore the
old description and update the file to match the format it's supposed to
have.

Sponsored by: Netflix

3 years agonanobsd: adapt dhcpd to latest conventions
Warner Losh [Fri, 30 Jul 2021 22:55:43 +0000 (16:55 -0600)]
nanobsd: adapt dhcpd to latest conventions

Adapt the dhcpd build to use the nanobsd-build top level directory that
other nanobsd builds are using.

Sponsored by: Netflix

3 years agocxgbei: Round up the maximum PDU data length by the MSS for TXDATAPLEN_MAX.
John Baldwin [Thu, 29 Jul 2021 21:17:45 +0000 (14:17 -0700)]
cxgbei: Round up the maximum PDU data length by the MSS for TXDATAPLEN_MAX.

Recent firmware versions round down the value passed here by the MSS
and subsequently mishandle transmitted PDUs larger than the rounded
down value.
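
The arithmetic behind the workaround can be sketched as follows.  The helper
names and the sample numbers in the note are assumptions for illustration;
the driver's actual expressions are not shown in this log.

```c
#include <assert.h>

/* Round len up to the next multiple of mss. */
static unsigned int
round_up_to_mss(unsigned int len, unsigned int mss)
{
	assert(mss != 0);
	return (((len + mss - 1) / mss) * mss);
}

/* Round len down to a multiple of mss, as the firmware is said to do. */
static unsigned int
round_down_to_mss(unsigned int len, unsigned int mss)
{
	assert(mss != 0);
	return ((len / mss) * mss);
}
```

For example, with a PDU data length of 8192 and an MSS of 1448, rounding
down alone gives 7240, which is smaller than the PDU.  Rounding up first
gives 8688, which is already a multiple of the MSS, so a later round-down
leaves it at 8688 and the limit still covers the full PDU.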

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

3 years agoUPDATING: fix incorrect hash
Kristof Provost [Fri, 30 Jul 2021 18:00:47 +0000 (20:00 +0200)]
UPDATING: fix incorrect hash

Pointed out by: lwhsu

3 years agobus: Convert to the new interceptor scheme
Mark Johnston [Fri, 30 Jul 2021 19:15:27 +0000 (15:15 -0400)]
bus: Convert to the new interceptor scheme

This was missed in commit a90d053b8422.

Fixes: a90d053b8422
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

3 years agoinet6_option_space is deprecated, refer to inet6_opt_init instead
Tom Jones [Fri, 30 Jul 2021 13:23:39 +0000 (14:23 +0100)]
inet6_option_space is deprecated, refer to inet6_opt_init instead

Reviewed by: bz, hrs
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D26273

3 years agoUPDATING: Document the removal of DIOCGETSTATESNV
Kristof Provost [Wed, 14 Jul 2021 13:51:36 +0000 (15:51 +0200)]
UPDATING: Document the removal of DIOCGETSTATESNV

MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")

3 years agopf: remove DIOCGETSTATESNV
Kristof Provost [Tue, 6 Jul 2021 11:13:24 +0000 (13:13 +0200)]
pf: remove DIOCGETSTATESNV

While nvlists are very useful in maximising flexibility for future
extensions their performance is simply unacceptably bad for the
getstates feature, where we can easily want to export a million states
or more.

The DIOCGETSTATESNV call has been MFCd, but has not hit a release on any
branch, so we can still remove it everywhere.

Reviewed by: mjg
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D31099

3 years agoipmi(4): Add more watchdog error checks.
Alexander Motin [Fri, 30 Jul 2021 03:39:04 +0000 (23:39 -0400)]
ipmi(4): Add more watchdog error checks.

Add request submission status checks before checking req->ir_compcode,
otherwise it may be zero just because of initialization.

Add checks for req->ir_compcode errors in ipmi_reset_watchdog() and
ipmi_set_watchdog().  In first case explicitly check for 0x80, which
means timer was not previously set, that I found happening after BMC
cold reset.  This change makes the watchdog timer recover instead of
permanently ignoring reset errors after a BMC reset or upgrade.
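
The order of the checks can be sketched like this.  The enum and function
are hypothetical; only the value 0x80 and its meaning are taken from the
text above.

```c
#include <assert.h>

#define COMPCODE_TIMER_NOT_SET	0x80	/* meaning as described above */

enum wd_result_sketch {
	WD_OK,
	WD_SUBMIT_FAILED,	/* the request never completed */
	WD_NEEDS_SETUP,		/* timer was not previously set */
	WD_FAILED		/* any other completion error */
};

/*
 * Classify a watchdog reset attempt.  The completion code is inspected
 * only after the submission itself is known to have succeeded, because
 * otherwise it may still hold its initial value of zero.
 */
static enum wd_result_sketch
classify_reset(int submit_error, unsigned char compcode)
{
	if (submit_error != 0)
		return (WD_SUBMIT_FAILED);
	if (compcode == COMPCODE_TIMER_NOT_SET)
		return (WD_NEEDS_SETUP);
	if (compcode != 0)
		return (WD_FAILED);
	return (WD_OK);
}
```

A caller seeing WD_NEEDS_SETUP could set the timer again rather than treat
the reset as a permanent failure.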

MFC after: 2 weeks
Sponsored by:   iXsystems, Inc.

3 years agocoretemp(4): Switch to smp_rendezvous_cpus().
Alexander Motin [Fri, 30 Jul 2021 03:16:22 +0000 (23:16 -0400)]
coretemp(4): Switch to smp_rendezvous_cpus().

Using smp_rendezvous_cpus() instead of sched_bind() avoids blocking
indefinitely if the target CPU is running some thread with higher
priority, while all we need is a single rdmsr/wrmsr instruction call.
I guess it should also be much cheaper than full thread migration.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

3 years agoAdd interceptors for atomic operations on userspace memory
Mark Johnston [Fri, 30 Jul 2021 01:05:03 +0000 (21:05 -0400)]
Add interceptors for atomic operations on userspace memory

Implement them for KASAN.  KCSAN interceptors are left unimplemented for
now.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

3 years agoSimplify kernel sanitizer interceptors
Mark Johnston [Mon, 19 Jul 2021 20:09:42 +0000 (16:09 -0400)]
Simplify kernel sanitizer interceptors

KASAN and KCSAN implement interceptors for various primitive operations
that are not instrumented by the compiler.  KMSAN requires them as well.
Rather than adding new cases for each sanitizer which requires
interceptors, implement the following protocol:
- When interceptor definitions are required, define
  SAN_NEEDS_INTERCEPTORS and SANITIZER_INTERCEPTOR_PREFIX.
- In headers that declare functions which need to be intercepted by a
  sanitizer runtime, use SANITIZER_INTERCEPTOR_PREFIX to provide
  declarations.
- When SAN_RUNTIME is defined, do not redefine the names of intercepted
  functions.  This is typically the case in files which implement
  sanitizer runtimes but is also needed in, for example, files which
  define ifunc selectors for intercepted operations.
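
The protocol above can be sketched with ordinary preprocessor token pasting.
Apart from the three macro names quoted from the list, everything here is
hypothetical: the SAN_PASTE and SAN_INTERCEPTOR helpers, the `demo` prefix
and the `load` primitive are illustrations, not the kernel's definitions.

```c
#include <assert.h>

#define SAN_NEEDS_INTERCEPTORS
#define SANITIZER_INTERCEPTOR_PREFIX	demo	/* hypothetical prefix */

/* Two-level paste so the prefix macro is expanded before ## is applied. */
#define SAN_PASTE_(a, b)	a##_##b
#define SAN_PASTE(a, b)		SAN_PASTE_(a, b)
#define SAN_INTERCEPTOR(func)	SAN_PASTE(SANITIZER_INTERCEPTOR_PREFIX, func)

static int intercepted_calls;

/* The uninstrumented primitive. */
static int
plain_load(const int *p)
{
	return (*p);
}

/* Expands to demo_load(): the hook a sanitizer runtime would provide. */
static int
SAN_INTERCEPTOR(load)(const int *p)
{
	intercepted_calls++;
	return (plain_load(p));
}

/* Outside the runtime, route the ordinary name to the interceptor. */
#if defined(SAN_NEEDS_INTERCEPTORS) && !defined(SAN_RUNTIME)
#define load(p)	SAN_INTERCEPTOR(load)(p)
#endif
```

Adding another sanitizer then only means choosing a different prefix, while
a file compiled with SAN_RUNTIME defined keeps the unprefixed names.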

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

3 years agogeom_vfs: Pre-allocate event for g_vfs_destroy.
John Baldwin [Fri, 30 Jul 2021 00:09:23 +0000 (17:09 -0700)]
geom_vfs: Pre-allocate event for g_vfs_destroy.

When an active g_vfs is orphaned due to an underlying disk going away
the destroy is deferred until the filesystem is unmounted in
g_vfs_done().  However, g_vfs_done() is invoked from a non-sleepable
context and cannot use M_WAITOK to allocate the event.  Instead,
allocate the event in g_vfs_orphan() and save it in the softc to be
retrieved by the last call to g_vfs_done().
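
The allocate-early, consume-late pattern can be sketched in userland as
below.  The structures and function names are hypothetical stand-ins, and
calloc() merely represents an allocation that is allowed to sleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real g_vfs softc or GEOM event. */
struct event_sketch {
	int pending;
};

struct softc_sketch {
	struct event_sketch *saved_event;
};

/* Sleepable context (the orphan path): allocation may block here. */
static int
orphan_sketch(struct softc_sketch *sc)
{
	assert(sc != NULL);
	sc->saved_event = calloc(1, sizeof(*sc->saved_event));
	return (sc->saved_event != NULL ? 0 : -1);
}

/*
 * Non-sleepable context (the last completion): no allocation, just
 * take ownership of the event that was saved earlier.
 */
static struct event_sketch *
last_done_sketch(struct softc_sketch *sc)
{
	struct event_sketch *ev = sc->saved_event;

	sc->saved_event = NULL;
	if (ev != NULL)
		ev->pending = 1;
	return (ev);
}
```

Clearing the saved pointer on retrieval keeps a second completion from
posting the same event twice.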

Reported by: Jithesh Arakkan @ Chelsio
Reviewed by: imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31354

3 years agoUse a more specific type for geom_disk.d_event.
John Baldwin [Thu, 29 Jul 2021 23:34:46 +0000 (16:34 -0700)]
Use a more specific type for geom_disk.d_event.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D31353

3 years agocxgbei: Wait for socket to close in icl_cxgbei_conn_close.
John Baldwin [Thu, 29 Jul 2021 23:34:46 +0000 (16:34 -0700)]
cxgbei: Wait for socket to close in icl_cxgbei_conn_close.

This ensures the TOE has finished processing any in-flight received
data before returning to the caller.  The caller assumes it is safe to
free any open tasks or transfers (and associated buffers) after this
function returns.

Previously, data placed directly via DDP could be written to buffers
after the caller had freed the buffers.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

3 years agoClean up orphaned indirdep dependency structures after disk failure.
Kirk McKusick [Thu, 29 Jul 2021 23:31:16 +0000 (16:31 -0700)]
Clean up orphaned indirdep dependency structures after disk failure.

During forcible unmount after a disk failure there is a bug that
causes one or more indirdep dependency structures to fail to be
deallocated. Until we manage to track down why they fail to get
cleaned up, this code tracks them down and eliminates them so that
the unmount can succeed.

Reported by:  Peter Holm
Help from:    kib
Reviewed by:  Chuck Silvers
Tested by:    Peter Holm
MFC after:    7 days
Sponsored by: Netflix

3 years agoDiagnostic improvement to soft dependency structure management.
Kirk McKusick [Thu, 29 Jul 2021 23:11:58 +0000 (16:11 -0700)]
Diagnostic improvement to soft dependency structure management.

The soft updates diagnostic code keeps a list for each type of soft
update dependency. When a new block is allocated for a file it is
initially tracked by a "newblk" dependency. The "newblk" dependency
eventually becomes either an "allocdirect" dependency or an "indiralloc"
dependency. The diagnostic code failed to move the "newblk" from the list
of "newblk"s to its new type list.

No functional change intended.

Reviewed by:  Chuck Silvers (as part of a larger change)
Tested by:    Peter Holm (as part of a larger change)
Sponsored by: Netflix

3 years agoamd64: do not touch low memory in AP startup unless we used legacy boot
Konstantin Belousov [Thu, 29 Jul 2021 13:18:19 +0000 (16:18 +0300)]
amd64: do not touch low memory in AP startup unless we used legacy boot

This fixes several omissions in 48216088b1157a22b955.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D31343

3 years agoamd64: stop doing special allocation for the AP startup trampoline
Konstantin Belousov [Thu, 29 Jul 2021 00:22:35 +0000 (03:22 +0300)]
amd64: stop doing special allocation for the AP startup trampoline

There is now no reason to allocate the trampoline page very early in
the boot process.  The only requirement for the page is that it is
below 1M, so that it is usable in real mode during init.  This can be
handled by vm_alloc_contig() when we do the startup.

Also assert that the startup trampoline fits into a single page.  In
principle we could do a multi-page allocation if needed, but it is not
needed.

Move the alloc_ap_trampoline() function and the boot_address variable to
i386/mp_machdep.c.  Keep existing mechanism of early alloc on i386.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D31343

3 years agoawk: Note awk upgrades.
Warner Losh [Thu, 29 Jul 2021 21:41:16 +0000 (15:41 -0600)]
awk: Note awk upgrades.

Note the high level differences with the latest one true awk
import. This list may grow as we learn more troublesome areas.

Updated the description of the format of the file to match the file.

I'll likely merge this change (and any followups) by direct commit to
stable/13 and stable/12 in a couple of weeks.

Sponsored by: Netflix

3 years agoLinuxKPI: bitfield.h cleanup
Bjoern A. Zeeb [Thu, 29 Jul 2021 21:24:35 +0000 (21:24 +0000)]
LinuxKPI: bitfield.h cleanup

Add a missing tab and remove an unnecessary return.
No functional changes.

MFC after: 3 days

3 years agohwpmc: remove static POWER8 definitions
Leandro Lupori [Thu, 29 Jul 2021 17:37:32 +0000 (14:37 -0300)]
hwpmc: remove static POWER8 definitions

After b48a2770d48b, static POWER8 definitions became unnecessary,
as all of them (and much more) are already present in libpmc's
PMU events.

Submitted by: Leonardo Bianconi <leonardo.bianconi@eldorado.org.br> (initial version)
Reviewed by: kbowling, mhorne
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31334

3 years agox86 __vdso_gettc: add O_CLOEXEC flag to open
Konstantin Belousov [Thu, 29 Jul 2021 01:26:38 +0000 (04:26 +0300)]
x86 __vdso_gettc: add O_CLOEXEC flag to open

Open the /dev/hpet and /dev/hv_tsc devices with O_CLOEXEC, so that
internal libc file descriptors are not leaked on exec.

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31344

3 years agoamd64: Set GS.base before calling init_secondary() on APs
Mark Johnston [Thu, 29 Jul 2021 14:22:37 +0000 (10:22 -0400)]
amd64: Set GS.base before calling init_secondary() on APs

KMSAN instrumentation requires thread-local storage to track
initialization state for function parameters and return values.  This
buffer is accessed as part of each function prologue.  It is provided by
the KMSAN runtime, which looks up a pointer in the current thread's
structure.

When KMSAN is configured, init_secondary() is instrumented, but this
means that GS.base must be initialized first, otherwise the runtime
cannot safely access curthread.  Work around this by loading GS.base
before calling init_secondary(), so that the runtime can at least check
curthread == NULL and return a pointer to some dummy storage.  Note that
init_secondary() still must reload GS.base after calling lgdt(), which
loads a selector into %gs, which in turn clears the base register.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31336

3 years agoamd64: Set MSR_KGSBASE to 0 during AP startup
Mark Johnston [Thu, 29 Jul 2021 14:14:05 +0000 (10:14 -0400)]
amd64: Set MSR_KGSBASE to 0 during AP startup

There is no reason to initialize it to anything else, and this matches
initialization of the BSP.  No functional change intended.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31336

3 years agolink_elf_obj: Invoke fini callbacks
Mark Johnston [Thu, 29 Jul 2021 13:46:25 +0000 (09:46 -0400)]
link_elf_obj: Invoke fini callbacks

This is required for KASAN: when a module is unloaded, poisoned regions
(e.g., pad areas between global variables) are left as such, so if they
are reused as KLDs are loaded, false positives can arise.

Reported by: pho, Jenkins
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31339

3 years agolibc/locale: Use O_CLOEXEC when opening locale tables
Mark Johnston [Thu, 29 Jul 2021 13:14:50 +0000 (09:14 -0400)]
libc/locale: Use O_CLOEXEC when opening locale tables

Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation

3 years agolinux(4): Eliminate now-unused includes after futexes refactoring.
Dmitry Chagin [Thu, 29 Jul 2021 09:56:39 +0000 (12:56 +0300)]
linux(4): Eliminate now-unused includes after futexes refactoring.

MFC after: 2 weeks

3 years agolinux(4): Add a comment about wait/requeue pi operations.
Dmitry Chagin [Thu, 29 Jul 2021 09:55:59 +0000 (12:55 +0300)]
linux(4): Add a comment about wait/requeue pi operations.

MFC after: 2 weeks

3 years agolinux(4): Handle incorrect FUTEX_CLOCK_REALTIME option bit.
Dmitry Chagin [Thu, 29 Jul 2021 09:55:33 +0000 (12:55 +0300)]
linux(4): Handle incorrect FUTEX_CLOCK_REALTIME option bit.

Return ENOSYS if the FUTEX_CLOCK_REALTIME option bit is specified for an
inappropriate futex operation.

MFC after: 2 weeks
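
A minimal sketch of such a check, under stated assumptions: the
constants mirror the values in Linux's futex header, the set of
operations that accept the bit follows the futex(2) manual page as I
read it, and check_clock_realtime() is a hypothetical name.  The list
used by the actual commit may differ.

```c
#include <errno.h>

/* Values as in Linux's futex header, repeated so the sketch builds anywhere. */
#define LINUX_FUTEX_WAIT		0
#define LINUX_FUTEX_WAKE		1
#define LINUX_FUTEX_WAIT_BITSET		9
#define LINUX_FUTEX_WAIT_REQUEUE_PI	11
#define LINUX_FUTEX_LOCK_PI2		13
#define LINUX_FUTEX_PRIVATE_FLAG	128
#define LINUX_FUTEX_CLOCK_REALTIME	256
#define LINUX_FUTEX_CMD_MASK \
	(~(LINUX_FUTEX_PRIVATE_FLAG | LINUX_FUTEX_CLOCK_REALTIME))

/*
 * Return 0 if the option bits in op are acceptable for its command,
 * or ENOSYS if FUTEX_CLOCK_REALTIME is combined with a command that
 * does not take a clock selection.  (Hypothetical helper.)
 */
static int
check_clock_realtime(int op)
{
	int cmd;

	if ((op & LINUX_FUTEX_CLOCK_REALTIME) == 0)
		return (0);

	cmd = op & LINUX_FUTEX_CMD_MASK;
	switch (cmd) {
	case LINUX_FUTEX_WAIT:			/* assumed, per futex(2) */
	case LINUX_FUTEX_WAIT_BITSET:
	case LINUX_FUTEX_WAIT_REQUEUE_PI:
	case LINUX_FUTEX_LOCK_PI2:
		return (0);
	default:
		return (ENOSYS);
	}
}
```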

3 years agolinux(4): Handle FUTEX_LOCK_PI2 operation.
Dmitry Chagin [Thu, 29 Jul 2021 09:55:02 +0000 (12:55 +0300)]
linux(4): Handle FUTEX_LOCK_PI2 operation.

FUTEX_LOCK_PI2 was added to support clock selection.  FUTEX_LOCK_PI has
used a CLOCK_REALTIME based absolute value since it was implemented, but
it does not require that the FUTEX_CLOCK_REALTIME bit is set, because
that bit was introduced later.

MFC after: 2 weeks

3 years agolinux(4): Use variable name not type for sizeof() to calculate storage size.
Dmitry Chagin [Thu, 29 Jul 2021 09:54:32 +0000 (12:54 +0300)]
linux(4): Use variable name not type for sizeof() to calculate storage size.

MFC after: 2 weeks

3 years agolinux(4): Move len variable initialization to the appropriate place.
Dmitry Chagin [Thu, 29 Jul 2021 09:54:16 +0000 (12:54 +0300)]
linux(4): Move len variable initialization to the appropriate place.

MFC after: 2 weeks

3 years agolinux(4): Use linux_tdfind() in get_robust_list.
Dmitry Chagin [Thu, 29 Jul 2021 09:53:59 +0000 (12:53 +0300)]
linux(4): Use linux_tdfind() in get_robust_list.

In the Linux emulation layer linux_tdfind() has a special purpose to
handle glibc specific TID mangling and we should use it instead of tdfind().

MFC after: 2 weeks

3 years agolinux(4): Eliminate unnecessary error initialization.
Dmitry Chagin [Thu, 29 Jul 2021 09:53:41 +0000 (12:53 +0300)]
linux(4): Eliminate unnecessary error initialization.

MFC after: 2 weeks

3 years agolinux(4): Eliminate unnecessary head initialization.
Dmitry Chagin [Thu, 29 Jul 2021 09:53:25 +0000 (12:53 +0300)]
linux(4): Eliminate unnecessary head initialization.

MFC after: 2 weeks

3 years agolinux(4): style, wrap too long line.
Dmitry Chagin [Thu, 29 Jul 2021 09:53:07 +0000 (12:53 +0300)]
linux(4): style, wrap too long line.

MFC after: 2 weeks

3 years agolinux(4): Eliminating remnants of futex sdt.
Dmitry Chagin [Thu, 29 Jul 2021 09:52:36 +0000 (12:52 +0300)]
linux(4): Eliminating remnants of futex sdt.

MFC after: 2 weeks

3 years agolinux(4): Eliminating an accidental comment.
Dmitry Chagin [Thu, 29 Jul 2021 09:51:56 +0000 (12:51 +0300)]
linux(4): Eliminating an accidental comment.

MFC after: 2 weeks

3 years agolinux(4): Handle special case for regular futex in handle_futex_death().
Dmitry Chagin [Thu, 29 Jul 2021 09:51:39 +0000 (12:51 +0300)]
linux(4): Handle special case for regular futex in handle_futex_death().

Handle some races in handle_futex_death() which can prevent a wakeup of
potential waiters, causing these waiters to block forever.

Differential Revision: https://reviews.freebsd.org/D31280
MFC after: 2 weeks

3 years agolinux(4): Futex address must be 32-bit aligned.
Dmitry Chagin [Thu, 29 Jul 2021 09:50:58 +0000 (12:50 +0300)]
linux(4): Futex address must be 32-bit aligned.

Linux futex documentation explicitly states that EINVAL is returned if
the futex is not 4-byte aligned. Check futex alignment as Linux does
and return EINVAL.

Differential Revision: https://reviews.freebsd.org/D31279
MFC after: 2 weeks
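
The alignment test described above can be sketched as follows.  This is
a userland illustration only; check_futex_alignment() is a hypothetical
name and not the function the commit adds.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if uaddr is suitably aligned for a 32-bit futex word,
 * or EINVAL if it is not.  The pointer is never dereferenced.
 * (Hypothetical helper.)
 */
static int
check_futex_alignment(const void *uaddr)
{
	if (((uintptr_t)uaddr % sizeof(uint32_t)) != 0)
		return (EINVAL);
	return (0);
}
```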

3 years agolinux(4): Finish cf8d74e3fe63.
Dmitry Chagin [Thu, 29 Jul 2021 09:50:43 +0000 (12:50 +0300)]
linux(4): Finish cf8d74e3fe63.

Add forgotten val3_compare initialization in case of time64 futex.

MFC after: 2 weeks

3 years agolinux(4): Replace casuword32 by casueword32.
Dmitry Chagin [Thu, 29 Jul 2021 09:50:11 +0000 (12:50 +0300)]
linux(4): Replace casuword32 by casueword32.

Following r349951 (30b3018d), add a check to react to stops and requests
to terminate between retries.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31254
MFC after: 2 weeks

3 years agolinux(4): Implement pi futexes using umtx.
Dmitry Chagin [Thu, 29 Jul 2021 09:49:42 +0000 (12:49 +0300)]
linux(4): Implement pi futexes using umtx.

Differential Revision: https://reviews.freebsd.org/D31240
MFC after: 2 weeks

3 years agolinux(4): Replace copyin() by fueword32() in handle_futex_death().
Dmitry Chagin [Thu, 29 Jul 2021 09:48:59 +0000 (12:48 +0300)]
linux(4): Replace copyin() by fueword32() in handle_futex_death().

According to fetch(9), the fueword facility is designed to atomically
fetch small amounts of data from user space.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31239
MFC after: 2 weeks

3 years agoumtx: Add new pi_futex type.
Dmitry Chagin [Thu, 29 Jul 2021 09:48:34 +0000 (12:48 +0300)]
umtx: Add new pi_futex type.

Differential Revision: https://reviews.freebsd.org/D31250
MFC after: 2 weeks

3 years agoumtx: Split do_unlock_pi into two counterparts.
Dmitry Chagin [Thu, 29 Jul 2021 09:47:39 +0000 (12:47 +0300)]
umtx: Split do_unlock_pi into two counterparts.

umtx_pi_drop() will be used by the Linux emulation layer.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31238
MFC after: 2 weeks

3 years agoumtx: Expose some of the pi umtx structures and API to the rest of the kernel.
Dmitry Chagin [Thu, 29 Jul 2021 09:46:58 +0000 (12:46 +0300)]
umtx: Expose some of the pi umtx structures and API to the rest of the kernel.

Differential Revision: https://reviews.freebsd.org/D31237
MFC after: 2 weeks

3 years agolinux(4): Eliminate unused includes.
Dmitry Chagin [Thu, 29 Jul 2021 09:46:35 +0000 (12:46 +0300)]
linux(4): Eliminate unused includes.

MFC after: 2 weeks

3 years agolinux(4): Reimplement futexes using umtx.
Dmitry Chagin [Thu, 29 Jul 2021 09:43:48 +0000 (12:43 +0300)]
linux(4): Reimplement futexes using umtx.

Differential Revision: https://reviews.freebsd.org/D31236
MFC after: 2 weeks

3 years agoumtx: Add umtxq_requeue Linux emulation layer extension.
Dmitry Chagin [Thu, 29 Jul 2021 09:43:07 +0000 (12:43 +0300)]
umtx: Add umtxq_requeue Linux emulation layer extension.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31235
MFC after: 2 weeks

3 years agoumtx: Add bitset conditional wakeup functionality.
Dmitry Chagin [Thu, 29 Jul 2021 09:42:49 +0000 (12:42 +0300)]
umtx: Add bitset conditional wakeup functionality.

The bitset is a Linux emulation layer extension. This 32-bit mask, in which at
least one bit must be set, is used to select which threads should be woken up.

The bitset is stored in the umtx_q structure, which is used to enqueue the waiter
into the umtx waitqueue. Put the bitset into the hole that appears on LP64 due
to data alignment, so that struct umtx_q does not grow.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31234
MFC after: 2 weeks
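
The selection rule described above can be sketched in userland as
follows.  struct waiter and wake_bitset() are hypothetical and much
simpler than the kernel's struct umtx_q and wait queue.  Rejecting a
zero mask with -1 is an assumption made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified waiter record; not the kernel's struct umtx_q. */
struct waiter {
	uint32_t	bitset;		/* bits this waiter sleeps on */
	int		woken;		/* non-zero once woken */
};

/*
 * Wake at most max waiters whose bitset shares at least one bit with
 * mask.  Returns the number of waiters woken, or -1 if mask has no
 * bit set.
 */
static int
wake_bitset(struct waiter *w, size_t n, uint32_t mask, int max)
{
	size_t i;
	int count;

	if (mask == 0)
		return (-1);

	count = 0;
	for (i = 0; i < n && count < max; i++) {
		if (!w[i].woken && (w[i].bitset & mask) != 0) {
			w[i].woken = 1;
			count++;
		}
	}
	return (count);
}
```

A waiter sleeping with all bits set matches any non-zero mask, which is
how a plain wait and wake pair can share the same path.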

3 years agoumtx: Expose some of the umtx structures and API to the rest of the kernel.
Dmitry Chagin [Thu, 29 Jul 2021 09:42:17 +0000 (12:42 +0300)]
umtx: Expose some of the umtx structures and API to the rest of the kernel.

Differential Revision: https://reviews.freebsd.org/D31233
MFC after: 2 weeks

3 years agoumtx: Expose struct abs_timeout to the rest of the kernel.
Dmitry Chagin [Thu, 29 Jul 2021 09:41:58 +0000 (12:41 +0300)]
umtx: Expose struct abs_timeout to the rest of the kernel.

Add the umtx_ prefix to the whole abs_timeout facility and add declarations
for it. For consistency with the other abs_timeout functions, mark
abs_timeout_init2 inline.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31249
MFC after: 2 weeks

3 years agoumtx: Split umtx.h into two counterparts.
Dmitry Chagin [Thu, 29 Jul 2021 09:41:29 +0000 (12:41 +0300)]
umtx: Split umtx.h into two counterparts.

To prevent umtx.h from being polluted by future changes, split it into
two headers:
umtx.h - ABI header for userspace;
umtxvar.h - the kernel stuff.

While here fix umtx_key_match style.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31248
MFC after: 2 weeks

3 years agofreebsd32: Remove the unnecessary spaces.
Dmitry Chagin [Thu, 29 Jul 2021 09:40:36 +0000 (12:40 +0300)]
freebsd32: Remove the unnecessary spaces.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31247
MFC after: 2 weeks

3 years agofreebsd32: Remove unused umtx.h include.
Dmitry Chagin [Thu, 29 Jul 2021 09:40:08 +0000 (12:40 +0300)]
freebsd32: Remove unused umtx.h include.

Differential Revision: https://reviews.freebsd.org/D31246
MFC after: 2 weeks

3 years agofreebsd32: Eliminate spaces at end of line.
Dmitry Chagin [Thu, 29 Jul 2021 09:39:30 +0000 (12:39 +0300)]
freebsd32: Eliminate spaces at end of line.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31245
MFC after: 2 weeks

3 years agoFix mac_veriexec version mismatch
Wojciech Macek [Thu, 29 Jul 2021 09:02:43 +0000 (11:02 +0200)]
Fix mac_veriexec version mismatch

mac_veriexec sets its version to 1, but the mac_veriexec_shaX modules which depend on it expect MAC_VERIEXEC_VERSION = 2.
Be consistent and use MAC_VERIEXEC_VERSION everywhere.
This unbreaks loading of mac_veriexec modules at boot time.

Authored by:  Kornel Duleba <mindal@semihalf.com>
Obtained from:  Semihalf
Sponsored by:  Stormshield
Differential Revision:  https://reviews.freebsd.org/D31268

3 years agoAdd missing arm64 ID registers
Andrew Turner [Wed, 28 Jul 2021 19:00:36 +0000 (19:00 +0000)]
Add missing arm64 ID registers

These may contain values we export to userspace.

Sponsored by: The FreeBSD Foundation