7 hours ago nfsstat: delete unused fields master github/master
asomers [Sat, 24 Oct 2020 05:52:29 +0000 (05:52 +0000)]
nfsstat: delete unused fields

Ever since r192762 nfsstat has included a few fields whose values were
always 0. They were copied from OpenBSD, but have never been used on
FreeBSD. Don't display them.

Reviewed by: rmacklem
Sponsored by: Axcient
Differential Revision:

11 hours ago nvme: Remove compat code for older kernels
imp [Sat, 24 Oct 2020 01:59:01 +0000 (01:59 +0000)]
nvme: Remove compat code for older kernels

Remove code that supported pre-2011 kernels. CTLTYPE_S64 was defined
in rev 217616. All supported branches have it, so remove its compat
definition as OBE.

11 hours ago cache: batch updates to numcache in case of mass removal
mjg [Sat, 24 Oct 2020 01:14:52 +0000 (01:14 +0000)]
cache: batch updates to numcache in case of mass removal

11 hours ago cache: refactor alloc/free
mjg [Sat, 24 Oct 2020 01:14:17 +0000 (01:14 +0000)]
cache: refactor alloc/free

This in particular centralizes manipulation of numcache.

11 hours ago cache: fold branch prediction into cache_ncp_canuse
mjg [Sat, 24 Oct 2020 01:13:47 +0000 (01:13 +0000)]
cache: fold branch prediction into cache_ncp_canuse

11 hours ago cache: fix some typos
mjg [Sat, 24 Oct 2020 01:13:16 +0000 (01:13 +0000)]
cache: fix some typos

11 hours ago cache: drop write-only vars
mjg [Sat, 24 Oct 2020 01:13:02 +0000 (01:13 +0000)]
cache: drop write-only vars

13 hours ago warnx: fix needless static
imp [Sat, 24 Oct 2020 00:03:11 +0000 (00:03 +0000)]
warnx: fix needless static

I noticed after the review that these shouldn't be static. Remove the
'static' from them, otherwise concurrent calls to warn* might see a
similar bug to the original.

13 hours ago warnx: Save errno across calls that might change it.
imp [Fri, 23 Oct 2020 23:56:00 +0000 (23:56 +0000)]
warnx: Save errno across calls that might change it.

When warn() family of functions is being used after err_set_file() has
been set to, for example, /dev/null, errno is being clobbered,
rendering it unreliable after, for example, procstat_getpathname()
when it is supposed to emit a warning. Then the errno is changed to
Inappropriate ioctl for device, destroying the original value (via
calls to fprintf() functions).
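
The save/restore pattern described above can be sketched as follows. This is
a minimal illustration, not the libc change itself; demo_warn() is a
hypothetical helper invented for this example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical warn()-like helper (not the libc implementation).
 * fprintf() may change errno, for example when the output stream is
 * not a terminal, so the caller's value is saved on entry and
 * restored before returning.
 */
static void
demo_warn(FILE *fp, const char *msg)
{
	int saved_errno = errno;

	fprintf(fp, "demo: %s: %s\n", msg, strerror(saved_errno));
	errno = saved_errno;
}
```

A caller that inspects errno after demo_warn() returns sees the value it had
before the call.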

Submitted by: Juraj Lutter
Differential Revision:

14 hours ago Only use ASAN when using the in-tree compiler
brooks [Fri, 23 Oct 2020 22:27:45 +0000 (22:27 +0000)]
Only use ASAN when using the in-tree compiler

When building FreeBSD 11 on a FreeBSD 12 system with
CROSS_TOOLCHAIN=llvm10 we end up trying to link against the packaged
version of the sanitizer library.  This resulted in a requirement for
getentropy(3) which is not present in FreeBSD 11.

Reviewed by: emaste
MFC after: 3 days
Differential Revision:

15 hours ago Move the iommu stubs to a generic place, so they are available on all the
br [Fri, 23 Oct 2020 21:27:48 +0000 (21:27 +0000)]
Move the iommu stubs to a generic place, so they are available on all the

This removes the dependency on the IOMMU macro in the AHCI driver.

Requested by: kib
Suggested by: andrew
Reviewed by: kib
Sponsored by: Innovate DSbD
Differential Revision:

18 hours ago xhci: Handle the case when MSI-X BAR is the same as IO BAR.
kib [Fri, 23 Oct 2020 18:18:45 +0000 (18:18 +0000)]
xhci: Handle the case when MSI-X BAR is the same as IO BAR.

PCIe allows for MSI-X BAR to be either dedicated, or MSI-X Table may
be co-located in some functional BAR.  In the latter case xhci(4) is
unable to allocate active resource for the table because BAR is
already activated.

Handle it by checking for this special case, and not trying to allocate
the resource if the MSI-X BAR is the IO BAR.

Reported and tested by: emaste
Reviewed by: emaste, hselasky
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision:

20 hours ago libelf: add compression header support
emaste [Fri, 23 Oct 2020 16:35:23 +0000 (16:35 +0000)]
libelf: add compression header support

GNU and Oracle libelf implementations added support for section
compression, intended to reduce the size of DWARF debug info (which
might be an order of magnitude larger than the code).

There are two compressed ELF section formats:

1. Old GNU - sections are renamed to start with 'z'.  Section contains
   a magic number, uncompressed size, and compressed data.

2. Oracle and New GNU - compressed sections use the SHF_COMPRESSED flag.
   The compression header contains the compression type, uncompressed
   size, and uncompressed alignment.

The second style is preferred and this change implements only that one.

Submitted by: Tiger Gao <>
Reviewed by: markj
MFC after: 2 weeks
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision:

21 hours ago cache: reduce memory waste in struct namecache
mjg [Fri, 23 Oct 2020 15:56:22 +0000 (15:56 +0000)]
cache: reduce memory waste in struct namecache

The previous scheme for calculating the total size was doing sizeof
on the struct and then adding the wanted space for the buffer.

nc_name is at offset 58 while sizeof(struct namecache) is 64.
With CACHE_PATH_CUTOFF of 39 bytes and 1 byte of padding we were
allocating 104 bytes for the entry and never accounting for the 6
byte padding, wasting that space.
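
The arithmetic above can be reproduced with a small sketch. The struct below
is a hypothetical stand-in (struct demo_namecache, DEMO_PATH_CUTOFF), not the
kernel's struct namecache, and using offsetof is an assumption about how such
waste is typically avoided rather than a quote of the actual change. On a
typical LP64 ABI the name field lands at offset 58 and sizeof is 64, so the
two functions are expected to return 104 and 98 there.

```c
#include <stddef.h>

#define DEMO_PATH_CUTOFF 39	/* mirrors the 39-byte cutoff quoted above */

/* Hypothetical layout: pointer-sized fields followed by two byte fields. */
struct demo_namecache {
	void *nc_links[7];
	unsigned char nc_flag;
	unsigned char nc_nlen;
	char nc_name[];		/* name buffer starts right after nc_nlen */
};

/* Old scheme: sizeof counts the trailing padding, then the buffer is added. */
static size_t
demo_size_old(void)
{
	return (sizeof(struct demo_namecache) + DEMO_PATH_CUTOFF + 1);
}

/* Alternative: start counting where the buffer actually begins. */
static size_t
demo_size_new(void)
{
	return (offsetof(struct demo_namecache, nc_name) +
	    DEMO_PATH_CUTOFF + 1);
}
```

The difference between the two results is exactly the trailing padding of the
struct, which the old calculation allocated but never used.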

21 hours ago vfs: drop spurious cache_purge on rmdir
mjg [Fri, 23 Oct 2020 15:50:49 +0000 (15:50 +0000)]
vfs: drop spurious cache_purge on rmdir

The removed directory gets cache_purged which is sufficient to remove any entries
related to the parent.

Note only tmpfs, ufs and zfs are patched.

21 hours ago vfs: stop taking the interlock in vnode reclaim
mjg [Fri, 23 Oct 2020 15:49:18 +0000 (15:49 +0000)]
vfs: stop taking the interlock in vnode reclaim

It no longer protects any of the tested fields, keeping all the checks racy.

While here make vtryrecycle drop the vnode on its own. Avoids an additional
lock trip.

21 hours ago ntb: Fix the 32-bit build after r366969
markj [Fri, 23 Oct 2020 15:12:06 +0000 (15:12 +0000)]
ntb: Fix the 32-bit build after r366969

Reported by: Jenkins
MFC with: r366969

22 hours ago rtsold: Remove an incorrect __unused annotation
markj [Fri, 23 Oct 2020 14:56:17 +0000 (14:56 +0000)]
rtsold: Remove an incorrect __unused annotation

MFC after: 1 week

22 hours ago Add some missing nv(9) MLINKS
markj [Fri, 23 Oct 2020 14:25:48 +0000 (14:25 +0000)]
Add some missing nv(9) MLINKS

MFC after: 1 week

22 hours ago ntb: Add Intel Xeon Gen3 support
markj [Fri, 23 Oct 2020 14:16:52 +0000 (14:16 +0000)]
ntb: Add Intel Xeon Gen3 support

The NTB hardware starting with Skylake has some changes to the register
map and the doorbell interface.  Add a new NTB_XEON_GEN3 device type and
use it to conditionalize driver logic that differs from the existing
Xeon code.

Reviewed by: vangyzen
Discussed with: cem, Bret Ketchum <>
MFC after: 1 month
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision:

22 hours ago ntb: Fix an assertion to permit >= 32 doorbells
markj [Fri, 23 Oct 2020 14:15:58 +0000 (14:15 +0000)]
ntb: Fix an assertion to permit >= 32 doorbells

MFC after: 1 week
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.

25 hours ago Improve prctl(2) debug.
trasz [Fri, 23 Oct 2020 12:00:30 +0000 (12:00 +0000)]
Improve prctl(2) debug.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:

25 hours ago Add /proc/sys/kernel/ngroups_max to linprocfs(4). The id(1) command
trasz [Fri, 23 Oct 2020 11:57:55 +0000 (11:57 +0000)]
Add /proc/sys/kernel/ngroups_max to linprocfs(4).  The id(1) command
seems to use it - it works fine without it, but still.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:

27 hours ago Update calendar man-page to mention the search path added in r366962.
se [Fri, 23 Oct 2020 10:00:56 +0000 (10:00 +0000)]
Update calendar man-page to mention the search path added in r366962.

Calendar files in /usr/local/share/calendar take precedence over files in
the base system. They can be provided by a port or package, but since such
a port has not been committed yet, no specific port name is suggested.

In fact, multiple ports could exist (e.g. per locale) without conflicting
with each other.

27 hours ago Add search of LOCALBASE/share/calendar for calendars supplied by a port.
se [Fri, 23 Oct 2020 09:22:23 +0000 (09:22 +0000)]
Add search of LOCALBASE/share/calendar for calendars supplied by a port.

Calendar files in LOCALBASE override similarly named ones in the base
system. This could easily be changed if the base system calendars should
have precedence, but it could lead to a violation of POLA since then the
port's files would be ignored unless those in base have been deleted.

There was no definition of _PATH_LOCALBASE in paths.h, but verbatim uses
of /usr/local existed for _PATH_DEFPATH. Use _PATH_LOCALBASE here to ease
a consistent modification of this prefix.

Reviewed by: imp, pfg
Differential Revision:

28 hours ago Fix for loading cuse.ko via rc.d. Make sure we declare the cuse(3)
hselasky [Fri, 23 Oct 2020 08:44:53 +0000 (08:44 +0000)]
Fix for loading cuse.ko via rc.d. Make sure we declare the cuse(3)
module by name and not only by the version information, so that
"kldstat -q -m cuse" works.

Found by: Goran Mekic <>
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

30 hours ago Conditionally compile struct vm_phys_seg's md_first field. This field is
alc [Fri, 23 Oct 2020 06:24:38 +0000 (06:24 +0000)]
Conditionally compile struct vm_phys_seg's md_first field.  This field is
only used by arm64's pmap.

Reviewed by: kib, markj, scottph
Differential Revision:

34 hours ago cxgbe(4): Fix min/max typo in r366958.
np [Fri, 23 Oct 2020 02:24:43 +0000 (02:24 +0000)]
cxgbe(4):  Fix min/max typo in r366958.

35 hours ago cxgbe(4): refine the values reported in if_ratelimit_query.
np [Fri, 23 Oct 2020 01:36:54 +0000 (01:36 +0000)]
cxgbe(4): refine the values reported in if_ratelimit_query.

- Get the number of classes from chip_params.
- Get the number of ethofld tids from the firmware.
- Do not let tcp_ratelimit allocate all traffic classes.

Sponsored by: Chelsio Communications

36 hours ago Handle CPL_RX_DATA on active TLS sockets.
jhb [Fri, 23 Oct 2020 00:23:54 +0000 (00:23 +0000)]
Handle CPL_RX_DATA on active TLS sockets.

In certain edge cases, the NIC might have only received a partial TLS
record which it needs to return to the driver.  For example, if the
local socket was closed while data was still in flight, a partial TLS
record might be pending when the connection is closed.  Receiving a
RST in the middle of a TLS record is another example.  When this
happens, the firmware returns the partial TLS record as plain TCP
data via CPL_RX_DATA.  Handle these requests by returning an error to
OpenSSL (via so_error for KTLS or via an error TLS record header for
the older Chelsio OpenSSL interface).

Reported by: Sony Arpita Das @ Chelsio
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision:

40 hours ago Negotiate iSCSIProtocolLevel of 2 (RFC 7144) in initiator.
mav [Thu, 22 Oct 2020 20:26:27 +0000 (20:26 +0000)]
Negotiate iSCSIProtocolLevel of 2 (RFC 7144) in initiator.

It does not change anything immediately, but allows further support of
Command Priority, Status Qualifier and new task management functions.

MFC after: 1 month
Sponsored by: iXsystems, Inc.

40 hours ago netmap: fix mutex double unlock bug
vmaffione [Thu, 22 Oct 2020 20:21:11 +0000 (20:21 +0000)]
netmap: fix mutex double unlock bug

Submitted by:  brian90013
MFC after: 3 days

41 hours ago loader: revert r342161 and r342151
tsoome [Thu, 22 Oct 2020 20:02:02 +0000 (20:02 +0000)]
loader: revert r342161 and r342151

We are using asize property from pool label and we do not depend
on partition data to find last two pool labels and to validate LBA for disk IO.

This does allow us to re-enable support for partitionless disk setups.

41 hours ago vfs: prevent avoidable evictions on mkdir of existing directories
mjg [Thu, 22 Oct 2020 19:28:12 +0000 (19:28 +0000)]
vfs: prevent avoidable evictions on mkdir of existing directories

mkdir -p /foo/bar/baz will mkdir each path component and ignore EEXIST.

The NOCACHE lookup will make the namecache unnecessarily evict the existing entry,
and then fall back to the fs lookup routine, eventually leading namei to return an
error as the directory is already there.

For invocations like mkdir -p /usr/obj/usr/src/sys/GENERIC/modules this triggers
fallbacks to the slowpath for concurrently executing lookups.

Tested by: pho
Discussed with: kib

41 hours ago stablerestart(5): Fix some issues reported by mandoc
gbe [Thu, 22 Oct 2020 19:25:01 +0000 (19:25 +0000)]
stablerestart(5): Fix some issues reported by mandoc

- New sentence, new line

41 hours ago cache: assert the created entry does not point to itself
mjg [Thu, 22 Oct 2020 19:22:34 +0000 (19:22 +0000)]
cache: assert the created entry does not point to itself

41 hours ago pnfsserver(4): Fix some issues reported by mandoc
gbe [Thu, 22 Oct 2020 19:19:42 +0000 (19:19 +0000)]
pnfsserver(4): Fix some issues reported by mandoc

- new sentence, new line

42 hours ago socket(9): Remove duplicate word 'is is'
gbe [Thu, 22 Oct 2020 18:45:49 +0000 (18:45 +0000)]
socket(9): Remove duplicate word 'is is'

MFC after: 1 week

43 hours ago Fix typo
glebius [Thu, 22 Oct 2020 18:00:07 +0000 (18:00 +0000)]
Fix typo

43 hours ago Micro-optimize uma_small_alloc(). Replace bzero(..., PAGE_SIZE) by
alc [Thu, 22 Oct 2020 17:47:51 +0000 (17:47 +0000)]
Micro-optimize uma_small_alloc().  Replace bzero(..., PAGE_SIZE) by
pagezero().  Ultimately, they use the same method for bulk zeroing, but
the generality of bzero() requires size and alignment checks that
pagezero() does not.

Eliminate an unnecessary #include.

Reviewed by: emaste, markj
MFC after: 1 week
Differential Revision:

43 hours ago Add a new CCP device ID found on my Ryzen 5 3600XT.
jkim [Thu, 22 Oct 2020 17:46:55 +0000 (17:46 +0000)]
Add a new CCP device ID found on my Ryzen 5 3600XT.

MFC after: 1 week

43 hours ago if_vxlan(4): csum_flags_to_inner_flags takes the tunnel protocol as a parameter.
np [Thu, 22 Oct 2020 17:05:55 +0000 (17:05 +0000)]
if_vxlan(4): csum_flags_to_inner_flags takes the tunnel protocol as a parameter.

No functional change.

2 days ago Compile fix for MIPS, MIPS64, POWERPC and POWERPC64.
hselasky [Thu, 22 Oct 2020 12:22:08 +0000 (12:22 +0000)]
Compile fix for MIPS, MIPS64, POWERPC and POWERPC64.
Add missing include files.

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago Fix for colliding change (r366917).
hselasky [Thu, 22 Oct 2020 10:36:16 +0000 (10:36 +0000)]
Fix for colliding change (r366917).

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago Fix for monolithic kernel builds using device lagg(4).
hselasky [Thu, 22 Oct 2020 10:29:27 +0000 (10:29 +0000)]
Fix for monolithic kernel builds using device lagg(4).

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago Add support for IP over infiniband, IPoIB, to lagg(4). Currently only
hselasky [Thu, 22 Oct 2020 09:47:12 +0000 (09:47 +0000)]
Add support for IP over infiniband, IPoIB, to lagg(4). Currently only
the failover protocol is supported due to limitations in the IPoIB
architecture. Refer to the lagg(4) manual page for how to configure
and use this new feature. A new network interface type,
IFT_INFINIBANDLAG, has been added, similar to the existing

ifconfig(8) has been updated to accept a new laggtype argument when
creating lagg(4) network interfaces. This new argument is used to
distinguish between ethernet and infiniband type of lagg(4) network
interface. The laggtype argument is optional and defaults to
ethernet. The lagg(4) command line syntax is backwards compatible.

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago sysv_sem: semusz depends on semume.
kib [Thu, 22 Oct 2020 09:28:11 +0000 (09:28 +0000)]
sysv_sem: semusz depends on semume.

Size of the per-process semaphore undo structure (semusz) depends on
the number of the per-process undos.  If kern.ipc.semume is adjusted,
semusz must be adjusted as well, and it makes no sense to delegate
adjustment to user.  Make it automatic.
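
The dependency can be illustrated with a small sketch: a fixed header followed
by one entry per allowed undo, rounded up to a word boundary. The struct
layouts and the demo_semusz() helper below are hypothetical and only mirror
the relationship described above; they are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical per-semaphore undo entry. */
struct demo_undo_ent {
	short un_adjval;		/* adjustment applied on exit */
	short un_num;			/* semaphore number */
	int un_id;			/* semaphore set id */
	unsigned short un_seq;		/* sequence number */
};

/* Hypothetical per-process undo structure with a trailing entry array. */
struct demo_sem_undo {
	void *un_next;
	void *un_proc;
	short un_cnt;			/* entries in use */
	struct demo_undo_ent un_ent[1];	/* really semume entries */
};

/* Derive the structure size from the number of per-process undos. */
static size_t
demo_semusz(size_t semume)
{
	size_t raw = offsetof(struct demo_sem_undo, un_ent) +
	    semume * sizeof(struct demo_undo_ent);

	/* Round up to a multiple of sizeof(long). */
	return ((raw + sizeof(long) - 1) & ~(sizeof(long) - 1));
}
```

Because the size is a pure function of the entry count, recomputing it
whenever the count changes keeps the two values consistent without user
intervention.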

Reported and tested by: Olef <>
PR: 250361
Reviewed by: jhb, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision:

2 days ago Implement mbuf hashing routines for IP over infiniband, IPoIB.
hselasky [Thu, 22 Oct 2020 09:17:56 +0000 (09:17 +0000)]
Implement mbuf hashing routines for IP over infiniband, IPoIB.
No functional change intended.

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago Factor out generic IP over infiniband, IPoIB, definitions and code
hselasky [Thu, 22 Oct 2020 09:09:53 +0000 (09:09 +0000)]
Factor out generic IP over infiniband, IPoIB, definitions and code
into net/if_infiniband.c and net/infiniband.h . No functional change

Differential Revision:
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking

2 days ago cxgbe(4): fix the size of the iq/eq maps.
np [Thu, 22 Oct 2020 08:40:25 +0000 (08:40 +0000)]
cxgbe(4): fix the size of the iq/eq maps.

The firmware can allocate ingress and egress context ids anywhere from
its configured range.  Size the iq/eq maps to match the entire range
instead of assuming that the firmware always allocates the first
available context id.

Reported by: Baptiste Wicht @ Verisign
MFC after: 1 week
Sponsored by: Chelsio Communications

2 days ago [hwpmc] Fix call chain capture for ARM64
gonzo [Thu, 22 Oct 2020 05:07:25 +0000 (05:07 +0000)]
[hwpmc] Fix call chain capture for ARM64

Use ELR register value instead of LR for PMC_TRAPFRAME_TO_PC macro since
it's the former that indicates PC of the interrupted execution thread.

This fixes a bug where pmcstat lost the leaf function of the call chain
and started with the second function in the chain.

Although this change is an improvement over the previous logic, there is still
a possibility of incomplete data: if the leaf function does not have stack
variables and does not call any other functions, the compiler would not generate
a stack frame for it and the FP value would point to the caller's frame, so
instead of the actual "caller1 -> caller2 -> leaf" chain only
"caller1 -> leaf" would be captured.

Sponsored by: Ampere Computing
Submitted by: Klara, Inc.

2 days ago [armv8crypto] Fix cryptodev probe logic in armv8crypto
gonzo [Thu, 22 Oct 2020 04:49:14 +0000 (04:49 +0000)]
[armv8crypto] Fix cryptodev probe logic in armv8crypto

Add missing break to prevent falling through to the default case statement
and returning EINVAL for all session configs.
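
The class of bug described above looks roughly like the following sketch. The
enum values and function names are hypothetical and chosen only for
illustration; the commit message does not list the affected session
parameters.

```c
#include <errno.h>

/* Hypothetical algorithm identifiers, for illustration only. */
enum demo_alg { DEMO_ALG_A, DEMO_ALG_B, DEMO_ALG_UNSUPPORTED };

/* Buggy variant: supported cases fall through into the default case. */
static int
demo_probe_buggy(enum demo_alg alg)
{
	switch (alg) {
	case DEMO_ALG_A:
	case DEMO_ALG_B:
		/* supported, but the missing break falls through */
	default:
		return (EINVAL);
	}
	return (0);
}

/* Fixed variant: the break lets supported cases reach the success path. */
static int
demo_probe_fixed(enum demo_alg alg)
{
	switch (alg) {
	case DEMO_ALG_A:
	case DEMO_ALG_B:
		break;
	default:
		return (EINVAL);
	}
	return (0);
}
```

With the break in place only unsupported configurations are rejected.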

Sponsored by: Ampere Computing
Submitted by: Klara, Inc.

2 days ago Pass lower 3 bits of sector_count for FPDMA commands.
mav [Thu, 22 Oct 2020 03:30:39 +0000 (03:30 +0000)]
Pass lower 3 bits of sector_count for FPDMA commands.

When this code was written those bits were N/A, but now the lowest bit
is Rebuild Assist Recovery Control (RARC).

MFC after: 1 month

2 days ago Import tzdata 2020c
philip [Thu, 22 Oct 2020 01:05:34 +0000 (01:05 +0000)]
Import tzdata 2020c


MFC after: 1 day

2 days ago mmap(2): Document guard size for MAP_STACK and related EINVAL.
kib [Wed, 21 Oct 2020 21:40:33 +0000 (21:40 +0000)]
mmap(2): Document guard size for MAP_STACK and related EINVAL.

Based on submission by: emaste
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision:

2 days ago Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q).
melifaro [Wed, 21 Oct 2020 21:28:20 +0000 (21:28 +0000)]
Add support for stacked VLANs (IEEE 802.1ad, AKA Q-in-Q).

802.1ad interfaces are created with ifconfig using the "vlanproto" parameter.
Eg., the following creates a 802.1Q VLAN (id #42) over a 802.1ad S-VLAN
(id #5) over a physical Ethernet interface (em0).

ifconfig vlan5 create vlandev em0 vlan 5 vlanproto 802.1ad up
ifconfig vlan42 create vlandev vlan5 vlan 42 inet

VLAN_MTU, VLAN_HWCSUM and VLAN_TSO capabilities should be properly
supported. VLAN_HWTAGGING is only partially supported, as there is
currently no IFCAP_VLAN_* denoting the possibility to set the VLAN
EtherType to anything else than 0x8100 (802.1ad uses 0x88A8).

Submitted by: Olivier Piras
Sponsored by: RG Nets
Differential Revision:

2 days ago cxgbe(4): display correct tid range for T6 based -SO cards.
np [Wed, 21 Oct 2020 20:42:29 +0000 (20:42 +0000)]
cxgbe(4): display correct tid range for T6 based -SO cards.

Reported by: Chelsio QA
MFC after: 1 week
Sponsored by: Chelsio Communications

2 days ago Make linux(4) warn about unsupported socket(2) types.
trasz [Wed, 21 Oct 2020 18:45:48 +0000 (18:45 +0000)]
Make linux(4) warn about unsupported socket(2) types.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:

2 days ago ntb_tool: ubuf is too small to hold a human readable 64 bit value
vangyzen [Wed, 21 Oct 2020 17:11:57 +0000 (17:11 +0000)]
ntb_tool: ubuf is too small to hold a human readable 64 bit value

The ubuf buffer is too small. It should be 18 bytes if a NUL terminator is
not needed, or 19 to hold the NUL terminator for the full 64-bit value plus
the 0x prefix.
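
The sizing can be checked with a short sketch: "0x" plus 16 hexadecimal digits
is 18 characters, and the terminator makes 19. DEMO_UBUF_LEN and
demo_format_u64() are hypothetical names used only for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* "0x" (2) + 16 hex digits for a 64-bit value + NUL terminator (1). */
#define DEMO_UBUF_LEN 19

/*
 * Format val as 0x-prefixed hexadecimal.  Returns what snprintf()
 * returns: the length the full string needs, excluding the NUL.
 */
static int
demo_format_u64(char *buf, size_t len, uint64_t val)
{
	return (snprintf(buf, len, "0x%" PRIx64, val));
}
```

A return value greater than or equal to the buffer length indicates that the
output was truncated, which is what happens with a buffer shorter than 19
bytes and the largest 64-bit value.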

Submitted by:
Reviewed by: markj mav
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision:

2 days ago col(1): Add EXAMPLES section
fernape [Wed, 21 Oct 2020 16:30:34 +0000 (16:30 +0000)]
col(1): Add EXAMPLES section

Add a small example.
Cross reference clean up for colcrt, nroff and tbl.

Reviewed by: gbe@, bcr@
Approved by: gbe@
Differential Revision:

2 days ago vmapbuf: don't smuggle address or length in buf
brooks [Wed, 21 Oct 2020 16:00:15 +0000 (16:00 +0000)]
vmapbuf: don't smuggle address or length in buf

Instead, add arguments to vmapbuf.  Since this argument is
always a pointer use a type of void * and cast to vm_offset_t in
vmapbuf.  (In CheriBSD we've altered vm_fault_quick_hold_pages to
take a pointer and check its bounds.)

In no other situation does b_data contain a user pointer and vmapbuf
replaces b_data with the actual mapping.

Suggested by: jhb
Reviewed by: imp, jhb
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision:

2 days ago Add dtrace SDT probe ipfw:::rule-matched.
ae [Wed, 21 Oct 2020 15:01:33 +0000 (15:01 +0000)]
Add dtrace SDT probe ipfw:::rule-matched.

It helps to reduce the complexity of debugging large ipfw rulesets.
Also define several constants and translators that can be used by
dtrace scripts with this probe.

Reviewed by: gnn
Obtained from: Yandex LLC
MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision:

3 days ago cache: drop the spurious slash_prefixed argument
mjg [Wed, 21 Oct 2020 05:57:25 +0000 (05:57 +0000)]
cache: drop the spurious slash_prefixed argument

3 days ago Move list_cloners to libifconfig
freqlabs [Wed, 21 Oct 2020 05:27:25 +0000 (05:27 +0000)]
Move list_cloners to libifconfig

Move list_cloners() from ifconfig(8) to libifconfig(3) where it can be
reused by other consumers.

Reviewed by: kp
Differential Revision:

3 days ago Improve FPU Tag Word reconstruction on i386 to indicate register states.
kib [Wed, 21 Oct 2020 00:15:12 +0000 (00:15 +0000)]
Improve FPU Tag Word reconstruction on i386 to indicate register states.

Improve the code reconstructing en_tw in struct fpreg32 from FXSAVE
results so that all register states are indicated correctly.  The
previous code unconditionally mapped non-empty register state to
'normalized value' constant.  The new code explicitly distinguishes
the 'zero value' and 'special value' constants as well.  This improves
consistency between real FSAVE and translation from FXSAVE, and
ensures that tests using PT_GETFPREGS can rely on a single correct
value independently of the underlying implementation.
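
The general technique can be sketched as follows. This is not the committed
code: the struct and function names are hypothetical, and the classification
follows the usual x87 tag encoding (00 valid, 01 zero, 10 special, 11 empty),
with the FXSAVE abridged tag indexed by physical register and the register
image stored in ST(i) order.

```c
#include <stdint.h>

/* Hypothetical view of one 80-bit x87 register from the FXSAVE image. */
struct demo_st_reg {
	uint64_t mantissa;	/* bit 63 is the explicit integer bit */
	uint16_t sign_exp;	/* bit 15 sign, bits 0-14 exponent */
};

enum { DEMO_TAG_VALID = 0, DEMO_TAG_ZERO = 1, DEMO_TAG_SPECIAL = 2,
    DEMO_TAG_EMPTY = 3 };

/* Classify a non-empty register by its exponent and significand. */
static unsigned
demo_classify(const struct demo_st_reg *r)
{
	unsigned exp = r->sign_exp & 0x7fff;

	if (exp == 0x7fff)		/* infinity or NaN */
		return (DEMO_TAG_SPECIAL);
	if (exp == 0)			/* zero or denormal */
		return (r->mantissa == 0 ? DEMO_TAG_ZERO : DEMO_TAG_SPECIAL);
	/* Normal numbers have the integer bit set; unnormals are special. */
	return ((r->mantissa >> 63) ? DEMO_TAG_VALID : DEMO_TAG_SPECIAL);
}

/* Expand the 8-bit abridged tag into the 16-bit FSAVE-style tag word. */
static uint16_t
demo_full_tag_word(uint8_t abridged, unsigned top,
    const struct demo_st_reg st[8])
{
	uint16_t tw = 0;

	for (unsigned phys = 0; phys < 8; phys++) {
		unsigned tag = DEMO_TAG_EMPTY;

		/* Physical register phys is ST((phys - top) mod 8). */
		if (abridged & (1u << phys))
			tag = demo_classify(&st[(phys - top) & 7]);
		tw |= (uint16_t)(tag << (2 * phys));
	}
	return (tw);
}
```

Mapping every non-empty register to the valid tag, as the previous code is
described as doing, corresponds to skipping demo_classify() entirely.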

PR: 250454
Sponsored by: The FreeBSD Foundation
Obtained from: Moritz Systems
Submitted by: Michał Górny <>
Discussed with: emaste
MFC after: 1 week
Differential revision:

3 days ago geom_ctl.c: remove stale header files
rew [Tue, 20 Oct 2020 20:59:13 +0000 (20:59 +0000)]
geom_ctl.c: remove stale header files

- Remove "opt_geom.h", no kernel options are used.

- Remove <sys/sysctl.h>, no sysctl functionality is used here.

- Remove <sys/bio.h>, requirements for bio moved out in r112534.

- Remove <sys/lock.h> and <sys/mutex.h>, last used by DROP_GIANT() and
  PICKUP_GIANT(), which were removed in r115624.

- Remove <sys/disk.h> and <sys/kernel.h>, not used.

Reviewed by: phk, kevans (mentor)
Approved by: phk, kevans (mentor)
Differential Revision:

3 days ago arm64: add uhci to GENERIC
emaste [Tue, 20 Oct 2020 20:11:29 +0000 (20:11 +0000)]
arm64: add uhci to GENERIC

uhci is (or, can be) used by VMware ESXi-Arm.

PR: 250308
Reported by: Vincent Milum Jr
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

3 days ago Add a kernel crypto driver using assembly routines from OpenSSL.
jhb [Tue, 20 Oct 2020 17:50:18 +0000 (17:50 +0000)]
Add a kernel crypto driver using assembly routines from OpenSSL.

Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386.  It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.

Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.

Reviewed by: gallatin, jkim, delphij, gnn
Sponsored by: Netflix
Differential Revision:

3 days ago Fix linprocfs(4) /proc/self/mem semantics to more closely match Linux.
trasz [Tue, 20 Oct 2020 17:24:29 +0000 (17:24 +0000)]
Fix linprocfs(4) /proc/self/mem semantics to more closely match Linux.
Steam's Anti-Cheat might depend on it.

PR: 248223
Analyzed by: Alex S <>
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:

3 days ago Fix potential race condition in linux stat(2).
trasz [Tue, 20 Oct 2020 17:19:10 +0000 (17:19 +0000)]
Fix potential race condition in linux stat(2).

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:

3 days ago Move generated OpenSSL assembly routines into the kernel sources.
jhb [Tue, 20 Oct 2020 17:00:43 +0000 (17:00 +0000)]
Move generated OpenSSL assembly routines into the kernel sources.

Sponsored by: Netflix

3 days ago Use a template assembly file to generate the embedded MFS.
jhb [Tue, 20 Oct 2020 16:48:45 +0000 (16:48 +0000)]
Use a template assembly file to generate the embedded MFS.

This uses the .incbin directive to pull in the MFS image contents.
Using assembly directly ensures that symbols can be defined with the
name and properties (such as .size) desired without having to rename
symbols, etc. via a second objcopy invocation.  Since it is compiled
by the C compiler driver, it also avoids the need for all of the
EMBEDFS* make variables.

Suggested by: jrtc27
Reviewed by: kib, markj
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision:

3 days ago realpath(1): Add EXAMPLES section.
fernape [Tue, 20 Oct 2020 13:15:26 +0000 (13:15 +0000)]
realpath(1): Add EXAMPLES section.

Add a small example for this simple command.

Approved by: manpages (gbe@)
Differential Revision:

3 days ago compress(1): Add EXAMPLES section
fernape [Tue, 20 Oct 2020 13:05:25 +0000 (13:05 +0000)]
compress(1): Add EXAMPLES section

Add 5 examples showing basic usage.

Approved by: manpages (gbe@)
Differential Revision:

4 days ago ufs: catch up with removal of thread argument from VOP_INACTIVE
mjg [Tue, 20 Oct 2020 09:46:20 +0000 (09:46 +0000)]
ufs: catch up with removal of thread argument from VOP_INACTIVE

4 days ago Bump __FreeBSD_version after VOP VPTOCNP and INACTIVE changes
mjg [Tue, 20 Oct 2020 07:19:44 +0000 (07:19 +0000)]
Bump __FreeBSD_version after VOP VPTOCNP and INACTIVE changes

4 days ago vfs: drop the de facto curthread argument from VOP_INACTIVE
mjg [Tue, 20 Oct 2020 07:19:03 +0000 (07:19 +0000)]
vfs: drop the de facto curthread argument from VOP_INACTIVE

4 days ago vfs: drop spurious cred argument from VOP_VPTOCNP
mjg [Tue, 20 Oct 2020 07:18:27 +0000 (07:18 +0000)]
vfs: drop spurious cred argument from VOP_VPTOCNP

4 days ago Further refinements of ptsname_r(3) interface:
delphij [Tue, 20 Oct 2020 01:29:45 +0000 (01:29 +0000)]
Further refinements of ptsname_r(3) interface:

 - Hide ptsname_r under __BSD_VISIBLE for now as the specification
   is not finalized at this time.
 - Keep sorted.
 - Prevent a user application that interposes ptsname_r(3) from
   breaking ptsname(3) by making the implementation a static
   function and calling that static function from ptsname(3) instead.
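
The last item can be sketched as follows. The names (demo_name_impl,
demo_name_r, demo_name) and the path format are hypothetical; the point is
only that both public entry points call a file-local helper, so replacing the
reentrant function in an application cannot change what the non-reentrant one
does.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* File-local implementation: not visible to, or replaceable by, callers. */
static int
demo_name_impl(int id, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/demo/%d", id);

	if (n < 0 || (size_t)n >= len)
		return (ERANGE);
	return (0);
}

/* Public reentrant interface: a thin wrapper around the helper. */
int
demo_name_r(int id, char *buf, size_t len)
{
	return (demo_name_impl(id, buf, len));
}

/* Public non-reentrant interface: calls the helper, not demo_name_r(). */
char *
demo_name(int id)
{
	static char buf[32];

	return (demo_name_impl(id, buf, sizeof(buf)) == 0 ? buf : NULL);
}
```

If demo_name() called demo_name_r() directly, an application-provided symbol
of the same name could be picked up instead of the library's own.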

Reported by: kib
Reviewed by: kib, jilles
MFC after: 2 weeks
Differential Revision:

4 days ago Fix build: only set iommu buswide flag if IOMMU code is included.
br [Mon, 19 Oct 2020 22:32:36 +0000 (22:32 +0000)]
Fix build: only set iommu buswide flag if IOMMU code is included.

Sponsored by: Innovate DSbD

4 days ago Add IOMMU_BUSWIDE ahci quirk.
br [Mon, 19 Oct 2020 21:27:27 +0000 (21:27 +0000)]
Add IOMMU_BUSWIDE ahci quirk.

Some controllers use PCI function 1 as the requester ID for DMA transfers,
but the controllers are not PCI multifunction.

Set the iommu buswide flag for them. This should instruct an IOMMU driver
to use the same translation rule for all the devices and functions of
a bus.

This was discovered on the ARM Neoverse N1 System Development Platform

Bug reference:

Reported by: andrew
Reviewed by: kib, mav
Sponsored by: Innovate DSbD
Differential Revision:

4 days ago cxgbe(4): Updates to the drop features from r366532.
np [Mon, 19 Oct 2020 21:11:49 +0000 (21:11 +0000)]
cxgbe(4): Updates to the drop features from r366532.

MFC after: 1 week
Sponsored by: Chelsio Communications

4 days ago build vmware modules on arm64
emaste [Mon, 19 Oct 2020 20:43:29 +0000 (20:43 +0000)]
build vmware modules on arm64

pvscsi and vmxnet3 build and work.  Exclude vmci for now as it contains
x86-specific assembly.

Reported by: Vincent Milum Jr
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

4 days ago Destroy cloned interfaces at netif stop, netif restart and shutdown.
cy [Mon, 19 Oct 2020 20:37:38 +0000 (20:37 +0000)]
Destroy cloned interfaces at netif stop, netif restart and shutdown.
This is especially important during shutdown because a child interface
of lagg with WOL enabled will not enable WOL at interface shutdown and
thus no WOL to wake up the device (and machine).

PR: 158734, 109980
Reported by: Antonio Huete Jimenez <tuxillo at>
Marat N.Afanasyev <marat at>
Reviewed by: kp
MFC after: 1 week
Differential Revision:

4 days ago Fix fallout from r366811.
trasz [Mon, 19 Oct 2020 20:26:37 +0000 (20:26 +0000)]
Fix fallout from r366811.

PR: 250442
Reported by: lwhsu
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision:

4 days ago Re-enable receive flow control for TOE TLS sockets.
jhb [Mon, 19 Oct 2020 20:08:50 +0000 (20:08 +0000)]
Re-enable receive flow control for TOE TLS sockets.

Flow control was disabled during initial TOE TLS development to
work around a hang (and to match the Linux TOE TLS support for T6).
The rest of the TOE TLS code maintained credits as if flow control was
enabled which was inherited from before the workaround was added with
the exception that the receive window was allowed to go negative.
This negative receive window handling (rcv_over) was because I hadn't
realized the full implications of disabling flow control.

To clean this up, re-enable flow control on TOE TLS sockets.  The
existing TPF_FORCE_CREDITS workaround is sufficient for the original
hang.  Now that flow control is enabled, remove the rcv_over
workaround and instead assert that the receive window never goes
negative, matching plain TCP TOE sockets.

Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision:

4 days ago cxgbe(4): Fix page fault in t4_get_lb_stats with 2 port T5 cards.
np [Mon, 19 Oct 2020 20:08:47 +0000 (20:08 +0000)]
cxgbe(4): Fix page fault in t4_get_lb_stats with 2 port T5 cards.

PR: 250449
Reported by: freqlabs@
MFC after: 1 week
Sponsored by: Chelsio Communications

4 days ago Fix a couple of bugs for asym crypto introduced in r359374.
jhb [Mon, 19 Oct 2020 20:04:03 +0000 (20:04 +0000)]
Fix a couple of bugs for asym crypto introduced in r359374.

- Check for null pointers in the crypto_drivers[] array when checking
  for empty slots in crypto_select_kdriver().

- Handle the case where crypto_kdone() is invoked on a request where
  krq_cap is NULL due to not finding a matching driver.

Reviewed by: markj
Sponsored by: Chelsio Communications
Differential Revision:
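
A minimal sketch of the two fixes described above, not the kernel code. All
names (`demo_cap`, `demo_req`, `demo_select_driver`, `demo_done`) are
hypothetical stand-ins. It shows skipping NULL slots before reading a driver
entry, and completing a request whose driver pointer was never set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-driver record; not the kernel's type. */
struct demo_cap {
    int supports_op;    /* nonzero if the driver handles the requested op */
    int inflight;       /* requests currently owned by this driver */
};

/* Hypothetical request; cap stays NULL when no driver matched. */
struct demo_req {
    struct demo_cap *cap;
    int done;
};

/*
 * Return the index of the first registered driver that supports the
 * operation, or -1 if there is none. Empty slots are NULL and are
 * skipped before any field is dereferenced.
 */
int
demo_select_driver(struct demo_cap *drivers[], size_t n)
{
    assert(drivers != NULL);
    for (size_t i = 0; i < n; i++) {
        if (drivers[i] == NULL)
            continue;
        if (drivers[i]->supports_op)
            return ((int)i);
    }
    return (-1);
}

/* Complete a request, tolerating the case where no driver was found. */
void
demo_done(struct demo_req *req)
{
    assert(req != NULL);
    if (req->cap != NULL)
        req->cap->inflight--;
    req->done = 1;
}
```

With slots `{NULL, &a, NULL, &b}` where only `b` supports the operation, the
lookup returns index 3 without touching the empty slots.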

4 days ago Enable SUBDIR_PARALLEL for lib/googletest
arichardson [Mon, 19 Oct 2020 19:51:03 +0000 (19:51 +0000)]
Enable SUBDIR_PARALLEL for lib/googletest

This saves a few seconds in a parallel build since we can build the
gtest_main and gmock subdirectories in parallel.

Reviewed By: ngie
Differential Revision:

4 days ago Major improvement to build parallelism for googletest internal tests
arichardson [Mon, 19 Oct 2020 19:50:57 +0000 (19:50 +0000)]
Major improvement to build parallelism for googletest internal tests

Currently the googletest internal tests build after the matching library.
However, each of these is serialized at the top-level makefile.
Additionally, some of the tests (e.g. the gmock-matches-test) take up to
90 seconds to build with clang -O2. Having to wait for this test to
complete before continuing to the next directory seriously limits the
parallelism of a -j32 build.
Before this change running `make -C lib/googletest -j32 -s` in buildenv
took 202 seconds; now it takes 153 due to improved parallelism.

Reviewed By: emaste (no objection)
Differential Revision:

4 days ago nullfs: ensure correct lock is taken after bypass.
kib [Mon, 19 Oct 2020 19:23:22 +0000 (19:23 +0000)]
nullfs: ensure correct lock is taken after bypass.

If the lower VOP relocked the lower vnode, it is possible that the nullfs
vnode was reclaimed in the meantime.  In this case the nullfs vnode no
longer shares a lock with the lower vnode, which breaks the locking protocol.

Check for the condition and acquire nullfs vnode lock if detected.

Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 days ago vgonel(): avoid recursing into VOP_INACTIVE().
kib [Mon, 19 Oct 2020 19:20:23 +0000 (19:20 +0000)]
vgonel(): avoid recursing into VOP_INACTIVE().

It is a common pattern for filesystems' VOP_INACTIVE() implementations
to forcibly reclaim the vnode when its state is final.  For instance,
a UFS vnode with zero link count is removed, and since it is
inactivated, the last open reference on it is dropped.

On the other hand, a vnode might get a spurious usecount reference for
many reasons.  If the spurious reference exists while vgonel() checks
for the active state of the vnode, it would recurse into VOP_INACTIVE().

Fix it by checking and not doing inactivation when vgone() was called
from the inactive VOP.

Reported and tested by: pho
Discussed with: mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
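
A minimal sketch of the recursion guard described above, under the assumption
that a per-vnode marker records "inactivation in progress". The struct, its
fields and both functions are hypothetical and do not mirror the real vnode
layout or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified vnode. */
struct demo_vnode {
    unsigned usecount;      /* may include a spurious reference */
    bool in_inactive;       /* set while the inactive routine runs */
    bool reclaimed;
    int inactive_calls;     /* counts inactivations, for illustration */
};

void demo_vgone(struct demo_vnode *vp);

/* Filesystem inactive routine that reclaims a vnode in its final state. */
void
demo_inactive(struct demo_vnode *vp)
{
    assert(vp != NULL);
    vp->in_inactive = true;
    vp->inactive_calls++;
    demo_vgone(vp);         /* forcibly reclaim from within inactive */
    vp->in_inactive = false;
}

/* Reclaim the vnode, inactivating first only when not already doing so. */
void
demo_vgone(struct demo_vnode *vp)
{
    assert(vp != NULL);
    if (vp->reclaimed)
        return;
    /* The guard: a held usecount alone must not trigger re-entry. */
    if (vp->usecount > 0 && !vp->in_inactive) {
        demo_inactive(vp);
        if (vp->reclaimed)
            return;
    }
    vp->reclaimed = true;
}
```

Without the `in_inactive` test, a vnode with a nonzero usecount would bounce
between the two functions indefinitely in this model; with it, inactivation
runs exactly once.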

4 days ago uma: fix KTR message after r366840
emaste [Mon, 19 Oct 2020 18:54:44 +0000 (18:54 +0000)]
uma: fix KTR message after r366840

Reported by: bz
Sponsored by: The FreeBSD Foundation

4 days ago cache: promote negative entries based on more than one hit
mjg [Mon, 19 Oct 2020 18:51:51 +0000 (18:51 +0000)]
cache: promote negative entries based on more than one hit

During tinderbox and similar workloads negative entries get at least one
hit before they get evicted. In the current scheme this avoidably promotes

Be conservative and stick to 2 hits for now.
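
A minimal sketch of hit-count-based promotion, assuming a per-entry hit
counter and a "promoted" marker. The names `demo_negentry`, `demo_neg_hit`
and `DEMO_PROMOTE_HITS` are hypothetical; only the threshold of 2 comes from
the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PROMOTE_HITS 2     /* "stick to 2 hits for now" */

/* Hypothetical negative cache entry. */
struct demo_negentry {
    unsigned hits;      /* lookups that found this entry */
    bool promoted;      /* true once the entry has been promoted */
};

/*
 * Record one hit. Returns true only for the hit that causes promotion,
 * so a single hit before eviction never promotes the entry.
 */
bool
demo_neg_hit(struct demo_negentry *ne)
{
    assert(ne != NULL);
    if (ne->promoted)
        return (false);
    ne->hits++;
    if (ne->hits >= DEMO_PROMOTE_HITS) {
        ne->promoted = true;
        return (true);
    }
    return (false);
}
```

An entry that is hit once and then evicted stays unpromoted; the second hit
is the one that promotes it.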

4 days ago Check TF_TOE not the tod pointer to determine if TOE is active.
jhb [Mon, 19 Oct 2020 18:24:06 +0000 (18:24 +0000)]
Check TF_TOE not the tod pointer to determine if TOE is active.

The TF_TOE flag is the check used in the rest of the network stack to
determine if TOE is active on a socket.  There is at least one path in
the cxgbe(4) TOE driver that can leave the tod pointer non-NULL on a
socket not using TOE.

Reported by: Sony Arpita Das <>
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision:
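
A minimal sketch of why the flag is the safer test, assuming a control block
with a flags word and an offload-device pointer. `demo_tcpcb`, `DEMO_TF_TOE`
(including its value) and `demo_toe_active` are hypothetical; the real
structure and flag value differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_TF_TOE 0x1u        /* hypothetical flag bit */

/* Hypothetical, simplified control block. */
struct demo_tcpcb {
    unsigned t_flags;
    void *tod;          /* may be left non-NULL on a non-TOE socket */
};

/* TOE is active only when the flag says so; the pointer is not consulted. */
bool
demo_toe_active(const struct demo_tcpcb *tp)
{
    assert(tp != NULL);
    return ((tp->t_flags & DEMO_TF_TOE) != 0);
}
```

A control block with a stale non-NULL `tod` and the flag clear is reported as
not offloaded, which a pointer test would get wrong.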

4 days ago Mark asymmetric cryptography via OCF deprecated for 14.0.
jhb [Mon, 19 Oct 2020 18:21:41 +0000 (18:21 +0000)]
Mark asymmetric cryptography via OCF deprecated for 14.0.

Only one MIPS-specific driver implements support for one of the
asymmetric operations.  There are no in-kernel users besides
/dev/crypto.  The only known user of the /dev/crypto interface was the
engine in OpenSSL releases before 1.1.0.  1.1.0 includes a rewritten
engine that does not use the asymmetric operations due to lack of

Reviewed by: cem, markj
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision:

4 days ago Properly clear PCB_KERNNPX in fpu_kern_leave().
jhb [Mon, 19 Oct 2020 17:35:45 +0000 (17:35 +0000)]
Properly clear PCB_KERNNPX in fpu_kern_leave().

PR: 250423
Reported by: CI
Tested by: lwhsu

4 days ago icmp6: Count packets dropped due to an invalid hop limit
markj [Mon, 19 Oct 2020 17:07:19 +0000 (17:07 +0000)]
icmp6: Count packets dropped due to an invalid hop limit

Pad the icmp6stat structure so that we can add more counters in the
future without breaking compatibility again, last done in r358620.
Annotate the rarely executed error paths with __predict_false while

Reviewed by: bz, melifaro
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision:
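
A minimal sketch of the two ideas mentioned above: padding a statistics
structure with spare slots so later counters do not change its size, and
marking a rare error path as unlikely. The struct, the counter name, the
number of spare slots and the required hop limit of 255 are illustrative
assumptions; `DEMO_PREDICT_FALSE` is a local stand-in for `__predict_false`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define DEMO_PREDICT_FALSE(e) __builtin_expect((e) != 0, 0)
#else
#define DEMO_PREDICT_FALSE(e) ((e) != 0)
#endif

/* Hypothetical stats block with room reserved for future counters. */
struct demo_icmp6stat {
    uint64_t badhlim;       /* packets dropped for an invalid hop limit */
    uint64_t spare[4];      /* a new counter takes over a spare slot */
};

/* The size is part of the userland ABI in this model, so pin it. */
_Static_assert(sizeof(struct demo_icmp6stat) == 5 * sizeof(uint64_t),
    "demo_icmp6stat size must not change");

/* Return 0 if the hop limit is acceptable, -1 (and count it) otherwise. */
int
demo_check_hlim(struct demo_icmp6stat *st, int hlim)
{
    assert(st != NULL);
    if (DEMO_PREDICT_FALSE(hlim != 255)) {
        st->badhlim++;
        return (-1);
    }
    return (0);
}
```

Adding a counter later would shrink `spare` by one element, leaving the
static assertion and existing consumers of the structure intact.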

4 days ago link_elf_obj: Colour VM objects
markj [Mon, 19 Oct 2020 16:57:59 +0000 (16:57 +0000)]
link_elf_obj: Colour VM objects

This will cause the VM to back sufficiently large .text sections, such
as those in zfs.ko or amdgpu.ko on amd64, with superpage mappings when

Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision:

4 days ago uma: Respect uk_reserve in keg_drain()
markj [Mon, 19 Oct 2020 16:57:40 +0000 (16:57 +0000)]
uma: Respect uk_reserve in keg_drain()

When a reserve of free items is configured for a zone, the reserve must
not be reclaimed under memory pressure.  Modify keg_drain() to simply
respect the reserved pool.

While here remove an always-false uk_freef == NULL check (kegs that
shouldn't be drained should set _NOFREE instead), and make sure that the
keg_drain() KTR statement does not reference an uninitialized variable.

Reviewed by: alc, rlibby
Sponsored by: The FreeBSD Foundation
Differential Revision:
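
A minimal sketch of the reserve rule described above, counting in items
rather than slabs for simplicity. `demo_keg`, its fields and
`demo_keg_drain` are hypothetical and only model the arithmetic, not the
actual keg_drain() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified keg. */
struct demo_keg {
    size_t nfree;       /* free items currently cached */
    size_t reserve;     /* items that must survive reclamation */
};

/*
 * Release free items under memory pressure but never dip into the
 * reserve. Returns the number of items released.
 */
size_t
demo_keg_drain(struct demo_keg *keg)
{
    size_t released;

    assert(keg != NULL);
    if (keg->nfree <= keg->reserve)
        return (0);
    released = keg->nfree - keg->reserve;
    keg->nfree = keg->reserve;
    return (released);
}
```

A keg with 10 free items and a reserve of 4 gives up 6; a keg already at or
below its reserve gives up nothing.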