Colin Percival [Sun, 17 May 2020 21:54:59 +0000 (21:54 +0000)]
Add /etc/autofs/special_efs to EC2 AMIs
Since Amazon Elastic File System is only available within AWS, it seems
more appropriate to have this added only in EC2 AMIs rather than
"polluting" non-EC2 images with it.
Reviewed by: gjb
MFC after: 7 days
Relnotes: Amazon EFS filesystems can be automounted by enabling autofs
and placing "/efs -efs" into /etc/auto_master.
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D24791
Emmanuel Vadot [Sun, 17 May 2020 20:14:49 +0000 (20:14 +0000)]
linuxkpi: Add offsetofend macro
This calculates the offset of the end of the member in the given struct.
Needed by DRM in Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24849
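The offsetofend commit above describes a macro that yields the offset of the first byte past a struct member. A minimal sketch of such a macro follows, assuming the usual offsetof-plus-sizeof formulation; struct example_hdr and example_covers_flags() are hypothetical names used only for illustration and are not part of the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical struct, for illustration only. */
struct example_hdr {
	unsigned int	version;
	unsigned char	flags[4];
	unsigned long	payload;
};

/*
 * Returns nonzero if a buffer of len bytes is long enough to contain
 * every field up to and including flags.
 */
static int
example_covers_flags(size_t len)
{
	return (len >= offsetofend(struct example_hdr, flags));
}
```

A typical use is validating that a caller-supplied structure is long enough to include a given field before that field is read.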
Emmanuel Vadot [Sun, 17 May 2020 20:12:16 +0000 (20:12 +0000)]
linuxkpi: Add __mutex_init
Same as mutex_init; the lock_class_key argument seems to be used only for
debugging in Linux, so simply ignore it for now.
Needed by DRM in Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24848
Emmanuel Vadot [Sun, 17 May 2020 20:09:11 +0000 (20:09 +0000)]
linuxkpi: Add atomic_dec_and_mutex_lock
This function decrements the counter and, if the result is 0, acquires
the mutex and returns 1; otherwise it simply returns 0.
Needed by DRM from Linux v5.3
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24847
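The semantics described in the atomic_dec_and_mutex_lock commit above can be sketched in userspace with C11 atomics and a POSIX mutex. This is an illustration under stated assumptions, not the linuxkpi code: the name sketch_atomic_dec_and_mutex_lock() is hypothetical, and the sketch takes the lock before the final decrement so that the transition to 0 happens with the mutex held, which is one common way to implement this contract.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Decrement *cnt. If the result is 0, return 1 with *m held;
 * otherwise return 0 without holding *m.
 */
static int
sketch_atomic_dec_and_mutex_lock(atomic_int *cnt, pthread_mutex_t *m)
{
	int old;

	assert(cnt != NULL && m != NULL);

	/* Fast path: decrement without the lock while the result stays > 0. */
	old = atomic_load(cnt);
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return (0);
	}

	/* Slow path: the count may reach 0, so take the lock first. */
	pthread_mutex_lock(m);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return (1);	/* reached 0; the caller must unlock */
	pthread_mutex_unlock(m);
	return (0);
}
```

The caller that receives 1 owns the mutex and is expected to release it after tearing down whatever the counter was protecting.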
Alexander V. Chernikov [Sun, 17 May 2020 15:32:36 +0000 (15:32 +0000)]
Remove redundant checks for nhop validity.
Currently NH_IS_VALID() simply aliases to RT_LINK_IS_UP(), so we're
checking the same thing twice.
In the near future the implementation of this check will be simpler,
as there are plans to introduce control-plane interface status monitoring
similar to ipfw interface tracker.
Fedor Uporov [Sun, 17 May 2020 14:52:54 +0000 (14:52 +0000)]
Add BE architectures support.
Author of most of the initial version: pfg (https://reviews.freebsd.org/D23259)
Reviewed by: pfg
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D24685
Fedor Uporov [Sun, 17 May 2020 14:10:46 +0000 (14:10 +0000)]
Restrict the max runp and runb return values in case of extents mapping.
This restriction is already present in the case of indirect mapping; do the
same in the case of extents.
PR: 246182
Reported by: Teran McKinney
MFC after: 2 weeks
Fedor Uporov [Sun, 17 May 2020 14:03:13 +0000 (14:03 +0000)]
Fix incorrect inode link count check in case of rename.
The check was incorrect because the directory inode link count has a
minimum value of 2 since the introduction of the dir_nlink extfs feature.
Fedor Uporov [Sun, 17 May 2020 14:00:54 +0000 (14:00 +0000)]
Add inode bitmap tail initialization.
Make ext2fs compatible with changes introduced in e2fsprogs v1.45.2.
Now the tail of inode bitmap is filled with 0xff pattern explicitly during
bitmap initialization phase to avoid e2fsck error like:
"Padding at end of inode bitmap is not set."
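The bitmap tail initialization described above can be sketched as follows. This is an illustration only: sketch_mark_bitmap_tail() is a hypothetical helper, not the ext2fs code, and it assumes bits are numbered least-significant-bit first within each byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Mark every bit from bit index `start` to the end of a bitmap of
 * `nbytes` bytes as set. Bits are numbered LSB-first within each byte.
 */
static void
sketch_mark_bitmap_tail(unsigned char *bitmap, size_t start, size_t nbytes)
{
	size_t i, first_full;

	assert(bitmap != NULL && start <= nbytes * 8);

	/* Set the remaining bits of a partially used byte, one at a time. */
	first_full = (start + 7) / 8;
	for (i = start; i < first_full * 8; i++)
		bitmap[i / 8] |= (unsigned char)(1u << (i % 8));

	/* Fill the remaining whole bytes with the 0xff pattern. */
	if (first_full < nbytes)
		memset(bitmap + first_full, 0xff, nbytes - first_full);
}
```

Here `start` would be the number of inodes actually tracked by the bitmap, so only the padding beyond the last valid inode is touched.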
Alan Somers [Sun, 17 May 2020 02:41:50 +0000 (02:41 +0000)]
Reenable sys.geom.class.gate.ggate_test.ggated in CI
Should be fixed by r360613
PR: 244737
Reported by: lwhsu
Adrian Chadd [Sat, 16 May 2020 21:59:41 +0000 (21:59 +0000)]
[ath_rate_sample] Fix the status reported when completing frames with short failures.
My previous logic was a bit wrong. This caused transmissions that failed due
to a mix of short and long retries to count intermediate rates as OK if the
LONG retry count indicated some retries had made it to this intermediate rate,
but the SHORT retry count was the one that caused the whole transmit to fail.
Now status is passed in again - and this is the status for the whole transmission -
and then update_stats() does some quick math to see if the current transmission
series hit its long retry count or not before updating things as a success
or failure.
Jilles Tjoelker [Sat, 16 May 2020 19:38:58 +0000 (19:38 +0000)]
sh/tests: Fix keywords on newly added test
Michael Tuexen [Sat, 16 May 2020 19:26:39 +0000 (19:26 +0000)]
Ensure that an stcb is not dereferenced when it is about to be
freed.
This issue was found by SYZKALLER.
MFC after: 3 days
Adrian Chadd [Sat, 16 May 2020 18:49:37 +0000 (18:49 +0000)]
[ath] Flip athratestats to use two columns for now.
Yeah I have too many rates on the screen now...
Colin Percival [Sat, 16 May 2020 18:37:48 +0000 (18:37 +0000)]
Move the devmatch rc.d script before netif in the boot process.
Prior to this change, using lagg to aggregate wired and wireless networks
was broken in the (relatively common) case where wifi drivers + firmware
are loaded by devmatch, since the interface didn't exist at the time when
the lagg interface was being created.
Suggested by: imp
MFC after: 3 days
Pawel Biernacki [Sat, 16 May 2020 17:05:44 +0000 (17:05 +0000)]
sysctl: fix setting net.isr.dispatch during early boot
Fix more collateral damage from r357614: netisr is initialised way before
malloc() is available hence it can't use sysctl_handle_string() that
allocates temporary buffer. Handle that internally in
sysctl_netisr_dispatch_policy().
PR: 246114
Reported by: delphij
Reviewed by: kib
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D24858
Jilles Tjoelker [Sat, 16 May 2020 16:29:23 +0000 (16:29 +0000)]
sh: Fix double INTON with vfork
The shell maintains a count of the number of times SIGINT processing has
been disabled via INTOFF, so SIGINT processing resumes when all disables
have been enabled again (INTON).
If an error occurs in a vfork() child, the processing of the error enables
SIGINT processing again, and the INTON in vforkexecshell() causes the count
to become negative.
As a result, a later INTOFF may not actually disable SIGINT processing. This
might cause memory corruption if a SIGINT arrives at an inopportune time. As
of r360452, it causes the shell to abort when it would unsafely allocate or
free memory in certain ways.
Note that various places such as errors in non-special builtins
unconditionally reset the count to 0, so the problem might still not always
be visible.
PR: 246497
Reported by: jbeich
MFC after: 2 weeks
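The counting problem described in the sh commit above can be illustrated with a small model. All names below are hypothetical stand-ins and not the shell's real variables or macros; the sketch only shows how an error path that resets the nesting count, followed by an unmatched re-enable, leaves a later disable ineffective.

```c
#include <assert.h>

/* Hypothetical stand-ins for the shell's state. */
static int sketch_suppress;	/* number of outstanding disables */
static int sketch_pending;	/* an interrupt arrived while disabled */
static int sketch_handled;	/* interrupts that were acted upon */

/* Model of INTOFF: one more level of suppression. */
static void
sketch_intoff(void)
{
	sketch_suppress++;
}

/* Model of INTON: act on a pending interrupt once the count reaches 0. */
static void
sketch_inton(void)
{
	if (--sketch_suppress == 0 && sketch_pending) {
		sketch_pending = 0;
		sketch_handled++;
	}
}

/* Model of an error path that unconditionally re-enables processing. */
static void
sketch_error_reset(void)
{
	sketch_suppress = 0;
}

/* Model of interrupt arrival: deferred only while the count is nonzero. */
static void
sketch_sigint(void)
{
	if (sketch_suppress != 0)
		sketch_pending = 1;
	else
		sketch_handled++;
}
```

Calling sketch_intoff(), sketch_error_reset() and sketch_inton() in that order leaves the count at -1, so the next sketch_intoff() only brings it back to 0 and an interrupt arriving in the supposedly protected region is handled immediately.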
Conrad Meyer [Sat, 16 May 2020 14:33:08 +0000 (14:33 +0000)]
cam: ANSIfy 0-argument function definitions
No functional change.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D24854
Hans Petter Selasky [Sat, 16 May 2020 14:27:50 +0000 (14:27 +0000)]
Implement synchronize_srcu_expedited() in the LinuxKPI.
Differential Revision: https://reviews.freebsd.org/D24798
MFC after: 1 week
Sponsored by: Mellanox Technologies
Adrian Chadd [Sat, 16 May 2020 06:09:24 +0000 (06:09 +0000)]
[ath] ok ok, fix the indenting now that I have 5 column packet sizes.
Now things line up nicely again. There's a lot of them, and I don't have a long
enough screen right now, but they at least line up right.
Adrian Chadd [Sat, 16 May 2020 05:07:45 +0000 (05:07 +0000)]
[ath_rate_sample] Limit the tx schedules for A-MPDU ; don't take short retries
into account and remove the requirement that the MCS rate is "higher" if we're
considering a new rate.
Ok, another fun one.
* In order for reliable non-software retried higher MCS rates, the TX schedules
(inconsistently!) use hard-coded lower rates at the end of the schedule.
Now, hard-coded is a problem because (a) it means that aggregate formation
is limited by the SLOWEST rate, so I never formed large A-MPDU frames for
3 stream rates, and (b) if the AP disables lower rates as base rates, it
complains about "unknown rix" every frame you transmit at that rate.
So, for now just disable the third and fourth schedule entry for AMPDUs.
Now I'm forming 32k and 64k aggregates for the higher density MCS rates
much more reliably.
It would be much nicer if the rate schedule stuff wasn't fixed but instead
I'd just populate ath_rc_series[] when I fetch the rates. This is all a
holdover of ye olde pre-11n stuff and I really just need to nuke it.
But for now, ye hack.
* The check for "is this MCS rate better" based on MCS itself is just garbage.
It meant things like going MCS0->7 would be fine, and say 0->8->16 is fine,
(as they're equivalent encoding but 1,2,3 spatial streams), BUT it meant
going something like MCS7->11 would fail even though it's likely that
MCS11 would just be better, both for EWMA/BER and throughput.
So for now just use the average tx time. The "right" way for this comparison
would be to compare PHY bitrates rather than MCS / rate indexes, but I'm not
yet there. The bit rates ARE available in the PHY index, but honestly
I have a lot of other cleaning up to do here before I think about that.
* Don't include the RTS/CTS retry count (and thus time) into the average tx time
calculation. It just makes temporary failures make the rate look bad by
QUITE A LOT, as RTS/CTS exchanges are (a) long, and (b) mostly irrelevant
to the actual rate being tried. If we keep hitting RTS/CTS failures then
there's something ELSE wrong on the channel, not our selected rate.
Kyle Evans [Sat, 16 May 2020 04:52:29 +0000 (04:52 +0000)]
procctl(2): correct a minor cut-n-pasto
This is clearly describing PROC_PROTMAX_FORCE_DISABLE, rather than
PROC_ASLR_FORCE_DISABLE.
Submitted by: sigsys@gmail.com
Justin Hibbits [Sat, 16 May 2020 03:52:30 +0000 (03:52 +0000)]
elftoolchain: Add powerpc64 definition to elftoolchain config
powerpc is already in place, but powerpc64 is needed separately.
Christian S.J. Peron [Sat, 16 May 2020 03:45:15 +0000 (03:45 +0000)]
Add BSM record conversion for a number of syscalls:
- thr_kill(2) and thr_exit(2) generally (no argument auditing here).
- A set of syscalls for the process descriptor family, specifically:
pdfork(2), pdgetpid(2) and pdkill(2)
For these syscalls, audit the file descriptor. In the case of pdfork(2)
a pointer to an integer (file descriptor) is passed in as an argument.
We audit the post initialized file descriptor (not the random garbage
that would have been passed in). We will also audit the child process
which was created from the fork operation (similar to what is done for
the fork(2) syscall).
For pdkill(2) we audit the signal value and fd, and finally for pdgetpid(2)
just the file descriptor.
- Following is a sample of the produced audit trails:
header,111,11,pdfork(2),0,Sat May 16 03:07:50 2020, + 394 msec
argument,0,0x39d,child PID
argument,2,0x2,flags
argument,1,0x8,fd
subject,root,root,0,root,0,924,0,0,0.0.0.0
return,success,925
header,79,11,pdgetpid(2),0,Sat May 16 03:07:50 2020, + 394 msec
argument,1,0x8,fd
subject,root,root,0,root,0,924,0,0,0.0.0.0
return,success,0
trailer,79
header,135,11,pdkill(2),0,Sat May 16 03:07:50 2020, + 395 msec
argument,1,0x8,fd
argument,2,0xf,signal
process_ex,root,root,0,root,0,925,0,0,0.0.0.0
subject,root,root,0,root,0,924,0,0,0.0.0.0
return,success,0
trailer,135
MFC after: 1 week
Justin Hibbits [Sat, 16 May 2020 03:33:28 +0000 (03:33 +0000)]
powerpc/qoriq: Add more devices to config for desktop usage
The most likely users of the QORIQ64 config nowadays are users of AmigaOne
X5000 systems, which are desktops. They need a framebuffer and
keyboard/mouse, so add these to the config so it works by default once
drm-current-kmod is installed.
Ed Maste [Sat, 16 May 2020 02:29:10 +0000 (02:29 +0000)]
libalias: retire cuseeme support
The CU-SeeMe videoconferencing client and associated protocol is at this
point a historical artifact; there is no need to retain support for this
protocol today.
Reviewed by: philip, markj, allanjude
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24790
Adrian Chadd [Sat, 16 May 2020 01:56:06 +0000 (01:56 +0000)]
[ath_rate_sample] Fix logic for determining whether to bump up an MCS rate.
* Fix formatting, cause reasons;
* Put back the "and the chosen rate is within 90% of the current rate" logic;
* Ensure the best rate and the current rate aren't the same; this ...
* ... fixes the packets_since_switch[] tracking to actually count how many
frames since the rate switched, so now I know how stable stuff is; and
* Ensure that MCS can go up to a higher MCS at this or any other spatial stream.
My previous quick hack attempt was doing > rather than >= so you had to go
to both a higher root MCS rate (0..7) and spatial stream. Eg, you couldn't
go from MCS0 (1ss) to MCS8 (2ss) this way.
The best rate and switching rate logic still have a bunch more work to do
because they're still quite touchy when it comes to average tx time but at least
now it's choosing higher rates correctly when it wants to try a higher rate.
Tested:
* AR9380, STA mode
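The ">" versus ">=" point in the commit above can be illustrated with a small sketch. It assumes the usual HT numbering in which MCS 0..7 use one spatial stream, 8..15 two, and so on; the helper names are hypothetical and the predicates are a simplified reading of the commit message, not the driver's actual logic.

```c
#include <assert.h>

/* Root rate (0..7) and spatial stream count for an HT MCS index. */
static int
sketch_mcs_root(int mcs)
{
	return (mcs % 8);
}

static int
sketch_mcs_streams(int mcs)
{
	return (mcs / 8 + 1);
}

/* Strict comparison: needs both a higher root rate and more streams. */
static int
sketch_may_step_up_strict(int cur, int next)
{
	return (sketch_mcs_root(next) > sketch_mcs_root(cur) &&
	    sketch_mcs_streams(next) > sketch_mcs_streams(cur));
}

/* Relaxed comparison: a different rate that is not lower on either axis. */
static int
sketch_may_step_up_relaxed(int cur, int next)
{
	assert(cur >= 0 && next >= 0);
	return (next != cur &&
	    sketch_mcs_root(next) >= sketch_mcs_root(cur) &&
	    sketch_mcs_streams(next) >= sketch_mcs_streams(cur));
}
```

With the strict form a step from MCS0 (1ss) to MCS8 (2ss) is rejected because the root rate is unchanged; the relaxed form accepts it.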
Colin Percival [Sat, 16 May 2020 01:50:28 +0000 (01:50 +0000)]
Send Lid status notification via devd from acpi_lid_status_update.
Some laptops don't send ACPI "lid status changed" notifications upon
opening the lid if the system was currently suspended. In r358219
this was partially fixed, updating the "lid_status" variable upon
resume even if there is no "status changed" notification from ACPI.
Unfortunately the fix in r358219 did not include notifying userland
via devd; this causes problems on systems using upowerd (e.g. KDE),
since upowerd remembers the most recent devd notification about the
lid status rather than querying the sysctl to get the current status.
This showed up as two symptoms when KDE's "When laptop lid closed: Sleep"
option is set:
1. 50% of the time, closing the lid would not trigger S3 sleep.
2. 50% of the time, plugging/unplugging AC power would trigger S3 sleep.
PR: 246477
MFC after: 3 days
Mark Johnston [Sat, 16 May 2020 00:28:12 +0000 (00:28 +0000)]
pf: Add a new zone for per-table entry counters.
Right now we optionally allocate 8 counters per table entry, so in
addition to memory consumed by counters, we require 8 pointers worth of
space in each entry even when counters are not allocated (the default).
Instead, define a UMA zone that returns contiguous per-CPU counter
arrays for use in table entries. On amd64 this reduces sizeof(struct
pfr_kentry) from 216 to 160. The smaller size also results in better
slab efficiency, so memory usage for large tables is reduced by about
28%.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24843
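The size arithmetic in the pf commit above is consistent with replacing eight per-entry pointers by a single pointer to a contiguous array, which saves seven pointers (56 bytes on amd64, matching 216 to 160). The sketch below illustrates that arithmetic only; both structs are hypothetical, and the 152-byte placeholder is chosen so the sizes match the commit message on a 64-bit ABI rather than taken from the real struct pfr_kentry layout.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NCOUNTERS 8

/* Hypothetical "before" layout: one pointer slot per counter. */
struct sketch_entry_old {
	unsigned char	fixed[152];	/* placeholder for other fields */
	void		*counters[SKETCH_NCOUNTERS];
};

/* Hypothetical "after" layout: one pointer to a contiguous array. */
struct sketch_entry_new {
	unsigned char	fixed[152];	/* placeholder for other fields */
	void		*counters;
};

/* Bytes saved per entry by the smaller layout. */
static size_t
sketch_bytes_saved(void)
{
	return (sizeof(struct sketch_entry_old) -
	    sizeof(struct sketch_entry_new));
}
```

The per-entry saving alone is about 26%; the commit attributes the remainder of the roughly 28% reduction to better slab efficiency.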
Christian S.J. Peron [Fri, 15 May 2020 23:44:52 +0000 (23:44 +0000)]
Fix typo that snuck in
Reported by: Jose Luis Duran
MFC after: 1 week
John Baldwin [Fri, 15 May 2020 22:56:59 +0000 (22:56 +0000)]
Don't remove ubsec(4) manual page for WITHOUT_USB=yes.
In head this manpage has been removed entirely, but ubsec(4) is a PCI
device and not a USB device.
MFC after: 1 week
John Baldwin [Fri, 15 May 2020 22:55:49 +0000 (22:55 +0000)]
Remove Doxyfile for sys/dev/ubsec since it has been removed.
John Baldwin [Fri, 15 May 2020 22:55:28 +0000 (22:55 +0000)]
Remove the ubsecstats tool since ubsec(4) has been removed.
Reported by: markj
Christian S.J. Peron [Fri, 15 May 2020 20:29:41 +0000 (20:29 +0000)]
Bump revision date to today.
MFC after: 1 week
Christian S.J. Peron [Fri, 15 May 2020 20:24:08 +0000 (20:24 +0000)]
Remove references to pdwait4(2). This syscall was never implemented
and its presence just creates confusion.
Discussed with: cem
MFC after: 1 week
Adrian Chadd [Fri, 15 May 2020 20:03:53 +0000 (20:03 +0000)]
[ath] [ath_rate_sample] le oops, trim out an #if 1 that I didn't fully delete.
Cool, so now I know it's about 3 weeks between starting on freebsd coding
and breaking the build again. Cue dunce cap.
Adrian Chadd [Fri, 15 May 2020 18:51:20 +0000 (18:51 +0000)]
[ath] [ath_rate] Extend ath_rate_sample to better handle 11n rates and aggregates.
My initial rate control code was .. suboptimal. I wanted to at least get MCS
rates sent, but it didn't do anywhere near enough to handle low signal level links
or remotely keep accurate statistics.
So, 8 years later, here's what I should've done back then.
* Firstly, I wasn't at all tracking packet sizes other than the two buckets
(250 and 1600 bytes.) So, extend it to include 4096, 8192, 16384, 32768 and
65536. I may go add 2048 at some point if I find it's useful.
This is important for a few reasons. First, when forming A-MPDU or AMSDU
aggregates the frame sizes are larger, and thus the TX time calculation
is woefully, increasingly wrong. Secondly, the behaviour of 802.11 channels
isn't some fixed thing, both due to channel conditions and radios themselves.
Notably, there were some observations made a few years ago on 11n chipsets
which noticed longer aggregates showed an increase in failed A-MPDU sub-frame
reception as you got further along in the transmit time. It could be due to
a variety of things - transmitter linearity, channel conditions changing,
frequency/phase drift, etc - but the observation was to potentially form
shorter aggregates to improve BER.
* .. and then modify the ath TX path to report the length of the aggregate sent,
so as the statistics kept would line up with the correct bucket.
* Then on the rate control look-up side - I was also only using the first frame
length for an A-MPDU rate control lookup which isn't good enough here.
So, add a new method that walks the TID software queue for that node to
find out what the likely length of data available is. It isn't ALL of the
data in the queue because we'll only ever send enough data to fit inside the
block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr()
would do.
* .. and cache that in the first ath_buf in the aggregate so it and the eventual
AMPDU length can be returned to the rate control code.
* THEN, modify the rate control code to look at them both when deciding which bucket
to attribute the sent frame on. I'm erring on the side of caution and using the
size bucket that the lookup is based on.
Ok, so now the rate lookups and statistics are "more correct". However, MCS rates
are not the same as 11abg rates in that they're not a monotonically incrementing
set of faster rates and you can't assume that just because a given MCS rate fails,
the next higher one wouldn't work better or be a lower average tx time.
So, I had to do a bunch of surgery to the best rate and sample rate math.
This is the bit that's a WIP.
* First, simplify the statistics updates (update_stats()) to do a single pass on
all rates.
* Next, make sure that each rate average tx time is updated based on /its/ failure/success.
Eg if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and
MCS12 would have their average tx time updated for /their/ part of the transmission,
not the whole transmission.
* Next, EWMA wasn't being fully calculated based on the /failures/ in each of the
rate attempts. So, if MCS15, MCS12 failed above but MCS8 didn't, then ensure
that the statistics noted that /all/ subframes failed at those rates, rather than
the eventual set of transmitted/sent frames. This ensures the EWMA /and/ average
TX time are updated correctly.
* When picking a sample rate and initial rate, probe rates around the current MCS
but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things
like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial
streams. It's a /lot/ slower than MCS7. Also, the reverse is true - if we're at
MCS8 then don't probe MCS7 as part of it, it's not likely to succeed.
* Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS
rate if there weren't any frames yet transmitted. I was defaulting to 25% EWMA and
.. then each comparison would accept the higher rate. Just skip those; sampling
will fill in the details.
So, this seems to work a lot better. It's not perfect; I'm still seeing a lot of
instability around higher MCS rates because there are bursts of loss/retransmissions
that aren't /too/ bad. But I'll keep iterating over this and tidying up my hacks.
Ok, so why is this still something I'm poking at, rather than porting minstrel_ht?
ath_rate_sample tries to minimise airtime, not maximise throughput. I have
extended it with an EWMA based on sub-frame success/failures - high MCS rates
that have partially successful receptions still show super short average frame
times, but a /lot/ of retransmits have to happen for that to work.
So for MCS rates I also track this EWMA and ensure that the rates I'm choosing
don't have super crappy packet failures. I don't mind getting lower
peak throughput versus minstrel_ht; instead I want to see if I can make "minimise
airtime" work well.
Tested:
* AR9380, STA mode
* AR9344, STA mode
* AR9580, STA/AP mode
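The per-rate EWMA update described in the commit above can be sketched as a weighted average of the previous success percentage and the latest sample. This is an illustration under assumptions: the function name, the percentage scale and the smoothing weight are hypothetical and not taken from ath_rate_sample.

```c
#include <assert.h>

/*
 * Fold one transmit attempt into a success-rate EWMA.
 * old_pct: previous EWMA, 0..100.
 * nframes: sub-frames attempted at this rate; nbad: how many failed.
 * weight:  percentage weight given to the new sample, 0..100.
 */
static int
sketch_ewma_update(int old_pct, int nframes, int nbad, int weight)
{
	int sample_pct;

	assert(nframes > 0 && nbad >= 0 && nbad <= nframes);
	assert(weight >= 0 && weight <= 100);

	sample_pct = (nframes - nbad) * 100 / nframes;
	return ((old_pct * (100 - weight) + sample_pct * weight) / 100);
}
```

For a rate in the series that failed outright, the commit's point corresponds to calling this with nbad equal to nframes, so the EWMA for that rate drops instead of inheriting the eventual success of a later rate.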
Michael Reifenberger [Fri, 15 May 2020 17:37:08 +0000 (17:37 +0000)]
Introduce sysputpage() to display large page size with human readable format.
Using UI units allows larger numbers to fit in columns.
Stop calling v_page_size - this is a value that doesn't change at runtime.
Renamed WINDOW *wnd to *wd to avoid conflict with global *wnd variable.
Use bit-shift to convert page size to byte.
PR: 246458
Submitted by: ota@j.email.ne.jp
MFC after: 2 weeks
Differential Revision: D24834
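The bit-shift conversion mentioned in the commit above can be sketched as follows, assuming the page size is a power of two. The helper names are hypothetical and this is not the systat code.

```c
#include <assert.h>
#include <stdint.h>

/* Compute log2 of a power-of-two page size. */
static unsigned int
sketch_page_shift(uint64_t page_size)
{
	unsigned int shift = 0;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	while ((UINT64_C(1) << shift) < page_size)
		shift++;
	return (shift);
}

/* Convert a page count to bytes with a shift instead of a multiply. */
static uint64_t
sketch_pages_to_bytes(uint64_t pages, unsigned int shift)
{
	return (pages << shift);
}
```

Because the page size does not change at runtime, the shift can be computed once and reused for every conversion.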
Conrad Meyer [Fri, 15 May 2020 15:54:22 +0000 (15:54 +0000)]
vmm(4), bhyve(8): Expose kernel-emulated special devices to userspace
Expose the special kernel LAPIC, IOAPIC, and HPET devices to userspace
for use in, e.g., fallback instruction emulation (when userspace has a
newer instruction decode/emulation layer than the kernel vmm(4)).
Plumb the ioctl through libvmmapi and register the memory ranges in
bhyve(8).
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D24525
Michael Tuexen [Fri, 15 May 2020 14:06:37 +0000 (14:06 +0000)]
Allow only IPv4 addresses in sendto() for TCP on AF_INET sockets.
This problem was found by looking at syzkaller reproducers for some other
problems.
Reviewed by: rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D24831
Randall Stewart [Fri, 15 May 2020 14:00:12 +0000 (14:00 +0000)]
This fixes several syzkaller issues found with the
help of Michael Tuexen. There were some accounting
errors with TCPFO for bbr and also for both rack
and bbr there was a FO case where we should be
jumping to the just_return_nolock label to
exit instead of returning 0. This of course
caused no timer to be running and thus the
stuck sessions.
Reported by: Michael Tuexen and syzkaller
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D24852
Konstantin Belousov [Fri, 15 May 2020 13:53:10 +0000 (13:53 +0000)]
Improve comment for compat32 handling of sysctl hw.pagesizes.
Explain why truncation works as intended.
Reformat.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Konstantin Belousov [Fri, 15 May 2020 13:52:39 +0000 (13:52 +0000)]
Revert r361077 to recommit with proper message.
Konstantin Belousov [Fri, 15 May 2020 13:50:08 +0000 (13:50 +0000)]
Implement RTLD_DEEPBIND.
PR: 246462
Tested by: Martin Birgmeier <d8zNeCFG@aon.at>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24841
Andrew Turner [Fri, 15 May 2020 13:33:48 +0000 (13:33 +0000)]
Remove arm64_idcache_wbinv_range as it's unused.
Sponsored by: Innovate UK
Hans Petter Selasky [Fri, 15 May 2020 12:47:39 +0000 (12:47 +0000)]
Assign process group of the TTY under the "proctree_lock".
This fixes a race where concurrent calls to doenterpgrp() and
leavepgrp() while TIOCSCTTY is executing may result in tp->t_pgrp
changing value so that tty_rel_pgrp() misses clearing it to NULL. For
more details refer to the use of pgdelete() in the kernel.
No functional change intended.
Panic backtrace:
__mtx_lock_sleep() # page fault due to using destroyed mutex
tty_signal_pgrp()
tty_ioctl()
ptsdev_ioctl()
kern_ioctl()
sys_ioctl()
amd64_syscall()
MFC after: 1 week
Sponsored by: Mellanox Technologies
Benedict Reuschling [Fri, 15 May 2020 12:04:39 +0000 (12:04 +0000)]
Fix SYNOPSIS section to point to the proper include directive.
netgraph(3) points to #include <netgraph/netgraph.h>, which is kernel only.
The man page refers to the user-space part of the netgraph module, which is
located in <netgraph.h>.
Submitted by: lutz_donnerhacke.de
Approved by: bcr
Differential Revision: https://reviews.freebsd.org/D23814
Konstantin Belousov [Fri, 15 May 2020 11:58:01 +0000 (11:58 +0000)]
Implement RTLD_DEEPBIND.
PR: 246462
Tested by: Martin Birgmeier <d8zNeCFG@aon.at>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24841
Aleksandr Fedorov [Fri, 15 May 2020 11:03:27 +0000 (11:03 +0000)]
bhyve: Fix processing of netgraph backend options.
After r360820, additional parameters are passed through the argument 'opts', and the name of the backend through the argument 'devname'. So, there is no need to skip the backend name from the 'opts' argument.
Conrad Meyer [Fri, 15 May 2020 03:54:25 +0000 (03:54 +0000)]
ObsoleteFiles: pdwait4.2.gz
A belated follow-up to r320058.
Ryan Moeller [Thu, 14 May 2020 23:38:11 +0000 (23:38 +0000)]
jail: Add exec.prepare and exec.release command hooks
This change introduces new jail command hooks that run before and after any
other actions.
The exec.prepare hook can be used for example to invoke a script that checks
if the jail's root exists, creating it if it does not. Since arbitrary
variables in jail.conf can be passed to the command, it can be pretty useful
for templating jails.
An example use case for exec.release would be to remove the filesystem of an
ephemeral jail.
The names "prepare" and "release" are borrowed from the names of similar hooks
in libvirt.
Reviewed by: jamie, manpages, mmacy
Approved by: mmacy (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24829
Kyle Evans [Thu, 14 May 2020 23:20:58 +0000 (23:20 +0000)]
pf tests: fix up a couple WARNS= 6 nits
common_init_tbl is only used within this single CU, so it should be marked
static.
WARNS=6 also complained about the var defined by
`ATF_TC_WITH_CLEANUP(getastats);` being unused, which turns out to be
because it's not been hooked up in ATF_TP_ADD_TCS. kp@ did not immediately
recall any reason for this, and the case passes on my local system, so hook
it up.
Note that I've not yet set WARNS= 6 here. Investigation is underway to see
if we can feasibly default WARNS to 6 for src builds to catch directories
too deep to inherit a WARNS from the top-level subdirectories' Makefile.inc.
Those particular WARNS settings will be subsequently removed as they become
redundant with a more-global default.
MFC after: 1 week
Peter Grehan [Thu, 14 May 2020 22:18:12 +0000 (22:18 +0000)]
Hide host CPUID 0x15 TSC/Crystal ratio/freq info from guest
In recent Linux (5.3+) and OpenBSD (6.6+) kernels, and with hosts that
support CPUID 0x15, the local APIC frequency is determined directly
from the reported crystal clock to avoid calibration against the 8254
timer.
However, the local APIC frequency implemented by bhyve is 128MHz, where
most h/w systems report frequencies around 25MHz. This shows up on
OpenBSD guests as repeated keystrokes on the emulated PS2 keyboard
when using VNC, since the kernel's timers are now much shorter.
Fix by reporting all-zeroes for CPUID 0x15. This allows guests to fall
back to using the 8254 to calibrate the local APIC frequency.
Future work could be to compute values returned for 0x15 that would
match the host TSC and bhyve local APIC frequency, though all dependencies
on this would need to be examined (for example, Linux will start using
0x16 for some hosts).
PR: 246321
Reported by: Jason Tubnor (and tested)
Reviewed by: jhb
Approved by: jhb, bz (mentor)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24837
Konstantin Belousov [Thu, 14 May 2020 21:12:08 +0000 (21:12 +0000)]
Add memalign(3), mostly for glibc compatibility.
Reviewed by: emaste, imp (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D24307
Konstantin Belousov [Thu, 14 May 2020 20:17:09 +0000 (20:17 +0000)]
Fix r361037.
Reorder flag manipulations and use barrier to ensure that the program
order is followed by compiler and CPU, for unlocked reader of so_state.
In collaboration with: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24842
Jakub Wojciech Klama [Thu, 14 May 2020 19:57:52 +0000 (19:57 +0000)]
Import lib9p
7ddb1164407da19b9b1afb83df83ae65a71a9a66.
Approved by: trasz
MFC after: 1 month
Sponsored by: Conclusive Engineering (development), vStack.com (funding)
Mark Johnston [Thu, 14 May 2020 17:56:44 +0000 (17:56 +0000)]
Fix the i386 build after r361033.
Reported by: Jenkins
Konstantin Belousov [Thu, 14 May 2020 17:54:08 +0000 (17:54 +0000)]
Fix spurious ENOTCONN when the other side of a unix domain socket is closed.
Sometimes, when doing read(2) over unix domain socket, for which the
other side socket was closed, read(2) returns -1/ENOTCONN instead of
EOF AKA zero-size read. This is because soreceive_generic() does not
lock socket when testing the so_state SS_ISCONNECTED|SS_ISCONNECTING
flags. It could end up that we do not observe so->so_rcv.sb_state bit
SBS_CANTRCVMORE, and then miss SS_ flags.
Change the test to check that the socket was never connected before
returning ENOTCONN, by adding all state bits for connected.
Reported and tested by: pho
In collaboration with: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24819
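The expected behaviour described in the commit above, where a read on a unix domain socket whose peer has closed returns EOF rather than an error, can be exercised with a short sketch. The function name is hypothetical; only standard socketpair(2), read(2) and close(2) calls are used.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a connected pair of unix domain stream sockets, close one
 * end, and return what read(2) reports on the other end.
 * Returns -1 if the pair cannot be created.
 */
static ssize_t
sketch_read_after_peer_close(void)
{
	int sv[2];
	char buf[16];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	close(sv[1]);	/* the peer goes away */
	n = read(sv[0], buf, sizeof(buf));
	close(sv[0]);
	return (n);
}
```

A return value of 0 is the zero-size read that signals EOF; the bug described above showed up as an occasional -1 with errno set to ENOTCONN instead.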
Kyle Evans [Thu, 14 May 2020 17:52:29 +0000 (17:52 +0000)]
inetd(8): Add comments to all examples
Submitted by: debdrup (with some minor changes by kevans)
Reviewed by: bcr (manpages)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D24818
Ed Maste [Thu, 14 May 2020 17:19:07 +0000 (17:19 +0000)]
ObsoleteFiles.inc: use date (not xxxx) for ubsec removal
Mark Johnston [Thu, 14 May 2020 16:07:27 +0000 (16:07 +0000)]
Call acpi_pxm_set_proximity_info() slightly earlier on x86.
This function is responsible for setting pc_domain in each pcpu
structure. Call it from the main function that starts APs, rather than
a separate SYSINIT. This makes it easier to close the window where
UMA's per-CPU slab allocator may be called while pc_domain is
uninitialized. In particular, the allocator uses pc_domain to allocate
domain-local pages, so allocations before this point end up using domain
0 for everything.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24757
Mark Johnston [Thu, 14 May 2020 16:06:54 +0000 (16:06 +0000)]
Allocate UMA per-CPU counters earlier.
Otherwise anything counted before SI_SUB_VM_CONF is discarded. However,
it is useful to be able to see stats from allocations done early during
boot.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24756
Mark Johnston [Thu, 14 May 2020 15:49:37 +0000 (15:49 +0000)]
Assert that page table traversal functions don't operate on superpages.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24828
Benedict Reuschling [Thu, 14 May 2020 09:18:50 +0000 (09:18 +0000)]
Add new stats(7) man page and hook it up to the build.
This man page contains stat utilities that are available in
the base system. This is a better approach than looking them
up via "apropos stat" or similar commands.
Thanks to Daniel Ebdrup Jensen for writing the original page
and incorporating the feedback given.
Submitted by: Daniel Ebdrup Jensen
Reviewed by: 0mp, allanjude, brueffer, bcr
Approved by: bcr
MFC after: 3 days
Relnotes: yes (new stats(7) man page)
Differential Revision: https://reviews.freebsd.org/D24417
Adrian Chadd [Thu, 14 May 2020 05:01:18 +0000 (05:01 +0000)]
[ath] Extend the colours to 4, not 2.
There's 8 bins in the upcoming changeset to ath/ath_rate, so I need
more colours. Yeah, I know.
Brandon Bergren [Thu, 14 May 2020 04:00:35 +0000 (04:00 +0000)]
[PowerPC] Fix wrong instructions in _savegpr_X.
We were accidentally using stfd instead of stw in our SAVEGPR macro.
This has almost certainly been causing crashes when compiling with -Os.
Reviewed by: jhibbits (in irc)
MFC after: 3 days
Sponsored by: Tag1 Consulting, Inc.
Kyle Evans [Thu, 14 May 2020 03:30:27 +0000 (03:30 +0000)]
certctl: follow-up to r361022, prune blacklist as well
Otherwise, removals from the blacklist may not get processed as they should.
While we're here, restructure these to not bother with mkdir(1) if we've
already tested them to exist.
MFC after: 3 days
Kyle Evans [Thu, 14 May 2020 03:25:12 +0000 (03:25 +0000)]
certctl(8): don't completely nuke $CERTDESTDIR
It's been reported/noted that a well-timed `certctl rehash` will completely
obliterate $CERTDESTDIR, which may get used by ports or system
administrators. While we can't guarantee the certctl semantics when other
non-certctl-controlled bits live here, we should make some amount of effort
to play nice.
Pruning all existing links, which we'll subsequently rebuild as needed, is
sufficient for our needs. This can still be destructive, but it's perhaps
less likely to cause issues.
I also note that we should probably be pruning /etc/ssl/blacklisted upon
rehash as well.
Reported by: cem's dovecot server
MFC after: 3 days
Conrad Meyer [Thu, 14 May 2020 03:01:23 +0000 (03:01 +0000)]
vfs_extattr: Allow extattr names up to the full max
Extattr names are allowed to be 255 bytes -- not 254 bytes plus trailing
NUL. Provide a 256-byte buffer so that copyinstr() has room for the trailing
NUL.
Re-enable test for maximal name lengths.
PR: 208965
Reported by: asomers
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D24584
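The off-by-one described above can be reproduced in userspace. The following is a minimal sketch, not the kernel code: TOY_NAME_MAX, toy_copy_name() and toy_try_name() are hypothetical names, and toy_copy_name() only imitates the one copyinstr() property that matters here, namely that the terminating NUL counts against the destination length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define	TOY_NAME_MAX	255	/* hypothetical stand-in for the name limit */

/*
 * Userspace imitation of copyinstr(): copy a NUL-terminated string into
 * dst, counting the terminating NUL against dstlen. Returns 0 on success
 * or ENAMETOOLONG if the string plus its NUL does not fit.
 */
static int
toy_copy_name(const char *src, char *dst, size_t dstlen)
{
	size_t len = strlen(src);

	if (len + 1 > dstlen)
		return (ENAMETOOLONG);
	memcpy(dst, src, len + 1);
	return (0);
}

/* Copy a name of exactly namelen bytes into a buffer of bufsize bytes. */
static int
toy_try_name(size_t namelen, size_t bufsize)
{
	char name[TOY_NAME_MAX + 1];
	char buf[TOY_NAME_MAX + 1];

	assert(namelen <= TOY_NAME_MAX && bufsize <= sizeof(buf));
	memset(name, 'a', namelen);
	name[namelen] = '\0';
	return (toy_copy_name(name, buf, bufsize));
}
```

With these definitions, toy_try_name(255, 255) fails with ENAMETOOLONG while toy_try_name(255, 256) succeeds, which is the difference between the old and new buffer sizes.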
Li-Wen Hsu [Wed, 13 May 2020 20:37:46 +0000 (20:37 +0000)]
Only skip sys.net.if_clone_test.epair_stress in CI env
PR: 246443
Sponsored by: The FreeBSD Foundation
Li-Wen Hsu [Wed, 13 May 2020 20:36:38 +0000 (20:36 +0000)]
Temporarily skip sys.net.if_bridge_test.stp in CI as it always times out
PR: 244229
Sponsored by: The FreeBSD Foundation
Li-Wen Hsu [Wed, 13 May 2020 19:29:14 +0000 (19:29 +0000)]
Temporarily skip sys.net.if_clone_test.epair_stress
This case has been timing out too often.
PR: 246443
Sponsored by: The FreeBSD Foundation
Warner Losh [Wed, 13 May 2020 19:17:35 +0000 (19:17 +0000)]
Add nvd alias back to nda now that it actually works.
Warner Losh [Wed, 13 May 2020 19:17:28 +0000 (19:17 +0000)]
Reimplement aliases in geom
The alias needs to be part of the provider instead of the geom to work
properly. To bind the DEV geom, we need to look at the provider's names and
aliases and create the dev entries from there. If this lives in the GEOM, then
it won't propagate down the tree properly. Remove it from geom, add it to the provider.
Update geli, gmountver, gnop, gpart, and guzip to use it, which handles the bulk
of the uses in FreeBSD. I think this is all the providers that create a new name
based on their parent's name.
John Baldwin [Wed, 13 May 2020 18:36:02 +0000 (18:36 +0000)]
Trim a few more things I missed from xform_enc.h.
An extern declaration for the now-removed Blowfish encryption
transform, and an include of the DES header.
John Baldwin [Wed, 13 May 2020 18:35:02 +0000 (18:35 +0000)]
Remove unused header for DES.
The NFS port doesn't use any of the DES functions.
Kyle Evans [Wed, 13 May 2020 18:07:37 +0000 (18:07 +0000)]
kernel: provide panicky version of __unreachable
__builtin_unreachable doesn't raise any compile-time warnings/errors on its
own, so problems with its usage can't be easily detected. While it would be
nice for this situation to change and compilers to at least add a warning
for trivial cases where local state means the instruction can't be reached,
this isn't the case at the moment and likely will not happen.
This commit adds an __assert_unreachable, whose intent is incredibly clear:
it asserts that this instruction is unreachable. On INVARIANTS builds, it's
a panic(), and on non-INVARIANTS it expands to __unreachable().
Existing users of __unreachable() are converted to __assert_unreachable,
to improve debuggability if this assumption is violated.
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D23793
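The pattern can be sketched outside the kernel as follows. This is an illustration under stated assumptions, not the macro added by the commit: DEBUG_CHECKS is a hypothetical stand-in for INVARIANTS, abort() stands in for panic(), TOY_ASSERT_UNREACHABLE and toy_color_name() are made-up names, and __builtin_unreachable() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Checked builds stop loudly when "unreachable" code is reached; the
 * abort() keeps that fatal even if NDEBUG has compiled assert() out.
 * Unchecked builds only give the compiler the optimization hint.
 */
#ifdef DEBUG_CHECKS
#define	TOY_ASSERT_UNREACHABLE() do {				\
	assert(!"reached code marked as unreachable");		\
	abort();						\
} while (0)
#else
#define	TOY_ASSERT_UNREACHABLE()	__builtin_unreachable()
#endif

enum toy_color { TOY_RED, TOY_GREEN, TOY_BLUE };

static const char *
toy_color_name(enum toy_color c)
{
	switch (c) {
	case TOY_RED:
		return ("red");
	case TOY_GREEN:
		return ("green");
	case TOY_BLUE:
		return ("blue");
	}
	/* Every enumerator returns above, so control should never get here. */
	TOY_ASSERT_UNREACHABLE();
}
```

If a caller ever passes a value outside the enum, a build with -DDEBUG_CHECKS stops at the offending line instead of silently running into undefined behaviour.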
Jessica Clarke [Wed, 13 May 2020 17:20:51 +0000 (17:20 +0000)]
riscv: Fix pmap_protect for superpages
When protecting a superpage, we would previously fall through to the
non-superpage case and read the contents of the superpage as PTEs,
potentially modifying them and trying to look up underlying VM pages that
don't exist if they happen to look like PTEs we would care about. This led
to nginx causing an unexpected page fault in pmap_protect that panic'ed the
kernel. Instead, if we see a superpage, we are done for this range and
should continue to the next.
Reviewed by: markj, jhb (mentor)
Approved by: markj, jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D24827
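The control-flow fix can be pictured with a toy two-level page table. This is only a sketch: the flag bits, struct toy_l2_entry and the functions below are hypothetical and far simpler than the real riscv pmap, but they show why the superpage case must end with a continue instead of falling through to the per-PTE loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for the toy page table. */
#define	TOY_PTE_V	0x1u	/* entry is valid */
#define	TOY_PTE_W	0x2u	/* mapping is writable */
#define	TOY_PTE_LEAF	0x4u	/* L2 entry maps a superpage directly */
#define	TOY_L3_ENTRIES	4

struct toy_l2_entry {
	uint32_t bits;
	uint32_t *l3;	/* L3 table; NULL when the entry is a superpage */
};

/* Remove write permission everywhere; returns the number of entries changed. */
static int
toy_protect_readonly(struct toy_l2_entry *l2, size_t nl2)
{
	int changed = 0;

	for (size_t i = 0; i < nl2; i++) {
		if ((l2[i].bits & TOY_PTE_V) == 0)
			continue;
		if ((l2[i].bits & TOY_PTE_LEAF) != 0) {
			if ((l2[i].bits & TOY_PTE_W) != 0) {
				l2[i].bits &= ~TOY_PTE_W;
				changed++;
			}
			/* Done with this range: never walk a superpage as PTEs. */
			continue;
		}
		assert(l2[i].l3 != NULL);
		for (size_t j = 0; j < TOY_L3_ENTRIES; j++) {
			uint32_t *pte = &l2[i].l3[j];

			if ((*pte & TOY_PTE_V) != 0 && (*pte & TOY_PTE_W) != 0) {
				*pte &= ~TOY_PTE_W;
				changed++;
			}
		}
	}
	return (changed);
}

/* Build one writable superpage plus one L3 table and protect both. */
static int
toy_protect_example(void)
{
	uint32_t l3[TOY_L3_ENTRIES] = {
		TOY_PTE_V | TOY_PTE_W, TOY_PTE_V, 0, TOY_PTE_V | TOY_PTE_W
	};
	struct toy_l2_entry l2[2] = {
		{ TOY_PTE_V | TOY_PTE_W | TOY_PTE_LEAF, NULL },
		{ TOY_PTE_V, l3 }
	};

	return (toy_protect_readonly(l2, 2));
}
```

Without the continue, the loop over L3 entries would dereference the superpage entry's NULL table pointer, which is the toy analogue of reading superpage contents as PTEs.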
Adrian Chadd [Wed, 13 May 2020 16:36:42 +0000 (16:36 +0000)]
[ath] Prepare for .. more sample rate control entries
This is in preparation for me bumping how many size buckets are used
for ath_rate_sample statistics.
* Bump buffer size to 64k
* Don't waste 4 lines per bucket size, condense it to two
* Alternate colours; my logic made everything after the first two just
be black. Oops.
Emmanuel Vadot [Wed, 13 May 2020 07:49:12 +0000 (07:49 +0000)]
linuxkpi: Add EBADRQC to errno.h
This is used in the amdgpu driver from Linux 5.2
Sponsored-by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24807
Andriy Gapon [Wed, 13 May 2020 07:47:56 +0000 (07:47 +0000)]
linuxkpi: print stack trace in WARN_ON macros
Reviewed by: hselasky, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24779
Andriy Gapon [Wed, 13 May 2020 06:26:30 +0000 (06:26 +0000)]
snd_hda: fix typos related to quirks set via 'config' tunable
One wrong quirk bit, one wrong variable name.
MFC after: 1 week
Andriy Gapon [Wed, 13 May 2020 06:24:54 +0000 (06:24 +0000)]
sound/hda: newer AMD devices still require the same PCIe snoop
So, replicate the ATI vendor snoop configuration for the AMD vendor.
I think that this should fix a number of cases where users currently
have to resort to polling or disabling MSI.
MFC after: 1 week
Kyle Evans [Wed, 13 May 2020 02:17:27 +0000 (02:17 +0000)]
inetd(8): Provide HTTP proxy example using netcat
One of the fortunes that are included in freebsd-tips talks about how
the superserver can be used to proxy connections with netcat, but there are
no examples provided. This commit adds an example with a comment explaining
what it does.
Submitted by: debdrup
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24800
Warner Losh [Wed, 13 May 2020 00:18:44 +0000 (00:18 +0000)]
Make the ata probe* and xpt* routines aprobe* and axpt* respectively.
Often, in triaging core files, one only has a traceback of where a
panic occurred. We have probe* and xpt* routines that live in both the
scsi and ata layers with identical names. To make one or the other
stand out, prefix all the probe and xpt routines in ata with an
'a'. I've left the scsi ones alone since they were there first and are
more numerous. I also rejected using #define to do this as being too
confusing. I chose this method because the CAM name for the probe
device was already 'aprobe'.
Normally, this doesn't matter because file scope protects one from
interfering with the other. However, due to the indirect nature of
CAM's state machine, you don't know if the following traceback is
SCSI or ATA:
xpt_done
probedone
xpt_done_process
xpt_done_td
fork_exit
nvme and mmc already have unique names.
MFC: 1 week
Differential revision: https://reviews.freebsd.org/D24825
Adrian Chadd [Wed, 13 May 2020 00:05:11 +0000 (00:05 +0000)]
[ath] [ath_rate] Add some extra data into the rate control lookup.
Right now (well, since I did this in 2011/2012) the rate control code
makes some super bad choices for 11n aggregates/rates, and it tracks
statistics even more questionably.
It's been long enough and I'm now trying to use it again daily, so let's
start by:
* telling the rate control code if it's an aggregate or not;
* being clearer about the TID - yes it can be extracted from the
ath_buf but this way it can be overridden by the caller without
changing the TID itself.
(This is for doing experiments with voice/video QoS at some point..)
* Return an optional field to limit how long the aggregate is in
microseconds. Right now the rate control code supplies a rate table
and the ath aggr form code will look at the rate table and limit
the aggregate size to 4ms at the slowest rate. Yeah, this is pretty
terrible.
* Add some more TODO comments around handling txpower, rate and
handling filtered frames status so if I continue to have spoons for
this I can go poke at it.
Warner Losh [Tue, 12 May 2020 23:46:52 +0000 (23:46 +0000)]
Kill trailing newline while I'm here...
Warner Losh [Tue, 12 May 2020 22:44:51 +0000 (22:44 +0000)]
Refine the history of uname. It appeared in 4.4BSD. It was not in v7 unix. It
was one of the additions in PWB, and appeared in System III and later commercial
versions of Unix. The different args to uname weren't added until System III. Add
a quick note about the late entry into the BSD fork of Unix since PWB
otherwise implies a pre-fork date.
Jilles Tjoelker [Tue, 12 May 2020 21:59:21 +0000 (21:59 +0000)]
sh/tests: Test some obscure cases with aliasing keywords
Andrew Turner [Tue, 12 May 2020 21:00:13 +0000 (21:00 +0000)]
Fix the name reported when the core supports a 64-bit CCIDX
Konstantin Belousov [Tue, 12 May 2020 18:17:57 +0000 (18:17 +0000)]
Make include/malloc.h usable again.
A lot of third-party Linux code uses #include <malloc.h>, expecting to
find the malloc extensions there. Instead of trying to fight this,
accept that the attempt to deprecate the header causes more trouble than
the potential portability issues it solves, and provide our jemalloc
extensions.
PR: 155429
Reviewed by: imp, jhibbits, dab, hselasky, philip, emaste, jilles
Exp-run by: antoine (PR 245366)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D24297
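The kind of third-party code this targets looks roughly like the sketch below. It assumes that <malloc.h> declares malloc_usable_size(), as glibc's header does and as the jemalloc extensions in malloc_np.h do on FreeBSD; toy_usable_size_for() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <malloc.h>	/* Linux-style include that ported code expects */
#include <stdlib.h>

/*
 * Allocate request bytes and report how many bytes the allocator
 * actually made usable for that allocation.
 */
static size_t
toy_usable_size_for(size_t request)
{
	void *p = malloc(request);
	size_t usable;

	assert(p != NULL);
	usable = malloc_usable_size(p);
	free(p);
	return (usable);
}
```

The usable size is at least the requested size and is often larger because allocators round requests up to a size class.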
Konstantin Belousov [Tue, 12 May 2020 18:12:20 +0000 (18:12 +0000)]
Clear namespace pollution in include/malloc_np.h
Do not include stdbool.h; it makes the header incompatible with some
third-party code that typedefs bool manually.
Remove inclusion of strings.h, which typically conflicts with the use
of symbol 'index'.
Separate inclusion of sys/cdefs.h is not needed because sys/types.h
already handles that.
Exp-run by: antoine (PR 245366)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D24297
Andrew Gallatin [Tue, 12 May 2020 17:18:44 +0000 (17:18 +0000)]
IPv6: Fix a panic in the nd6 code with unmapped mbufs.
If the neighbor entry for an IPv6 TCP session using unmapped
mbufs times out, IPv6 will send an icmp6 dest. unreachable
message. In doing this, it will try to do a software checksum
on the reflected packet. If this is a TCP session using unmapped
mbufs, then there will be a kernel panic.
To fix this, just free packets with unmapped mbufs, rather
than sending the icmp.
Reviewed by: np, rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24821
Mark Johnston [Tue, 12 May 2020 17:05:55 +0000 (17:05 +0000)]
Re-enable proc_test:symbol_lookup after r360979.
PR: 244732
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Mark Johnston [Tue, 12 May 2020 17:00:47 +0000 (17:00 +0000)]
librtld_db: Fix shlib mapping offsets.
kve_offset gives the offset into the backing file, which is not what we
want since different segments may map the same page. Use the base of
the mapping to determine the offset exported by librtld_db instead.
PR: 244732
Reported by: Jenkins, Nicolò Mazzucato <nicomazz97@gmail.com>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
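The difference between the two offsets can be sketched with a simplified model. The struct and function names below are hypothetical and do not mirror librtld_db's data structures; the addresses are made up, with a data segment that maps the same file page as the end of the text segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one mapping of a shared object. */
struct toy_mapping {
	uint64_t start;		/* start address of the mapping */
	uint64_t end;		/* end address (exclusive) */
	uint64_t file_offset;	/* offset into the backing file */
};

/* Two segments of one object; both map file page 0x1000. */
static const struct toy_mapping toy_libfoo[] = {
	{ 0x800200000, 0x800202000, 0x0 },	/* text: file 0x0-0x2000 */
	{ 0x800202000, 0x800203000, 0x1000 },	/* data: file 0x1000-0x2000 */
};

/* Lowest start address among an object's mappings. */
static uint64_t
toy_object_base(const struct toy_mapping *maps, size_t n)
{
	uint64_t base;

	assert(n > 0);
	base = maps[0].start;
	for (size_t i = 1; i < n; i++)
		if (maps[i].start < base)
			base = maps[i].start;
	return (base);
}

/* Offset of mapping idx relative to the object's base address. */
static uint64_t
toy_mapping_offset(const struct toy_mapping *maps, size_t n, size_t idx)
{
	assert(idx < n);
	return (maps[idx].start - toy_object_base(maps, n));
}
```

Here the data segment's file offset (0x1000) falls inside the range already covered by the text segment, so it cannot identify the segment, while its base-relative offset (0x2000) is unambiguous.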
Ed Maste [Tue, 12 May 2020 16:38:28 +0000 (16:38 +0000)]
libalias: fix potential memory disclosure from ftp module
admbugs: 956
Submitted by: markj
Reported by: Vishnu Dev TJ working with Trend Micro Zero Day Initiative
Security: FreeBSD-SA-20:13.libalias
Security: CVE-2020-7455
Security: ZDI-CAN-10849
Ed Maste [Tue, 12 May 2020 16:33:04 +0000 (16:33 +0000)]
libalias: validate packet lengths before accessing headers
admbugs: 956
Submitted by: ae
Reported by: Lucas Leong (@_wmliang_) of Trend Micro Zero Day Initiative
Reported by: Vishnu working with Trend Micro Zero Day Initiative
Security: FreeBSD-SA-20:12.libalias
Mark Johnston [Tue, 12 May 2020 16:10:07 +0000 (16:10 +0000)]
rtwn: Add a USB ID for the TP-Link TL-WN727N.
PR: 246417
Submitted by: Viktor G. <viktor@netgate.com>
MFC after: 1 week
Eric van Gyzen [Tue, 12 May 2020 15:22:40 +0000 (15:22 +0000)]
Remove tests for obsolete compilers in the build system
Assume gcc is at least 6.4, the oldest xtoolchain in the ports tree.
Assume clang is at least 6, which was in 11.2-RELEASE. Drop conditions
for older compilers.
Reviewed by: imp (earlier version), emaste, jhb
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D24802
Andrew Gallatin [Tue, 12 May 2020 14:01:12 +0000 (14:01 +0000)]
IPv6: sync IP_NO_SND_TAG_RL support from IPv4
The IP_NO_SND_TAG_RL flag to ip{,6}_output() means that the packets
being sent should bypass hardware rate limiting. This is typically used
by modern TCP stacks for rexmits.
This support was added to IPv4 in r352657, but never added to IPv6, even
though rack and bbr call ip6_output() with this flag.
Reviewed by: rrs
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24822