Hasso Tepper [Mon, 2 Jul 2007 06:43:31 +0000 (06:43 +0000)]
Nuke USB_GET_SC and USB_GET_SC_OPEN macros.
Matthew Dillon [Mon, 2 Jul 2007 06:34:26 +0000 (06:34 +0000)]
Because the objcache caches up to two magazines on each cpu some pretty
bad degenerate conditions will be hit if the cluster limit is set too small
or the magazine size is set too large.
Detect the problem and reduce the magazine size to compensate. If we hit
the minimum magazine size (16), increase the cluster limit to compensate.
Report the corrections on the console.
We also have the option of stealing magazines from other cpus, or reducing
the magazine size even further to handle extreme cases.
This should solve most of the objcache issues when ncpus is set to 31.
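The correction described in this commit can be sketched as follows. This is an illustration in Python, not the kernel code. The sizing rule (each cpu may hold two full magazines, so the cluster limit should cover 2 * ncpus * magazine size), the halving step, and every name below are assumptions; only the minimum magazine size of 16 comes from the commit message.

```python
MIN_MAG_CAPACITY = 16  # minimum magazine size stated in the commit message


def correct_objcache_limits(cluster_limit, mag_capacity, ncpus):
    """Return (cluster_limit, mag_capacity, messages) after correction.

    Assumed rule: every cpu can cache up to two full magazines, so the
    cluster limit should be at least 2 * ncpus * mag_capacity.
    """
    messages = []
    # Shrink the magazine (assumed: by halving) while the limit is too small.
    while (cluster_limit < 2 * ncpus * mag_capacity
           and mag_capacity > MIN_MAG_CAPACITY):
        mag_capacity = max(MIN_MAG_CAPACITY, mag_capacity // 2)
        messages.append(f"magazine size reduced to {mag_capacity}")
    # At the minimum magazine size, raise the cluster limit instead.
    needed = 2 * ncpus * mag_capacity
    if cluster_limit < needed:
        cluster_limit = needed
        messages.append(f"cluster limit increased to {cluster_limit}")
    return cluster_limit, mag_capacity, messages
```

With 31 cpus, a hypothetical cluster limit of 512 and a magazine size of 64, the sketch first drops the magazine size to 16 and then raises the limit to 992.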
Matthew Dillon [Mon, 2 Jul 2007 06:30:26 +0000 (06:30 +0000)]
Put a timeout on the umtx_sleep() in the idle loop and add conditional
debugging code to try to detect races.
Matthew Dillon [Mon, 2 Jul 2007 05:38:28 +0000 (05:38 +0000)]
Exhaust the virtual kernel network interface even if we cannot allocate
mbufs, otherwise we stop getting interrupts and the packets just build up
in the TUN device.
Matthew Dillon [Mon, 2 Jul 2007 04:19:14 +0000 (04:19 +0000)]
sigwinch has to run with the big giant lock so use the DragonFly
interrupt API to handle it rather than running it directly from the
signal handler. Fixes a panic.
Matthew Dillon [Mon, 2 Jul 2007 03:44:12 +0000 (03:44 +0000)]
Increase SMP_MAXCPU to 31. Can't do 32 (boo hoo!) because spinlocks need
a bit in the cpumask.
Add DELAY()'s (usleep()'s) in the AP startup code where we spin on the MP
lock. This greatly improves startup speed when you specify 31 cpus.
Fix a bug in the kqueue interrupt init. The kqueue code was not registering
its interrupt soon enough which could cause the timer to stop generating
interrupts.
Matthew Dillon [Mon, 2 Jul 2007 02:37:05 +0000 (02:37 +0000)]
Add an option (-n ncpus) to specify the number of cpus a virtual kernel
should simulate. The virtual kernel must be built with options SMP.
1-32 cpus may be simulated regardless of the number of real cpus. The
virtual kernel will create a thread for each cpu.
Matthew Dillon [Mon, 2 Jul 2007 02:23:00 +0000 (02:23 +0000)]
The real-kernel madvise and mcontrol system calls handle SMP interactions
when manipulating virtual page tables. Construct a new set of functions
for the virtual kernel to take advantage of this.
Add a cpu cache mask to the pmap structure for the virtual kernel which
allows us to invalidate per-cpu page table mappings simply by clearing
the mask. Also reload PT1pde after a successful cache hit if the cpu
mask bit is found to be 0.
Redo most of the PTE handling code for the virtual kernel. Use the new
invalidation function set and carefully deal with race conditions between
cpus. Race conditions are far more serious with an SMP virtual kernel than
with a real kernel because there are effectively two levels of page table
caching instead of one, since the real kernel maintains a separate pmap
for each VM space under the virtual kernel's control in addition to the
standard TLB interactions.
Matthew Dillon [Mon, 2 Jul 2007 02:14:32 +0000 (02:14 +0000)]
The kernel perfmon support (options PERFMON) was trying to initialize its
device way too early in the boot sequence, resulting in a panic on SMP
boxes. Move initialization to a bit later in the boot sequence.
Reported-by: Thomas Nikolajsen <sinknull@crater.dragonflybsd.org>
Dragonfly-bug: <http://bugs.dragonflybsd.org/issue714>
Matthew Dillon [Mon, 2 Jul 2007 01:47:22 +0000 (01:47 +0000)]
sched_ithd() must be called from within a critical section.
Matthew Dillon [Mon, 2 Jul 2007 01:43:30 +0000 (01:43 +0000)]
Copy a junk file from pc32 needed for <time.h>
Matthew Dillon [Mon, 2 Jul 2007 01:42:07 +0000 (01:42 +0000)]
Only use the symbol returned by dladdr() if its address is <= the
address we are trying to decode.
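The check can be sketched as below. This is a Python illustration rather than the C code in the tree: `sym_name` and `sym_addr` stand in for the name and address a dladdr()-style lookup would return, and the function name and output format are hypothetical.

```python
def format_frame(addr, sym_name, sym_addr):
    """Format one backtrace entry.

    The symbol is trusted only when its start address is at or below the
    address being decoded; otherwise the raw address is printed.
    """
    if sym_name is not None and sym_addr is not None and sym_addr <= addr:
        return f"{sym_name}+{addr - sym_addr:#x}"
    return f"{addr:#x}"
```

A symbol that starts above the address would otherwise yield a negative offset, which is why the sketch falls back to the bare address in that case.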
Matthew Dillon [Mon, 2 Jul 2007 01:41:26 +0000 (01:41 +0000)]
Clean up a kprintf() that was missing a newline.
Submitted-by: Joe Talbott <josepht@cstone.net>
Matthew Dillon [Mon, 2 Jul 2007 01:37:11 +0000 (01:37 +0000)]
Implement an architecture function cpu_mplock_contested() which is
called by the LWKT thread scheduler when the only thread(s) it can
schedule need the MP lock and the scheduler was unable to acquire the
MP lock.
On real systems this function just executes the cpu 'pause' instruction,
and on virtual systems this function sleeps for a millisecond.
Use umtx_sleep() instead of sigpause() in the virtual kernel's idle loop
to interlock threads scheduled via a signal with the idle loop sleep.
This fixes a race condition that caused the vkernel to stop scheduling
(but there may be more issues, stay tuned).
Matthew Dillon [Mon, 2 Jul 2007 01:30:07 +0000 (01:30 +0000)]
Do not allow umtx_sleep() to restart on a restartable signal. We want to
return EINTR so the caller can handle side effects from the signal.
Hasso Tepper [Sun, 1 Jul 2007 21:24:04 +0000 (21:24 +0000)]
Nuke USB_MATCH*, USB_ATTACH* and USB_DETACH* macros.
Sascha Wildner [Sun, 1 Jul 2007 17:23:25 +0000 (17:23 +0000)]
Remove .Pp before .Sh.
Sascha Wildner [Sun, 1 Jul 2007 10:49:36 +0000 (10:49 +0000)]
Add markup and clean up a bit.
Matthew Dillon [Sun, 1 Jul 2007 04:02:33 +0000 (04:02 +0000)]
Also credit lots of help from Aggelos Economopoulos <aoiko@cc.ece.ntua.gr>
in the SMP virtual kernel commit.
Matthew Dillon [Sun, 1 Jul 2007 03:28:54 +0000 (03:28 +0000)]
Use dladdr() to obtain symbol names when possible and try to dump the
entire stack backtrace instead of just the first call.
Matthew Dillon [Sun, 1 Jul 2007 03:04:15 +0000 (03:04 +0000)]
Conditionalize SMP bits for non-SMP builds.
Matthew Dillon [Sun, 1 Jul 2007 02:51:45 +0000 (02:51 +0000)]
Bring in all of Joe Talbott's SMP virtual kernel work to date, which makes
virtual kernel builds with SMP almost get through a full boot. This work
includes:
* Creation of 'cpu' threads via libthread_xu
* Globaldata initialization
* AP synchronization
* Bootstrapping to the idle thread
* SMP pmap (mmu) functions
* IPI handling
My part of this commit:
* Bring all the signal interrupts under DragonFly's machine-independent
interrupt handler API. This will properly deal with the MP lock
and critical section handling.
* Some additional pmap bits to handle SMP invalidation issues.
Submitted-by: Joe Talbott <josepht@cstone.net>
Additional-bits-by: Matt Dillon
Matthew Dillon [Sun, 1 Jul 2007 01:11:38 +0000 (01:11 +0000)]
More multi-threaded support for virtualization. Move the save context
from the process structure to the lwp structure, cleaning up the vmspace
support structures at the same time. This allows multiple LWPs in the
same process to be running a virtualization context at the same time.
Sascha Wildner [Sun, 1 Jul 2007 00:03:49 +0000 (00:03 +0000)]
mi_switch() and cpu_switch() are gone. Remove manpage and prototype.
Matthew Dillon [Sat, 30 Jun 2007 23:38:31 +0000 (23:38 +0000)]
A signal sent to a particular LWP must be delivered to that LWP and never
posted to the process generically. Otherwise things like seg faults can
end up being posted to the wrong LWP.
Sascha Wildner [Sat, 30 Jun 2007 21:52:19 +0000 (21:52 +0000)]
Use the actual function name in the message.
Sascha Wildner [Sat, 30 Jun 2007 21:47:54 +0000 (21:47 +0000)]
tvtohz() was split into tvtohz_low() and tvtohz_high() in Jan 2004.
Update the manual page with some words from the comments in kern_clock.c.
Hasso Tepper [Sat, 30 Jun 2007 20:39:22 +0000 (20:39 +0000)]
Nuke PROC_(UN)LOCK, usb_callout_t, usb_kthread_create* and uio_procp.
Hasso Tepper [Sat, 30 Jun 2007 20:17:36 +0000 (20:17 +0000)]
Fix KASSERT messages.
Sascha Wildner [Sat, 30 Jun 2007 19:20:48 +0000 (19:20 +0000)]
Clean up a bit.
Sascha Wildner [Sat, 30 Jun 2007 19:03:52 +0000 (19:03 +0000)]
Use .Va for errno.
Matthew Dillon [Sat, 30 Jun 2007 05:54:03 +0000 (05:54 +0000)]
Try to avoid accidental foot shooting by not allowing a virtual kernel
to be installed unless DESTDIR is explicitly specified.
Matthew Dillon [Sat, 30 Jun 2007 02:33:04 +0000 (02:33 +0000)]
Move the P_WEXIT check from lwpsignal() to kern_kill(). That is, disallow
signals to exiting processes but allow signals to threads that have not gone
through the exit interlock yet. This allows exit1() to interlock the process
and still signal its LWPs.
Fix a bug in exit1() which was improperly using lwp_signotify() to wake up
LWPs to force them to exit. This function is basically a NOP if there are
no signals pending to the LWP. Send a real SIGKILL to the LWP instead.
This fixes a bug where vkernels get stuck in an exiting state and cannot be
killed.
Reported-by: Joe Talbott <josepht@cstone.net>
Matthew Dillon [Sat, 30 Jun 2007 01:59:41 +0000 (01:59 +0000)]
Update the documentation for sys_checkpoint().
Matthew Dillon [Sat, 30 Jun 2007 01:40:56 +0000 (01:40 +0000)]
Add MLINKS for checkpoint.1, because most people looking for information
out of the cold on checkpointing will probably try 'man checkpoint' more
often than 'man checkpt'.
Matthew Dillon [Fri, 29 Jun 2007 23:40:00 +0000 (23:40 +0000)]
Flag the checkpoint descriptor so on restore we can identify it and use the
descriptor for the restore rather than trying to look up the original
checkpoint file. This issue occurs when a program calls sys_checkpoint()
manually.
This allows a checkpoint-resume to be done on a copied checkpoint file,
or a gzipped (then gunzipped) checkpoint file, etc. The original checkpoint
file no longer needs to remain intact.
Requested-by: _why <why@ruby-lang.org>
Hasso Tepper [Fri, 29 Jun 2007 22:56:31 +0000 (22:56 +0000)]
Nuke usb_ callout macros.
Matthew Dillon [Fri, 29 Jun 2007 21:54:15 +0000 (21:54 +0000)]
Implement struct lwp->lwp_vmspace. Leave p_vmspace intact. This allows
vkernels to run threaded and to run emulated VM spaces on a per-thread basis.
struct proc->p_vmspace is left intact, making it easy to switch into and out
of an emulated VM space. This is needed for the virtual kernel SMP work.
This also gives us the flexibility to run emulated VM spaces in their own
threads, or in a limited number of separate threads. Linux does this and
they say it improved performance. I don't think it necessarily improved
performance but it's nice to have the flexibility to do it in the future.
Sascha Wildner [Fri, 29 Jun 2007 19:34:41 +0000 (19:34 +0000)]
Add some useful references to various manual pages which deal with random
numbers.
Suggested-by: Robin Carey <robin_carey5@yahoo.co.uk>
Dragonfly-bug: <http://bugs.dragonflybsd.org/issue708>
Matthew Dillon [Fri, 29 Jun 2007 17:18:42 +0000 (17:18 +0000)]
This is a simple little syslink test program which ping-pongs a 64K
buffer around.
Matthew Dillon [Fri, 29 Jun 2007 05:14:00 +0000 (05:14 +0000)]
Clean up syslink a bit and add an abstraction that will eventually allow
zero-copy support for I/O on syslink DMA bufs.
Matthew Dillon [Fri, 29 Jun 2007 05:12:40 +0000 (05:12 +0000)]
Add O_MAPONREAD (not yet implemented). This will have the semantics of
replacing the underlying VM with a copy-on-write page mapping when possible.
Ultimately when used with syslink this will have the semantics of allowing
a fully shared mapping for syslink DMA buffers.
Matthew Dillon [Fri, 29 Jun 2007 05:09:15 +0000 (05:09 +0000)]
Add a new flag, XIOF_VMLINEAR, which requires that the buffer being mapped
be contiguous within a single VM object.
Matthew Dillon [Fri, 29 Jun 2007 00:18:05 +0000 (00:18 +0000)]
Get out-of-band DMA buffers working for user<->user syslinks. This
allows the syslink protocol to operate in a manner very similar to the
way sophisticated DMA hardware works, where you have a DMA buffer attached
to a command.
Augment the syslink protocol to implement read, write, and read-modify-write
style commands.
Obtain the MP lock in places where needed because fileops are called without
it held now. Our VM ops are not MP safe yet.
Use an XIO to map VM pages between userland processes. Add additional
XIO functions to aid in copying data to and from a userland context. This
removes an extra buffer copy from the path and allows us to manipulate pure
vm_page_t's for just about everything.
Matthew Dillon [Thu, 28 Jun 2007 20:24:57 +0000 (20:24 +0000)]
Clarify cpu localization requirements when using callout_stop() and
callout_reset().
Fix an SMP race in callout_stop() where the callout structure was being
modified in an unsafe manner.
Hasso Tepper [Thu, 28 Jun 2007 13:55:13 +0000 (13:55 +0000)]
Nuke SIMPLEQ_* and logprintf.
Hasso Tepper [Thu, 28 Jun 2007 09:33:33 +0000 (09:33 +0000)]
Remove duplicate.
Hasso Tepper [Thu, 28 Jun 2007 06:32:33 +0000 (06:32 +0000)]
Nuke device_ptr_t, USBBASEDEVICE, USBDEVNAME(), USBDEVUNIT(), USBGETSOFTC(),
USBDEVPTRNAME() and Static with help from sed(1).
Matthew Dillon [Wed, 27 Jun 2007 18:15:57 +0000 (18:15 +0000)]
Fix a bug-a-boo, the type uuid was being printed instead of the storage
uuid (so the type was being printed twice).
Hasso Tepper [Wed, 27 Jun 2007 13:26:18 +0000 (13:26 +0000)]
Use kernel functions. I don't understand how I could miss these ...
Hasso Tepper [Wed, 27 Jun 2007 12:28:00 +0000 (12:28 +0000)]
Nuke the code specific to NetBSD/OpenBSD/FreeBSD at first. I doubt anyone
will update these pieces and I don't intend to review macros for all
platforms.
There is the chance though that I might kill something which should stay
in the code in the form "TODO: port it to DF". So, please review and kick me.
Joe Talbott [Tue, 26 Jun 2007 23:30:05 +0000 (23:30 +0000)]
Fix files that included the posix scheduling headers that were merged earlier.
Matthew Dillon [Tue, 26 Jun 2007 20:47:58 +0000 (20:47 +0000)]
Implement jscan -o. Take the patch from Steve and add some additional
checks to the write() to deal with EINTR and EAGAIN.
Submitted-by: "Steve O'Hara-Smith" <steve@sohara.org>
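One common way to make a write loop tolerate EINTR and EAGAIN is sketched below in Python. This is not the jscan code: the function name is hypothetical, and the retry policy (retry on interruption, wait for writability on EAGAIN, continue after short writes) is an assumption about what such checks typically do.

```python
import os
import select


def write_all(fd, data):
    """Write all of data to fd, retrying on EINTR, EAGAIN and short writes."""
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            # EINTR: the call was interrupted by a signal, just retry.
            continue
        except BlockingIOError:
            # EAGAIN on a non-blocking descriptor: wait until writable.
            select.select([], [fd], [])
            continue
        # A short write is legal; keep going with the remainder.
        view = view[written:]
```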
Matthew Dillon [Tue, 26 Jun 2007 20:39:33 +0000 (20:39 +0000)]
A file descriptor of -1 is legal when accessing journal status. Just allow
it generally, the journal command switch will recheck it on a per-command
basis.
Hasso Tepper [Tue, 26 Jun 2007 19:52:10 +0000 (19:52 +0000)]
Nuke USBDEV().
Matthew Dillon [Tue, 26 Jun 2007 19:31:10 +0000 (19:31 +0000)]
Repo-copy numerous files from sys/emulation/posix4 to sys/sys and sys/kern
and adjust the build to suit. posix scheduling is here to stay.
Submitted-by: Joe Talbott <josepht@cstone.net>
Sepherosa Ziehau [Tue, 26 Jun 2007 15:10:23 +0000 (15:10 +0000)]
If RX csum calculation with pseudo header is enabled, bge(4) will
miscalculate csum of frames carrying UDP datagrams. If UDP datagrams
are not fragmented by IP, then the rate of miscalculation is low, but
if UDP datagrams are fragmented by IP, then most of the frames will be
delivered to the upper layer with wrong hardware csum, which is quite
common for NFS.
Disable hardware RX csum calculation with pseudo header; it will be
better than doing software csum if hardware csum error happens, since
the error rate is too high.
Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
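For context, the UDP checksum is defined over a pseudo header (source address, destination address, protocol, UDP length) followed by the whole UDP datagram. The Python sketch below shows that computation for IPv4 in software; it only illustrates the checksum itself and says nothing about how the bge(4) hardware computes it. The function names are hypothetical.

```python
import struct


def ones_complement_sum(data):
    """16-bit one's complement sum of data, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        # Fold the carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip, dst_ip, udp_segment):
    """Checksum for a UDP segment whose checksum field is set to zero.

    src_ip and dst_ip are 4-byte IPv4 addresses; 17 is the UDP protocol
    number placed in the pseudo header.
    """
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_segment))
    csum = ~ones_complement_sum(pseudo + udp_segment) & 0xFFFF
    # A computed value of zero is transmitted as all ones.
    return csum or 0xFFFF
```

Because the sum runs over the entire datagram, a checksum taken over a single IP fragment cannot be compared against the value carried in the UDP header.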
Hasso Tepper [Tue, 26 Jun 2007 14:56:50 +0000 (14:56 +0000)]
One callout_stop() is enough.
Suggested-by: Joerg
Hasso Tepper [Tue, 26 Jun 2007 11:53:16 +0000 (11:53 +0000)]
malloc -> kmalloc
Hasso Tepper [Tue, 26 Jun 2007 11:04:50 +0000 (11:04 +0000)]
- Fix headphone jack sensing support for Olivetti Olibook 610-430 XPSE.
- Drain all callout handlers during driver detach appropriately.
- M_NOWAIT -> M_WAITOK
Obtained-from: FreeBSD
Hasso Tepper [Tue, 26 Jun 2007 08:36:24 +0000 (08:36 +0000)]
Clean up sys/bus/usb/usb_port.h. Remove unused, dead, and old code.
Hasso Tepper [Tue, 26 Jun 2007 07:47:28 +0000 (07:47 +0000)]
Nuke "is is" stammering.
Matthew Dillon [Tue, 26 Jun 2007 02:40:20 +0000 (02:40 +0000)]
Add a new option (-i) that allows the insane deviation value to be set, and
change the default to 0.5 seconds. For example, -i 0.025 would set the test
to be 25ms.
Change the insane check... just map out a server deemed to be insane for 60
minutes, do not disconnect or reset it (which might lead to excessive packet
traffic).
Update the documentation.
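The revised insane check can be pictured with the Python sketch below. The 0.5 second default and the 60 minute map-out come from the commit message; the class, the function names, and the use of a strict greater-than comparison are assumptions made for illustration.

```python
INSANE_DEVIATION = 0.5     # seconds, default from the commit (set with -i)
MAP_OUT_SECONDS = 60 * 60  # a server deemed insane is ignored for 60 minutes


class Server:
    def __init__(self, name):
        self.name = name
        self.mapped_out_until = 0.0


def check_insane(server, deviation, now, limit=INSANE_DEVIATION):
    """Map the server out if its deviation exceeds the limit.

    The connection is neither dropped nor reset, only ignored for a while.
    """
    if abs(deviation) > limit:
        server.mapped_out_until = now + MAP_OUT_SECONDS
        return True
    return False


def usable(server, now):
    """True once any map-out period has expired."""
    return now >= server.mapped_out_until
```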
Matthew Dillon [Tue, 26 Jun 2007 01:41:38 +0000 (01:41 +0000)]
Create a default dntpd.conf file for DragonFly using three pool.ntp.org
hosts as the time source.
Matthew Dillon [Tue, 26 Jun 2007 00:40:35 +0000 (00:40 +0000)]
Adjust debug output so columns line up better.
Matthew Dillon [Mon, 25 Jun 2007 21:33:36 +0000 (21:33 +0000)]
Recode the state machine to make it a bit less confusing. Collapse the
two failure states into a single failure state and handle failure processing
in each state.
Handle DNS failures by having dntpd relookup failed DNSes occasionally.
dntpd will now relookup the server name if a server fails, allowing you
to specify domains which front pools of ntp servers. dntpd will also
check for duplicate IPs and relookup again (up to a point).
Add a sanity check. If two or more servers are specified a quorum of
servers must agree that the selected time offset is reasonable. For the
moment do a +/- 30 second check (though we can probably make this +/- 2
seconds). If a server is determined to be broken, scrap its data and
reconnect. If it is still broken, permanently disable it. This is
primarily to handle severely broken servers that are occasionally present
in ntp pools.
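A quorum check of this kind might look like the Python sketch below. The +/- 30 second tolerance is taken from the commit message; treating "quorum" as a strict majority, and the function and parameter names, are assumptions.

```python
def offset_has_quorum(selected_offset, server_offsets, tolerance=30.0):
    """Return True if enough servers agree with the selected offset.

    With fewer than two servers there is nothing to compare against, so
    the check passes. Otherwise a strict majority (assumed definition of
    quorum) must be within +/- tolerance seconds of the selected offset.
    """
    if len(server_offsets) < 2:
        return True
    agreeing = sum(1 for offset in server_offsets
                   if abs(offset - selected_offset) <= tolerance)
    return agreeing * 2 > len(server_offsets)
```

For example, with offsets of 0.1, 0.2 and 500.0 seconds, a selected offset of 0.1 passes (two of three agree) while 500.0 does not.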
Matthew Dillon [Sun, 24 Jun 2007 20:00:00 +0000 (20:00 +0000)]
Fix rts_input() which is the only procedure which calls raw_input(). As
with other packet input routines, the mbuf must be demuxed and forwarded
to the correct protocol thread so it can be cpu-localized for processing.
This allows anyone, including interrupt code, to write to the routing
socket.
Reported-by: "Sepherosa Ziehau" <sepherosa@gmail.com>
Sascha Wildner [Sun, 24 Jun 2007 17:42:58 +0000 (17:42 +0000)]
Add missing name.
Sascha Wildner [Sun, 24 Jun 2007 17:37:35 +0000 (17:37 +0000)]
Fix typo in a diagnostic message.
Sascha Wildner [Sun, 24 Jun 2007 10:50:43 +0000 (10:50 +0000)]
Fix HISTORY.
Sascha Wildner [Sun, 24 Jun 2007 10:47:48 +0000 (10:47 +0000)]
Add a slightly modified ataraid(4) manpage from FreeBSD as nataraid(4).
Peter Avalos [Sun, 24 Jun 2007 05:17:51 +0000 (05:17 +0000)]
From FreeBSD:
Fixed the threshold for using the simple Taylor approximation.
In e_log.c, there was just an off-by-1 (1 ulp) error in the comment
about the threshold. The precision of the threshold is unimportant,
but the magic numbers in the code are easier to understand when the
threshold is described precisely.
In e_logf.c, mistranslation of the magic numbers gave an off-by-1
(1 * 16 ulps) error in the intended negative bound for the threshold
and an off-by-7 (7 * 16 ulps) error in the intended positive bound for
the threshold, and the intended bounds were not translated from the
double precision bounds so they were unnecessarily small by a factor
of about 2048.
The optimization of using the simple Taylor approximation for args
near a power of 2 is dubious since it only applies to a relatively
small proportion of args, but if it is done then doing it 2048 times
as often _may_ be more efficient. (My benchmarks show unexplained
dependencies on the data that increase with further optimizations
in this area.)
Sascha Wildner [Sat, 23 Jun 2007 20:52:41 +0000 (20:52 +0000)]
Remove trailing whitespace.
Sascha Wildner [Sat, 23 Jun 2007 20:51:42 +0000 (20:51 +0000)]
Add markup for DIOCGPART.
Sascha Wildner [Sat, 23 Jun 2007 20:39:10 +0000 (20:39 +0000)]
Fix markup.
Sascha Wildner [Sat, 23 Jun 2007 10:13:39 +0000 (10:13 +0000)]
Actually process rc_info.
Sascha Wildner [Sat, 23 Jun 2007 09:37:24 +0000 (09:37 +0000)]
Use .Va for rc variables.
Sepherosa Ziehau [Sat, 23 Jun 2007 09:25:02 +0000 (09:25 +0000)]
- Add hw.skcX.imtime sysctl node and hw.skc.imtime tunable for interrupt
moderation time. Changes to hw.skcX.imtime are committed to the NIC
immediately.
- Increase default interrupt moderation time from 100 usec to 160 usec.
This reduces host interrupt load without noticeable performance impact.
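As a rough worked example, assuming the moderation timer allows at most one interrupt per moderation interval (an assumption about the hardware, not something the commit states), the change lowers the ceiling on the interrupt rate as computed below. The function name is hypothetical.

```python
def max_interrupts_per_second(imtime_usec):
    """Upper bound on interrupts per second for a moderation interval.

    Assumes at most one interrupt is delivered per interval.
    """
    return 1_000_000 / imtime_usec


old_ceiling = max_interrupts_per_second(100)  # previous default
new_ceiling = max_interrupts_per_second(160)  # new default
```

Under that assumption the ceiling drops from 10000 to 6250 interrupts per second.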
Simon Schubert [Fri, 22 Jun 2007 21:41:16 +0000 (21:41 +0000)]
Remove unused variable.
Sepherosa Ziehau [Fri, 22 Jun 2007 15:26:18 +0000 (15:26 +0000)]
- Factor out bge_{disable,enable}_intr().
- In bge_enable_intr(), trigger another hardware interrupt after clearing
interrupt mask, since any writing to BGE_MBX_IRQ0_LO will acknowledge
interrupts. Add comment about it.
- In bge_disable_intr(), acknowledge and disable interrupt by writing 1 to
BGE_MBX_IRQ0_LO, since setting interrupt mask itself does not de-assert
a currently asserted interrupt. Add comment about it.
- Since we have explicitly disabled interrupt using BGE_MBX_IRQ0_LO, set
"RX/TX coalesced BD count during interrupt" to 1. In this way, RX/TX
coalescing engine will properly update status block, which contains RX/TX
descriptor index. This only affects polling(4) operation, since we don't
have a "during interrupt" period in our interrupt handler.
- Fix comment.
Tested-with: 5751, 5701(altima)
Sepherosa Ziehau [Fri, 22 Jun 2007 12:08:07 +0000 (12:08 +0000)]
- Add KTR_IF_{BGE,EM} to opt_ktr.h
- Add commented out KTR_IF_{BGE,EM} entries to LINT
Reminded-by: swildner@
# LINT compiling test is conducted with KTR_IF_{BGE,EM}
Sepherosa Ziehau [Fri, 22 Jun 2007 11:53:40 +0000 (11:53 +0000)]
Drain packets even if link is down.
Suggested-by: joerg@
Obtained-from: NetBSD (mlelstv@netbsd.org)
Sepherosa Ziehau [Thu, 21 Jun 2007 15:00:18 +0000 (15:00 +0000)]
Add some KTRs in bge(4) to count RX/TX packets per interrupt.
Hasso Tepper [Thu, 21 Jun 2007 13:36:58 +0000 (13:36 +0000)]
Add mpls-in-ip. Bring in some fixes from IANA and FreeBSD in progress.
Peter Avalos [Wed, 20 Jun 2007 23:37:33 +0000 (23:37 +0000)]
Upgrade to less-406 fixing some display bugs.
Peter Avalos [Wed, 20 Jun 2007 23:28:28 +0000 (23:28 +0000)]
Add our READMEs.
Peter Avalos [Wed, 20 Jun 2007 23:25:56 +0000 (23:25 +0000)]
Merge from vendor branch LESS:
Import less-406.
Peter Avalos [Wed, 20 Jun 2007 23:25:56 +0000 (23:25 +0000)]
Import less-406.
Matthew Dillon [Wed, 20 Jun 2007 06:23:24 +0000 (06:23 +0000)]
Fix an issue with positive namecache timeouts. Locked children often
depend on the resolved vnode in the parent ncp's remaining intact, but
the positive namecache timeout code broke that rule and caused certain
VFS functions which depend on an intact parent (rename & remove primarily)
to occasionally return EPERM. Only zap the node if it has no children.
Reported-by: Thomas Nikolajsen <thomas.nikolajsen@mail.dk>
Matthew Dillon [Tue, 19 Jun 2007 19:28:18 +0000 (19:28 +0000)]
Correct a bug in the -S truncation mode where the mode was not being passed
to open(2), resulting in new files being created with weird permissions.
Matthew Dillon [Tue, 19 Jun 2007 19:18:20 +0000 (19:18 +0000)]
Do not blindly allow the block count to overflow. Restrict newfs filesystem
sizes to just under 1TB and report a fatal error if the media is too large.
Matthew Dillon [Tue, 19 Jun 2007 19:09:46 +0000 (19:09 +0000)]
The fstype was not being properly tested for a CCD uuid.
Correct a bug when generating an interleave table for very large disk
arrays (> 2TB). A size variable was 32 bits instead of 64 bits.
Matthew Dillon [Tue, 19 Jun 2007 19:07:41 +0000 (19:07 +0000)]
Refuse to label media that is too large to handle a 32 bit disklabel
(aka > 2TB). disklabel64 will have to be used instead on such media.
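The 2TB figure follows from the width of a 32 bit sector number. The Python sketch below works through the arithmetic, assuming 512-byte sectors and an unsigned 32 bit sector count; the function name and the exact boundary handling are illustrative, not taken from the disklabel code.

```python
SECTOR_SIZE = 512               # assumed sector size in bytes
MAX_SECTOR_COUNT = 2 ** 32 - 1  # largest value an unsigned 32 bit field holds

# 2**32 sectors of 512 bytes each is 2**41 bytes, i.e. 2 TiB.
LIMIT_BYTES = 2 ** 32 * SECTOR_SIZE


def fits_32bit_label(media_bytes, sector_size=SECTOR_SIZE):
    """True if the media's sector count fits in 32 bits."""
    return media_bytes // sector_size <= MAX_SECTOR_COUNT
```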
Matthew Dillon [Tue, 19 Jun 2007 17:25:48 +0000 (17:25 +0000)]
Add the -p pidfile option to the vkernel.
Submitted-by: Chris Turner <c.turner@199technologies.org>
Sepherosa Ziehau [Tue, 19 Jun 2007 14:59:41 +0000 (14:59 +0000)]
Add sysctl/tunable for TX/RX interrupt coalescing variables. Default
values are obtained from empirical measurement on bcm5751(PCIe).
Inspired-by: Bruce Evans <brde@optusnet.com.au> on freebsd-net mail list
For a running bge(4), setting these sysctl variables will not take
effect immediately; they are committed to device in the upcoming
interrupt handler.
Adapted-from: NetBSD if_bge.c 1.58 (jonathan@netbsd.org)
# On Altima AC9100 (bcm5701 based), TX/RX interrupt coalescing values
# seem to have no effect at all :(
Matthew Dillon [Tue, 19 Jun 2007 06:39:10 +0000 (06:39 +0000)]
Rename d_obj_uuid to d_stor_uuid to conform to the naming convention being
used in other structures.
Matthew Dillon [Tue, 19 Jun 2007 06:38:33 +0000 (06:38 +0000)]
Correct a couple of uuid retention issues. Output the storage uuid for
each partition and generate a new storage uuid for any partition missing
one (also regenerate it if the user deletes the storage uuid line for
that partition).
Matthew Dillon [Tue, 19 Jun 2007 06:07:57 +0000 (06:07 +0000)]
Make some adjustments to clean up structural field names. Add type and
storage uuid's to the partinfo structure for the DIOCGPART ioctl and
load the fields up for GPT slices and disklabel64 partitions.
Matthew Dillon [Tue, 19 Jun 2007 02:53:56 +0000 (02:53 +0000)]
Implement non-booting support for the DragonFly 64 bit disklabel:
* Add full kernel support. Both 32 and 64 bit labels will be probed.
* Add a new program, disklabel64, which allows you to create and edit
the new disklabel.
* Add some logic to prevent foot shooting.
DragonFly's 64 bit disklabels start at byte offset 0 on the disk slice
or GPT partition and operate in a slice-relative fashion. No translation
is required when going from on-disk to in-core or vice versa, unlike the
existing 32 bit disklabels. 512 bytes at the beginning of the label are
reserved for legacy boot code. Specifically, the label starts at sector 0,
NOT sector 1, which means its location on the disk is the same regardless
of the sector size.
The label has a UUID to uniquely identify the storage and a type and
object uuid for each partition. All location specifications are 64 bit
byte offsets, NOT logical blocks. The label enforces an alignment
requirement for label-related I/O and partitions which defaults to 4K
regardless of the sector size. This makes the label 100% portable across
media with different sector sizes within the constraints of the alignment
requirement.
All partitions are specified using byte offsets and sizes, constrained
by the alignment requirement, relative to the base of the label (i.e.
offset 0 in the slice). disklabel64 will adjust the offsets for display
purposes to be relative to the partition table area. The label headers,
partition table, and boot2 areas come BEFORE the partition table area and
partitions which overlap any of those objects are not allowed.
By default, a virgin 64 bit disklabel will reserve 32K for boot2. As of
this writing, boot1 and boot2 blocks have not yet been implemented.
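The byte-offset and alignment rules can be illustrated with the Python sketch below. The 4K default alignment comes from the commit message; `reserved_end` is a hypothetical parameter standing for the end of the label headers, partition table, and boot2 area, and the helper names are made up for this example.

```python
ALIGNMENT = 4096  # default alignment from the commit message, in bytes


def align_up(nbytes, alignment=ALIGNMENT):
    """Round nbytes up to the next multiple of alignment."""
    return (nbytes + alignment - 1) // alignment * alignment


def partition_is_valid(offset, size, reserved_end, alignment=ALIGNMENT):
    """Check a partition given as a byte offset and size within the slice.

    Both values must honour the alignment, and the partition must not
    overlap the reserved area at the front of the slice.
    """
    if offset % alignment or size % alignment:
        return False
    return size > 0 and offset >= reserved_end
```

Because nothing here is expressed in sectors, the same offsets remain valid on media with a different sector size, as long as the sector size divides the alignment.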
Matthew Dillon [Tue, 19 Jun 2007 02:30:35 +0000 (02:30 +0000)]
Improve the error message for gpt add a little.