dragonfly.git
15 years agoHAMMER 53B/Many: Complete overhaul of strategy code, reservations, etc
Matthew Dillon [Sun, 8 Jun 2008 18:16:26 +0000 (18:16 +0000)]
HAMMER 53B/Many: Complete overhaul of strategy code, reservations, etc

* Completely overhaul the strategy code.  Implement direct reads and writes
  for all cases.  REMOVE THE BACKEND BIO QUEUE.  BIOs are no longer queued
  to the flusher under any circumstances.

  Remove numerous hacks that were previously put in place to deal with BIOs
  being queued to the flusher.

* Add a mechanism to invalidate buffer cache buffers that might be shadowed
  by direct I/O.  e.g. if a strategy write uses the vnode's bio directly
  there may be a shadow hammer_buffer that will then become stale and must
  be invalidated.

* Implement a reservation tracking structure (hammer_reserve) to track
  storage reservations made by the frontend.  The backend will not attempt
  to free or reuse reserved space if it encounters it.

  Use reservations to back cached holes (struct hammer_hole) for the
  same reason.

* Index hammer_buffer on the zone-X offset instead of the zone-2 offset.
  Base the RB tree in the hammer_mount instead of (zone-2) hammer_volume.
  This removes nearly all blockmap lookup operations from the critical path.

* Do a much better job tracking cached dirty data for the purposes of
  calculating whether the filesystem will become full or not.

* Fix a critical bug in the CRC generation of short data buffers.

* Fix a VM deadlock.

* Use 16-byte alignment for all on-disk data instead of 8-byte alignment.

* Major code cleanup.

As of this commit, write performance is now extremely good.
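The 16-byte alignment rule listed above can be illustrated with a small round-up helper. This is a minimal sketch only: `DATA_ALIGN` and `align_up()` are hypothetical names chosen for the example and are not taken from the HAMMER sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant for this sketch, not a HAMMER macro. */
#define DATA_ALIGN 16u

/* The mask trick below only works for power-of-two alignments. */
static_assert((DATA_ALIGN & (DATA_ALIGN - 1)) == 0,
    "alignment must be a power of two");

/* Round len up to the next multiple of DATA_ALIGN. */
static inline uint64_t
align_up(uint64_t len)
{
	return (len + (DATA_ALIGN - 1)) & ~(uint64_t)(DATA_ALIGN - 1);
}
```

With an 8-byte alignment a 17-byte object would occupy 24 bytes; under the 16-byte rule sketched here it occupies 32.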

15 years agoHAMMER Utilities: Critical bug in newfs_hammer
Matthew Dillon [Sun, 8 Jun 2008 17:19:09 +0000 (17:19 +0000)]
HAMMER Utilities: Critical bug in newfs_hammer

* newfs_hammer was not properly setting up the small-data zone.

15 years agoAdd a tunable to enable/disable PBCC support in acx(4); it is enabled
Sepherosa Ziehau [Sun, 8 Jun 2008 10:06:05 +0000 (10:06 +0000)]
Add a tunable to enable/disable PBCC support in acx(4); it is enabled
by default.

15 years agoParallelize in_ifaddrhead operation
Sepherosa Ziehau [Sun, 8 Jun 2008 08:38:06 +0000 (08:38 +0000)]
Parallelize in_ifaddrhead operation

15 years agoAssert that move in directory entry hash table can't fail.
Nicolas Thery [Sun, 8 Jun 2008 07:56:06 +0000 (07:56 +0000)]
Assert that move in directory entry hash table can't fail.

15 years ago- oia is no longer used
Sepherosa Ziehau [Sun, 8 Jun 2008 03:58:03 +0000 (03:58 +0000)]
- oia is no longer used
- Transform for(;;) loop into while() loop

15 years agoMove fetching of "hw.hasbrokenint12" tunable closer to its usage.
Michael Neumann [Sat, 7 Jun 2008 12:30:26 +0000 (12:30 +0000)]
Move fetching of "hw.hasbrokenint12" tunable closer to its usage.

15 years agoCosmetic changes (remove whitespace).
Michael Neumann [Sat, 7 Jun 2008 12:15:33 +0000 (12:15 +0000)]
Cosmetic changes (remove whitespace).

15 years agoRemove unnecessary conversion to kilobytes (divide by 1024) to then later
Michael Neumann [Sat, 7 Jun 2008 12:03:52 +0000 (12:03 +0000)]
Remove unnecessary conversion to kilobytes (divide by 1024) to then later
multiply it again by 1024 to get to bytes.

15 years agoUse NULL instead of 0.
Michael Neumann [Sat, 7 Jun 2008 11:44:04 +0000 (11:44 +0000)]
Use NULL instead of 0.

15 years agoCorrect typos.
Michael Neumann [Sat, 7 Jun 2008 11:37:23 +0000 (11:37 +0000)]
Correct typos.

15 years agoHAMMER 53A/Many: Read and write performance enhancements, etc.
Matthew Dillon [Sat, 7 Jun 2008 07:41:51 +0000 (07:41 +0000)]
HAMMER 53A/Many: Read and write performance enhancements, etc.

* Add hammer_io_direct_read().  For full-block reads this code allows
  a high-level frontend buffer cache buffer associated with the
  regular file vnode to directly access the underlying storage,
  instead of loading that storage via a hammer_buffer and bcopy()ing it.

* Add a write bypass, allowing the frontend to bypass the flusher and
  write full-blocks directly to the underlying storage, greatly improving
  frontend write performance.  Caveat: See note at bottom.

  The write bypass is implemented by adding a feature whereby the frontend
  can soft-reserve unused disk space on the physical media without having
  to interact (much) with on-disk meta-data structures.  This allows the
  frontend to flush high-level buffer cache buffers directly to disk
  and release the buffer for reuse by the system, resulting in very high
  write performance.

  To properly associate the reserved space with the filesystem so it can be
  accessed in later reads, an in-memory hammer_record is created referencing
  it.  This record is queued to the backend flusher for final disposition.
  The backend disposes of the record by inserting the appropriate B-Tree
  element and marking the storage as allocated.  At that point the storage
  becomes official.

* Clean up numerous procedures to support the above new features.  In
  particular, do a major cleanup of the cached truncation offset code
  (this is the code which allows HAMMER to implement wholly asynchronous
  truncate()/ftruncate() support).

  Also clean up the flusher triggering code, removing numerous hacks that
  had been in place to deal with the lack of a direct-write mechanism.

* Start working on statistics gathering to track record and B-Tree
  operations.

* CAVEAT: The backend flusher creates a significant cpu burden when flushing
  a large number of in-memory data records.  Even though the data itself
  has already been written to disk, there is currently a great deal of
  overhead involved in manipulating the B-Tree to insert the new records.
  Overall write performance will only be modestly improved until these
  code paths are optimized.

15 years agoUse ASSERT_IFAC_VALID whenever possible
Sepherosa Ziehau [Sat, 7 Jun 2008 07:22:22 +0000 (07:22 +0000)]
Use ASSERT_IFAC_VALID whenever possible

15 years agoAdd ASSERT_IFAC_VALID
Sepherosa Ziehau [Sat, 7 Jun 2008 06:34:57 +0000 (06:34 +0000)]
Add ASSERT_IFAC_VALID

15 years ago- Expose ifa_forwardmsg()
Sepherosa Ziehau [Sat, 7 Jun 2008 04:59:01 +0000 (04:59 +0000)]
- Expose ifa_forwardmsg()
- Add ifa_domsg()

# They will be needed soon

15 years agoDon't use NULL where 0 is meant.
Sascha Wildner [Fri, 6 Jun 2008 13:19:25 +0000 (13:19 +0000)]
Don't use NULL where 0 is meant.

15 years agoMake sure that ifac is still valid before unlinking it from or linking it to ifnet
Sepherosa Ziehau [Fri, 6 Jun 2008 12:35:27 +0000 (12:35 +0000)]
Make sure that ifac is still valid before unlinking it from or linking it to ifnet

15 years agoAdd periodic rf calibration support for acx111 part. This seems to stabilize
Sepherosa Ziehau [Fri, 6 Jun 2008 10:47:14 +0000 (10:47 +0000)]
Add periodic rf calibration support for acx111 part.  This seems to stabilize
performance during long time TX stress.

15 years ago* Fix some cases where NULL was used but 0 was meant (and vice versa).
Sascha Wildner [Thu, 5 Jun 2008 18:06:33 +0000 (18:06 +0000)]
* Fix some cases where NULL was used but 0 was meant (and vice versa).

* Remove some bogus casts of NULL to (void *).

15 years agoRemove some unneeded definitions of NULL.
Sascha Wildner [Thu, 5 Jun 2008 18:01:49 +0000 (18:01 +0000)]
Remove some unneeded definitions of NULL.

15 years agoInclude <sys/_null.h> for the definition of NULL.
Sascha Wildner [Thu, 5 Jun 2008 17:53:10 +0000 (17:53 +0000)]
Include <sys/_null.h> for the definition of NULL.

15 years agoAdd <sys/_null.h> for the definition of NULL:
Sascha Wildner [Thu, 5 Jun 2008 17:49:53 +0000 (17:49 +0000)]
Add <sys/_null.h> for the definition of NULL:

[...]
#ifndef __cplusplus
#define NULL ((void *)0)
#else
#define NULL 0
#endif
[...]

15 years agoAdd rt_cpuid, which records rtentry's owning CPU id. It could ease route
Sepherosa Ziehau [Thu, 5 Jun 2008 15:29:47 +0000 (15:29 +0000)]
Add rt_cpuid, which records rtentry's owning CPU id.  It could ease route
entry related debugging and sanity checks.

15 years agoFix bugs in spin_trylock_wr():
Nicolas Thery [Wed, 4 Jun 2008 04:34:54 +0000 (04:34 +0000)]
Fix bugs in spin_trylock_wr():

- globaldata.gd_spinlock_wr was not decremented back on failure;

- incorrect comparison in loop trying to clear cached shared bits (loop
  must fail if spinlock is still held for read by another cpu).

Reviewed-by: dillon@
15 years agoHAMMER 52/Many: Read-only mounts and mount upgrades/downgrades.
Matthew Dillon [Tue, 3 Jun 2008 18:47:25 +0000 (18:47 +0000)]
HAMMER 52/Many: Read-only mounts and mount upgrades/downgrades.

* Finish implementing MNT_UPDATE, allowing a HAMMER mount to be upgraded
  or downgraded.

* Adjust the recovery code to not flush buffers dirtied by recovery
  operations (running the UNDOs) when the mount is read-only.  The
  buffers will be flushed when the mount is updated to read-write.

* Improve recovery performance by not flushing dirty buffers until the
  end (if a read-write mount).

* A crash which occurs during recovery might cause the next recovery to
  fail.  Delay writing out the recovered volume header until all the other
  buffers have been written out to fix the problem.

15 years agoHAMMER Utilities: Enhance mount_hammer
Matthew Dillon [Tue, 3 Jun 2008 18:43:34 +0000 (18:43 +0000)]
HAMMER Utilities: Enhance mount_hammer

* Allow devices to be specified as dev:dev:dev, so a multi-volume hammer
  mount can be specified in /etc/fstab.

* Implement -u (mount update)
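The dev:dev:dev syntax described above amounts to splitting a colon-separated list. The following is a minimal sketch of such a split, not the actual mount_hammer code; `split_volumes()` and `MAX_VOLUMES` are hypothetical names, and the device paths in the usage note are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

#define MAX_VOLUMES 8	/* hypothetical limit for this sketch */

/*
 * Split a "dev:dev:dev" specification in place.  Pointers to the
 * individual device names are stored in vols[] and the number of names
 * found is returned.  Empty fields are skipped by strtok_r().
 */
static int
split_volumes(char *spec, char *vols[], int max)
{
	char *save = NULL;
	char *tok;
	int n = 0;

	assert(spec != NULL && vols != NULL);
	for (tok = strtok_r(spec, ":", &save);
	     tok != NULL && n < max;
	     tok = strtok_r(NULL, ":", &save)) {
		vols[n++] = tok;
	}
	return n;
}
```

Given a writable copy of "/dev/ad0s1d:/dev/ad1s1d:/dev/ad2s1d", the helper fills three entries, one per volume.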

15 years agoDo not update f_offset on EINVAL.
Matthew Dillon [Tue, 3 Jun 2008 16:16:40 +0000 (16:16 +0000)]
Do not update f_offset on EINVAL.

Reported-by: VOROSKOI Andras <voroskoi@gmail.com>
15 years agomdoc cleanup
Sascha Wildner [Tue, 3 Jun 2008 12:40:09 +0000 (12:40 +0000)]
mdoc cleanup

15 years agoFix a crash when "Arctic Ocean" was selected.
Sascha Wildner [Tue, 3 Jun 2008 09:33:27 +0000 (09:33 +0000)]
Fix a crash when "Arctic Ocean" was selected.

Taken-from: FreeBSD

15 years agoHAMMER Utilities: More pre-formatting, cleanup
Matthew Dillon [Tue, 3 Jun 2008 06:20:30 +0000 (06:20 +0000)]
HAMMER Utilities: More pre-formatting, cleanup

* Fully initialize the large-data and small-data blockmaps in addition
  to the B-Tree blockmap.

* Set vol0_stat_bigblocks properly so used space shows as 0, or otherwise
  a fairly small number, when the volume is empty.

* Display the total amount of space pre-allocated by newfs_hammer for
  the boot-area, memory-log, undo-buffer, and blockmap infrastructure.

15 years agoAdd 'options HAMMER' to LINT.
Sascha Wildner [Mon, 2 Jun 2008 20:40:07 +0000 (20:40 +0000)]
Add 'options HAMMER' to LINT.

Noticed-by: Dionysus Blazakis <dion.blazakis@gmail.com>
15 years agoHAMMER 51/Many: Filesystem full casework, nohistory flag.
Matthew Dillon [Mon, 2 Jun 2008 20:19:03 +0000 (20:19 +0000)]
HAMMER 51/Many: Filesystem full casework, nohistory flag.

* Track the amount of unsynced information and return ENOSPC if the
  filesystem would become full.  The idea here is to detect that the
  filesystem is full and yet still give the flusher enough runway to
  flush cached dirty data and inodes.

* Implement the NOHISTORY flag.  Implement inheritance of NOHISTORY and
  NODUMP.

  The NOHISTORY flag tells HAMMER not to retain historical information on
  a filesystem object.  If set on a directory any objects created in that
  directory will also inherit the flag.  For example, it could be set
  on /usr/obj.
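The ENOSPC check described in the first item can be pictured as simple accounting: refuse new work once cached dirty data plus the request would eat into a reserve kept for the flusher. This is only a sketch under that assumption; `reserve_check()` and all of its parameters are hypothetical and do not reflect the actual HAMMER data structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if request_bytes may be accepted, or ENOSPC if accepting it
 * would leave less than runway_bytes of free space for the flusher.
 * All quantities are hypothetical inputs for this sketch.
 */
static int
reserve_check(uint64_t free_bytes, uint64_t unsynced_bytes,
    uint64_t request_bytes, uint64_t runway_bytes)
{
	uint64_t usable;

	if (free_bytes < runway_bytes)
		return ENOSPC;
	usable = free_bytes - runway_bytes;
	/* Written as subtractions to avoid overflow in the sums. */
	if (unsynced_bytes > usable ||
	    request_bytes > usable - unsynced_bytes)
		return ENOSPC;
	return 0;
}
```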

15 years agoReport the nohistory, noshistory, and nouhistory flags, and allow them
Matthew Dillon [Mon, 2 Jun 2008 20:17:07 +0000 (20:17 +0000)]
Report the nohistory, noshistory, and nouhistory flags, and allow them
to be specified with chflags.

Set the inverted field for nosunlink, nosunlnk, nouunlink, nouulnk.  This
is an inverted flag.

15 years agoAdd the UF_NOHISTORY and SF_NOHISTORY chflags flags. The nohistory flag
Matthew Dillon [Mon, 2 Jun 2008 20:13:38 +0000 (20:13 +0000)]
Add the UF_NOHISTORY and SF_NOHISTORY chflags flags.  The nohistory flag
allows you to mount a HAMMER filesystem normally and still specify that
certain files and directories not retain historical information.

If set on a directory the flag will be inherited by any objects
created within that directory.

Adjust UFS to inherit the NOHISTORY and NODUMP flags from the parent
directory when creating new objects.  NODUMP used to be ufs-specific
and UFS's backup/restore remembered the inheritance.  As a more generic
flag it needs to be inherited within the filesystem.  Note that UFS
has no historical retention capability and ignores the NOHISTORY flag,
but we might use it with the journaling audit trail later.
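The inheritance rule described above can be sketched as a mask applied to the parent directory's flags. The bit values below are placeholders chosen for the example (the real definitions live in <sys/stat.h>), and `new_object_flags()` is a hypothetical helper, not the UFS code.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only. */
#define EX_NODUMP	0x00000001u
#define EX_NOHISTORY	0x00000040u
#define EX_INHERITED	(EX_NODUMP | EX_NOHISTORY)

static_assert((EX_NODUMP & EX_NOHISTORY) == 0, "flags must not overlap");

/*
 * Compute the flags for a newly created object: whatever the creator
 * asked for, plus the inheritable flags set on the parent directory.
 */
static uint32_t
new_object_flags(uint32_t parent_flags, uint32_t requested_flags)
{
	return requested_flags | (parent_flags & EX_INHERITED);
}
```

Flags outside the inherited mask are dropped, so only NODUMP and NOHISTORY propagate from the parent.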

15 years agoDisallow negative seek positions for regular files, directories, and
Matthew Dillon [Mon, 2 Jun 2008 20:06:36 +0000 (20:06 +0000)]
Disallow negative seek positions for regular files, directories, and
character-special devices to conform to OpenGroup specifications.

Reported-by: VOROSKOI Andras <voroskoi@gmail.com>
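From userland, the behaviour this change conforms to can be observed with a short test. This is a sketch, assuming a POSIX system with a writable temporary directory; `try_negative_seek()` is a hypothetical helper written for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Position a temporary regular file at offset 10, then try to seek to
 * offset -1.  Returns the errno reported by the failed lseek() (0 if it
 * unexpectedly succeeded, -1 on setup failure) and stores the offset
 * observed afterwards in *after.
 */
static int
try_negative_seek(off_t *after)
{
	FILE *fp = tmpfile();
	int fd, error = 0;

	assert(after != NULL);
	if (fp == NULL)
		return -1;
	fd = fileno(fp);
	if (lseek(fd, 10, SEEK_SET) != 10) {
		fclose(fp);
		return -1;
	}
	errno = 0;
	if (lseek(fd, -1, SEEK_SET) == (off_t)-1)
		error = errno;
	/* A failed seek should leave the file offset where it was. */
	*after = lseek(fd, 0, SEEK_CUR);
	fclose(fp);
	return error;
}
```

On a conforming system the helper is expected to report EINVAL and leave the offset at 10.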
15 years agoAdd missing exit(1).
Matthew Dillon [Mon, 2 Jun 2008 20:03:22 +0000 (20:03 +0000)]
Add missing exit(1).

Reported-by: Johannes Hofmann <hofmann@blob.baaderstrasse.com>
15 years agoHAMMER Utilities: Correct vol0_stat_freebigblocks.
Matthew Dillon [Mon, 2 Jun 2008 16:57:53 +0000 (16:57 +0000)]
HAMMER Utilities: Correct vol0_stat_freebigblocks.

* The root_vol->ondisk->vol0_stat_freebigblocks was not being properly
  decremented when newfs_hammer allocated big blocks, causing it to report
  a value that is too large.

15 years agoFix kernel compile warnings.
Matthew Dillon [Mon, 2 Jun 2008 16:55:08 +0000 (16:55 +0000)]
Fix kernel compile warnings.

15 years agoEven using the objcache we need a one-per-cpu free-thread cache in order
Matthew Dillon [Mon, 2 Jun 2008 16:54:21 +0000 (16:54 +0000)]
Even using the objcache we need a one-per-cpu free-thread cache in order
to keep an exiting thread intact throughout its exit sequence.

This fixes a double-fault which can occur on shutdown.  The bug was mainly
tickled by exiting kernel threads.

15 years agoAccording to SUSv3 including just regex.h must be enough. Fixes build of
Hasso Tepper [Mon, 2 Jun 2008 06:50:08 +0000 (06:50 +0000)]
According to SUSv3 including just regex.h must be enough. Fixes build of
several pkgsrc packages.

15 years agoUnbreak buildworld.
Hasso Tepper [Mon, 2 Jun 2008 06:42:45 +0000 (06:42 +0000)]
Unbreak buildworld.

15 years agoHAMMER 50/Many: VFS_STATVFS() support, stabilization.
Matthew Dillon [Sun, 1 Jun 2008 21:05:39 +0000 (21:05 +0000)]
HAMMER 50/Many: VFS_STATVFS() support, stabilization.

* Add support for VFS_STATVFS(), returning 64 bit quantities for available
  space, etc.

* When freeing a big-block any holes cached for that block must be
  cleaned out.

* Fix a conditional testing whether a layer2 big-block must be allocated in
  layer1.  The bug could only occur if a layer2 big-block gets freed in
  layer1, and we currently never do this.

* Clean-up comments related to freeing blocks.

15 years agoHAMMER Utilities: Performance adjustments, bug fixes.
Matthew Dillon [Sun, 1 Jun 2008 20:59:29 +0000 (20:59 +0000)]
HAMMER Utilities: Performance adjustments, bug fixes.

* Newfs_hammer now pre-allocates the layer1 and layer2 blockmap blocks,
  and pre-sizes each blockmap to 4x the initial filesystem size instead
  of 100x the initial filesystem size.

  The blockmap can be dynamically resized at any time, given a little code.
  In addition, there is simply no need to give it a 100x initial dynamic
  range.  This only bloats the size of the layer-2 map unnecessarily.

* Change alloc_blockmap() to use rootmap->next_offset for allocations
  instead of rootmap->alloc_offset and fix a bug where rootmap->phys_offset
  was improperly being incremented (it is a fixed field once set).

  The bug was in a code-path that could not be executed by current
  incarnations of newfs_hammer.

15 years agoUse newly available libc and system calls related to statvfs to make df
Matthew Dillon [Sun, 1 Jun 2008 20:52:21 +0000 (20:52 +0000)]
Use newly available libc and system calls related to statvfs to make df
work with 64 bit statvfs fields.  Tested with a 4TB VN-backed HAMMER
filesystem.

15 years agoAdd getmntvinfo() which uses the new getvfsstat() system call.
Matthew Dillon [Sun, 1 Jun 2008 20:46:45 +0000 (20:46 +0000)]
Add getmntvinfo() which uses the new getvfsstat() system call.

15 years agoMore header file cleanups related to statvfs.
Matthew Dillon [Sun, 1 Jun 2008 20:44:45 +0000 (20:44 +0000)]
More header file cleanups related to statvfs.

15 years agoClean up statvfs() and related prototypes. Place the prototypes in the
Matthew Dillon [Sun, 1 Jun 2008 20:18:03 +0000 (20:18 +0000)]
Clean up statvfs() and related prototypes.  Place the prototypes in the
correct file.

15 years agoImplement a new system call: getvfsstat(). This system call returns
Matthew Dillon [Sun, 1 Jun 2008 19:55:32 +0000 (19:55 +0000)]
Implement a new system call: getvfsstat().  This system call returns
an array of statfs and statvfs structures.  Unfortunately there is no way
to just return an array of statvfs structures because the statvfs structure
does not have sufficient information in it to identify the mount point.

    int getvfsstat(struct statfs *buf, struct statvfs *vbuf,
       long vbufsize, int flags);

15 years ago* Implement new system calls in the kernel: statvfs(), fstatvfs(),
Matthew Dillon [Sun, 1 Jun 2008 19:27:37 +0000 (19:27 +0000)]
* Implement new system calls in the kernel:  statvfs(), fstatvfs(),
  fhstatvfs().

* Implement a new VFS op, VFS_STATVFS().  Implement a default for this new
  op for VFSs which do not implement VFS_STATVFS(), which calls VFS_STATFS()
  and converts the structure (using Joerg's conversion procedure from libc).

* Remove statvfs(), fstatvfs(), and fhstatvfs() from libc.  These functions
  are now system calls.

15 years agoRaise the size of the /etc MFS to 12MB (for ssh blacklists).
Sascha Wildner [Sun, 1 Jun 2008 08:54:18 +0000 (08:54 +0000)]
Raise the size of the /etc MFS to 12MB (for ssh blacklists).

15 years ago- Rename ifa_portfn() to ifnet_portfn()
Sepherosa Ziehau [Sun, 1 Jun 2008 08:09:14 +0000 (08:09 +0000)]
- Rename ifa_portfn() to ifnet_portfn()
- Create inline function ifa_portfn() which simply calls ifnet_portfn()

15 years agoRename:
Sepherosa Ziehau [Sun, 1 Jun 2008 07:44:37 +0000 (07:44 +0000)]
Rename:
ifaddrinit -> ifnetinit
ifaddr_threads -> ifnet_threads

15 years agoAvoid code duplication
Sepherosa Ziehau [Sun, 1 Jun 2008 07:43:29 +0000 (07:43 +0000)]
Avoid code duplication

15 years agoacx111 parts can't send using short slot time, but they seem to have no problem
Sepherosa Ziehau [Sun, 1 Jun 2008 04:01:24 +0000 (04:01 +0000)]
acx111 parts can't send using short slot time, but they seem to have no problem
receiving packets sent using short slot time.  Turn on short slot time
support, so that we don't prevent other STA from using short slot time.

15 years agoUse 1Mbits/s as beacon sending rate; it seems to fix TX performance issue
Sepherosa Ziehau [Sun, 1 Jun 2008 03:58:38 +0000 (03:58 +0000)]
Use 1Mbits/s as beacon sending rate; it seems to fix TX performance issue
when using acx(4) as hostap.

15 years agoHAMMER Utilities: New utility 'undo'.
Matthew Dillon [Sun, 1 Jun 2008 02:03:10 +0000 (02:03 +0000)]
HAMMER Utilities: New utility 'undo'.

* Add a new utility called 'undo' which makes use of HAMMER capabilities
  to retrieve prior versions of a file, even if that file has been deleted.

  This utility can dump all prior versions of the file, generate a history
  of transaction ids associated with the file, has a 'quick diff' relative
  to the most recent change, and can also generate diffs for all versions
  of the file.

  This utility only works with HAMMER filesystems.

15 years agoHAMMER Utilities: Cleanup
Matthew Dillon [Sun, 1 Jun 2008 01:33:58 +0000 (01:33 +0000)]
HAMMER Utilities: Cleanup

* Cleanup the softprune code a bit.

15 years agoHAMMER 49B/Many: Stabilization pass
Matthew Dillon [Sun, 1 Jun 2008 01:33:25 +0000 (01:33 +0000)]
HAMMER 49B/Many: Stabilization pass

* Fix range checks in the pruning ioctl.

* Fix an incorrect assertion in hammer_vop_strategy_read().

15 years agoHAMMER Utilities: Add the 'hammer softprune' command.
Matthew Dillon [Sat, 31 May 2008 18:45:04 +0000 (18:45 +0000)]
HAMMER Utilities: Add the 'hammer softprune' command.

Add a new hammer pruning command called 'hammer softprune'.  This command
is much simpler to use than the 'hammer prune' command.  You simply specify
a directory containing softlinks to HAMMER snapshots, typically in the form:
"<path_to_hammer_filesystem>/@@0x<16-char-transaction_id>".  The command
will scan the directory non-recursively, collect all the softlinks, extract
the transaction ids, and prune the HAMMER filesystem to contain only those
snapshots.
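Extracting a transaction id from a softlink target of the form shown above could look roughly like this. It is a sketch only, not the softprune implementation; `parse_snapshot_tid()` is a hypothetical helper and the path in the usage note is made up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a target of the form "<path>/@@0x<16 hex digits>".
 * Returns 0 and stores the id in *tid on success, or -1 if the first
 * "@@0x" in the target is not followed by exactly 16 hex digits.
 */
static int
parse_snapshot_tid(const char *target, uint64_t *tid)
{
	const char *p;

	assert(target != NULL && tid != NULL);
	p = strstr(target, "@@0x");
	if (p == NULL)
		return -1;
	p += 4;				/* skip "@@0x" */
	if (strlen(p) != 16 ||
	    strspn(p, "0123456789abcdefABCDEF") != 16)
		return -1;
	/* 16 hex digits always fit in 64 bits, so no overflow check. */
	*tid = (uint64_t)strtoull(p, NULL, 16);
	return 0;
}
```

For example, "/mnt/hammer/@@0x00000001061a8ba0" yields the id 0x00000001061a8ba0, while a target without the "@@0x" marker is rejected.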

In addition, information created before the snapshot softlink with the
lowest transaction id is destroyed and information created after the
softlink with the highest transaction id is retained (remains fine-grained).

This gives the administrator an easy way to maintain official snapshots
while at the same time retaining our 'undo' capability by leaving recent
modifications intact.

A simple cron job or script coupled with the use of the 'hammer synctid'
can be used to create a snapshot softlink every so often, and older
snapshots can be cleaned out or thinned simply by removing the associated
softlinks, and then re-running 'hammer softprune' on the directory
containing the softlinks.

Unlike the 'hammer prune' command, the softprune command does not require
the time ranges for snapshots to be well-ordered.

15 years agoHAMMER 49/Many: Enhance pruning code
Matthew Dillon [Sat, 31 May 2008 18:37:57 +0000 (18:37 +0000)]
HAMMER 49/Many: Enhance pruning code

* Pass the element array in as a pointer rather than embedding it in
  the hammer_ioc_prune structure.

* Adjust the modulo calculations to allow non-aligned snapshots to be
  pruned, to support the new 'hammer softprune' command.

15 years agoReduce log verbosity
Sepherosa Ziehau [Sat, 31 May 2008 13:12:59 +0000 (13:12 +0000)]
Reduce log verbosity

15 years agomdoc cleanup
Sascha Wildner [Sat, 31 May 2008 12:04:15 +0000 (12:04 +0000)]
mdoc cleanup

15 years agoUse same naming convention as other host controller stats
Sepherosa Ziehau [Sat, 31 May 2008 11:18:09 +0000 (11:18 +0000)]
Use same naming convention as other host controller stats

15 years ago- Avoid excessive goto
Sepherosa Ziehau [Sat, 31 May 2008 08:29:05 +0000 (08:29 +0000)]
- Avoid excessive goto
- Adjust arphdr pointer, if we need to do another m_pullup()
- Indentation in switch block

Obtained-from: FreeBSD w/ mod

15 years agoAdd ifa_listmask field in ifaddr_container; currently it is mainly used
Sepherosa Ziehau [Sat, 31 May 2008 06:03:26 +0000 (06:03 +0000)]
Add ifa_listmask field in ifaddr_container; currently it is mainly used
to do sanity checks.

15 years agoAdd some missing manual pages: wcscoll(3), wcswidth(3), wcsxfrm(3), and
Sascha Wildner [Sat, 31 May 2008 04:51:55 +0000 (04:51 +0000)]
Add some missing manual pages: wcscoll(3), wcswidth(3), wcsxfrm(3), and
wcwidth(3)

Taken-from: NetBSD

15 years agoMention that -W too only works with -p or -d.
Sascha Wildner [Fri, 30 May 2008 22:58:08 +0000 (22:58 +0000)]
Mention that -W too only works with -p or -d.

15 years agoMinor corrections.
Sascha Wildner [Fri, 30 May 2008 22:51:31 +0000 (22:51 +0000)]
Minor corrections.

15 years agoImplement Farnsworth mode.
Simon Schubert [Fri, 30 May 2008 21:47:04 +0000 (21:47 +0000)]
Implement Farnsworth mode.

15 years agoFix typo.
Sascha Wildner [Fri, 30 May 2008 18:00:23 +0000 (18:00 +0000)]
Fix typo.

Noticed-by: Antonio Huete Jimenez <tuxillo@quantumachine.net>
15 years agoInclude sys/select.h to conform to SUS.
Simon Schubert [Fri, 30 May 2008 09:39:46 +0000 (09:39 +0000)]
Include sys/select.h to conform to SUS.

Noticed-by: hasso@
15 years agoFix type name as well.
Simon Schubert [Fri, 30 May 2008 08:07:22 +0000 (08:07 +0000)]
Fix type name as well.

Noticed-by: hasso@
15 years agoFix macro name.
Simon Schubert [Fri, 30 May 2008 08:05:49 +0000 (08:05 +0000)]
Fix macro name.

Noticed-by: hasso@
15 years agoAdd brief description about the recently added interrupt moderation sysctl nodes
Sepherosa Ziehau [Thu, 29 May 2008 13:30:15 +0000 (13:30 +0000)]
Add brief description about the recently added interrupt moderation sysctl nodes

15 years ago- Rename bce_init_context() to bce_init_ctx()
Sepherosa Ziehau [Wed, 28 May 2008 13:53:42 +0000 (13:53 +0000)]
- Rename bce_init_context() to bce_init_ctx()
- Clear out all quick context entries.  This fixes the watchdog timeout, which
  usually happens after many tiny packets are transmitted

Obtained-from: FreeBSD if_bce.c rev 1.36

# FreeBSD didn't mention a _single_ word of this change and its effect in the
# commit log.  The description of the fix and its effect is obtained from
# Linux bnx2 commit log.

15 years ago- ifnet.if_output() should be called without ifnet.if_serializer being
Sepherosa Ziehau [Wed, 28 May 2008 12:11:13 +0000 (12:11 +0000)]
- ifnet.if_output() should be called without ifnet.if_serializer being
  held.  Add assertion about it in ether_output().
- ether_output_frame() should be called without the ifnet.if_serializer
  being held.  Add assertion in it.
- arp_ifinit() will be called with ifnet.if_serializer being held.  To
  prevent serializer from recursion, ifnet.if_serializer is released
  before calling arprequest(), which supposes caller does not hold output
  iface's serializer.
- ifnet.if_serializer can't be held when calling arp_ifinit2().

Reported-by: dillon@
15 years ago- Add tunables and sysctl nodes for interrupt moderation variables.
Sepherosa Ziehau [Wed, 28 May 2008 10:51:56 +0000 (10:51 +0000)]
- Add tunables and sysctl nodes for interrupt moderation variables.
  Settings are committed to device during device initialization or in
  interrupt routine.
- The default values of the interrupt moderation variables from Broadcom's
  driver seem to be misconfigured.  The following changes are made:
  Send max coalesced BD count 20 -> 24
  Send coalescing ticks 80 -> 1000 (~1000Hz)
  Receive max coalesced BD count 6 -> 24
  Receive coalescing ticks 18 -> 125 (~8000Hz)
  The two changes to the "Receive" interrupt moderation variables avoid
  livelock.
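The approximate rates quoted above follow from the coalescing intervals if the ticks are taken to be microseconds. That unit is an assumption inferred from the ~1000Hz and ~8000Hz figures, and `ticks_to_hz()` is a hypothetical helper for the arithmetic, not driver code.

```c
#include <assert.h>

/*
 * Convert a coalescing interval, assumed to be in microseconds, to the
 * approximate maximum interrupt rate in Hz.
 */
static unsigned int
ticks_to_hz(unsigned int ticks_us)
{
	assert(ticks_us > 0);
	return 1000000u / ticks_us;
}
```

Under that assumption 1000 ticks gives 1000 Hz and 125 ticks gives 8000 Hz, while the old values of 80 and 18 ticks would allow roughly 12500 and 55555 interrupts per second.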

15 years agoInclude sys/fd_set.h in the BSD_VISIBLE case.
Simon Schubert [Wed, 28 May 2008 10:37:25 +0000 (10:37 +0000)]
Include sys/fd_set.h in the BSD_VISIBLE case.

Seems that a lot of software assumes that fd_set will be defined after
including sys/types.h, so restore this property.

Noted-by: hasso@
15 years agoMove definition of fd_set to sys/fd_set.h.
Simon Schubert [Wed, 28 May 2008 10:35:11 +0000 (10:35 +0000)]
Move definition of fd_set to sys/fd_set.h.

15 years agoGenerate a semi-random MAC address when connecting to a SOCK_SEQPACKET
Matthew Dillon [Tue, 27 May 2008 23:44:46 +0000 (23:44 +0000)]
Generate a semi-random MAC address when connecting to a SOCK_SEQPACKET
socket (ala vknetd's /dev/vknet) instead of a TAP interface.
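One common way to build such an address is to fill it with pseudo-random bytes and then force the unicast and locally-administered bits. The sketch below shows that general technique only; it is not claimed to be how the vkernel generates its address, and `random_mac()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Fill mac[6] with pseudo-random bytes derived from seed, then clear
 * the multicast bit and set the locally-administered bit in the first
 * octet so the result cannot collide with a vendor-assigned address.
 */
static void
random_mac(uint8_t mac[6], unsigned int seed)
{
	int i;

	assert(mac != NULL);
	srand(seed);
	for (i = 0; i < 6; i++)
		mac[i] = (uint8_t)(rand() & 0xff);
	mac[0] &= 0xfe;		/* unicast */
	mac[0] |= 0x02;		/* locally administered */
}
```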

15 years agoImplement a new utility called vknet. This utility interconnects the
Matthew Dillon [Tue, 27 May 2008 23:26:38 +0000 (23:26 +0000)]
Implement a new utility called vknet.  This utility interconnects the
networks between the local machine and a remote machine, typically by
connecting to a TAP interface or socket supplied by a running vknetd on
each machine.

Update manual pages for vknetd and vkernel to include vknet.

15 years agoOnly test the IP protocol (ip_p) for IP frames.
Matthew Dillon [Tue, 27 May 2008 22:47:16 +0000 (22:47 +0000)]
Only test the IP protocol (ip_p) for IP frames.

15 years agoFix socketvar.h inclusion by userland. This is a temporary hack and,
Matthew Dillon [Tue, 27 May 2008 17:31:12 +0000 (17:31 +0000)]
Fix socketvar.h inclusion by userland.  This is a temporary hack and,
frankly, a lot more of the header file should be made _KERNEL-only.

Reported-by: Hasso Tepper <hasso@estpak.ee>
15 years agoAdd the notty utility, a program I wrote long ago which I should have
Matthew Dillon [Tue, 27 May 2008 17:10:49 +0000 (17:10 +0000)]
Add the notty utility, a program I wrote long ago which I should have
brought into base long ago.  This program provides a convenient shortcut
to detaching a command from the controlling terminal and running it in the
background.

15 years agoUse standard include path
Sepherosa Ziehau [Tue, 27 May 2008 13:27:35 +0000 (13:27 +0000)]
Use standard include path

15 years agoSync zoneinfo database with tzdata2008c from elsie.
Sascha Wildner [Tue, 27 May 2008 12:08:29 +0000 (12:08 +0000)]
Sync zoneinfo database with tzdata2008c from elsie.

africa:         8.10 -> 8.11
asia:           8.18 -> 8.20

africa - Morocco observes DST in 2008

asia   - Choibalsan is now 8 hours off UTC
       - Pakistan observes DST in 2008

15 years agoOnly collect 'count' packets when polling(4) is used. Set softc cached
Sepherosa Ziehau [Tue, 27 May 2008 12:07:01 +0000 (12:07 +0000)]
Only collect 'count' packets when polling(4) is used.  Set softc cached
RX consumer index to the value we have processed before breaking out of the
loop, so we can come back again upon the next poll.

15 years agoAdd a boot loader tunable hw.usb.hack_defer_exploration, which if set
Michael Neumann [Tue, 27 May 2008 12:00:47 +0000 (12:00 +0000)]
Add a boot loader tunable hw.usb.hack_defer_exploration, which if set
to 0 reverts to the old USB behaviour, i.e. USB keyboards should again be
usable early at boot. By default, this is set to 1, which avoids hanging
the system on qemu, my HP Compaq laptop, and maybe others.

Note that this is a hack around a shortcoming in the current USB stack and
will go away once the shortcoming has been fixed.

15 years ago- Apply same adjustment to softc cached TX/RX BD index
Sepherosa Ziehau [Tue, 27 May 2008 11:39:42 +0000 (11:39 +0000)]
- Apply same adjustment to softc cached TX/RX BD index
- Clear used_tx_bd in bce_free_tx_chain

Obtained-from: FreeBSD if_bce.c rev 1.{36,37}

15 years agoDo not try to set-up the bridge or tap interfaces when connecting to
Matthew Dillon [Tue, 27 May 2008 07:48:00 +0000 (07:48 +0000)]
Do not try to set-up the bridge or tap interfaces when connecting to
a unix domain socket instead of a tap interface.

15 years agoAdd vknetd to the build.
Matthew Dillon [Tue, 27 May 2008 07:46:57 +0000 (07:46 +0000)]
Add vknetd to the build.

15 years agoGet rid of an old and terrible hack. Local stream sockets enqueue packets
Matthew Dillon [Tue, 27 May 2008 05:25:36 +0000 (05:25 +0000)]
Get rid of an old and terrible hack.  Local stream sockets enqueue packets
directly on the peer's sockbuf, rather than the sender's sockbuf.  That
part of the code is fine, but in order to prevent the sender from queueing
infinite mbufs (because its sockbuf appears to be empty when you do that)
the code dynamically messed around with the sender's high water mark.

This blew up the new SOCK_SEQPACKET.  In particular, it blows up the
use of the PR_ATOMIC on stream sockets and can cause spurious EMSGSIZE
errors to be returned instead of the EWOULDBLOCK that should have been
returned.

Also fix, or at least partially fix, the resource limit code which tries to reduce the
high water mark when a user is using too many mbufs.  This never worked
well and still doesn't, but it is in better shape now.

Get rid of the crufty code and simply add a flag to the signalsockbuf,
SSB_STOP, to stop the sender.

Also adjust the vkernel to increase the default socket buffer when
connecting to vknet instead of if_tap.  VKE currently issues non-blocking
writes to vknet/tap and we do not want to lose packets for no good reason.

15 years agoCreate a new daemon called vknetd. This daemon uses the new SOCK_SEQPACKET
Matthew Dillon [Tue, 27 May 2008 01:58:02 +0000 (01:58 +0000)]
Create a new daemon called vknetd.  This daemon uses the new SOCK_SEQPACKET
feature to create a virtualized packet bridge accessible by userland (in
particular, user-run virtual kernels).

15 years ago* Implement SOCK_SEQPACKET sockets for local communications. These sockets
Matthew Dillon [Tue, 27 May 2008 01:10:47 +0000 (01:10 +0000)]
* Implement SOCK_SEQPACKET sockets for local communications.  These sockets
  operate like SOCK_STREAM but each write() builds a record and each read()
  reads a record.  That is, the data is not aggregated together or allowed
  to be partially read.

  This allows local sockets to have the same packetization characteristics
  as if_tap when desired.

* Add a feature to the vkernel which allows a unix domain socket to be
  specified for the network interface rather than a TAP interface.  The
  vkernel will connect to the socket using SOCK_SEQPACKET and read and
  write packets to it.

* Clean up some libc/kernel namespace collisions related to including
  sys/socket.h.
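The record-per-write behaviour of SOCK_SEQPACKET described above can be observed from userland with a local socket pair. This is a minimal sketch, assuming a system that supports SOCK_SEQPACKET on AF_UNIX; `seqpacket_demo()` is a hypothetical helper written for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write two records to one end of a local SOCK_SEQPACKET pair and read
 * them back from the other end.  The size returned by each read() is
 * stored in got[0] and got[1].  Returns 0 on success, -1 on error.
 */
static int
seqpacket_demo(ssize_t got[2])
{
	char buf[64];
	int sv[2];
	int rc = -1;

	assert(got != NULL);
	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;
	if (write(sv[0], "abc", 3) != 3 || write(sv[0], "defgh", 5) != 5)
		goto done;
	/* buf could hold both records, but each read() returns only one. */
	got[0] = read(sv[1], buf, sizeof(buf));
	got[1] = read(sv[1], buf, sizeof(buf));
	if (got[0] >= 0 && got[1] >= 0)
		rc = 0;
done:
	close(sv[0]);
	close(sv[1]);
	return rc;
}
```

With SOCK_STREAM the first read() could return all eight bytes at once; with SOCK_SEQPACKET the two reads return 3 and 5 bytes respectively, preserving the write boundaries.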

15 years agoAllocate lwkt threads from objcache instead of custom per-cpu cache backed
Nicolas Thery [Mon, 26 May 2008 17:11:09 +0000 (17:11 +0000)]
Allocate lwkt threads from objcache instead of custom per-cpu cache backed
by zone.

Reviewed-by: dillon@
15 years agoAvoid panic upon module unloading
Sepherosa Ziehau [Mon, 26 May 2008 14:23:25 +0000 (14:23 +0000)]
Avoid panic upon module unloading

15 years agoAllow a NULL pointer as argument to usb_get_next_event(), and don't
Michael Neumann [Mon, 26 May 2008 14:00:46 +0000 (14:00 +0000)]
Allow a NULL pointer as argument to usb_get_next_event(), and don't
allocate a "struct usb_event" on stack in usb_add_event().

Obtained-from: NetBSD/usb.c 1.83

15 years agoRemove __HAVE_GENERIC_SOFT_INTERRUPTS ifdefs as we don't support the
Michael Neumann [Mon, 26 May 2008 13:56:08 +0000 (13:56 +0000)]
Remove __HAVE_GENERIC_SOFT_INTERRUPTS ifdefs as we don't support the
softintr_* API of NetBSD.

15 years agoFix following possible bugs for SIOCSIFADDR, if in_ifinit() fails
Sepherosa Ziehau [Mon, 26 May 2008 13:29:33 +0000 (13:29 +0000)]
Fix following possible bugs for SIOCSIFADDR, if in_ifinit() fails
Conditions:
   o  ifaceX has an AF_INET ia
   o  SIOCSIFADDR is used to change address, and new address' hash value is
      different from ia's
   o  ia is currently in hash bucket B1
   o  ia is removed from B1 and installed into hash table using new address
      hash value, assume its new hash bucket is B2, and B1 != B2
1) Dangling ia reference in inaddr hash table
   o  ifnet.if_ioctl fails
   o  ia is reinstalled into hash bucket B1, but without being first removed
      from hash bucket B2
   Hash bucket B2 will have a dangling reference to ia
2) ia is left in wrong hash bucket
   o  rtinit fails
   o  ia's address is restored to oldaddr
   ia itself is left in hash bucket indexed by new address's hash value

- In in_ifinit(), if it fails, unlink ia from inaddr hash table instead of
  delaying the unlinking to in_control_internal().  If necessary reinstall
  ia into inaddr hash table with original address
- After the above fix, in_control_internal() needs to unlink ia from inaddr
  only if cmd is SIOCDIFADDR and ia resides in inaddr hash table.  Whether
  ia is in inaddr hash table or not, is currently indicated by ia address's
  family; add XXX comment that this assumption is not good
- Constify 'sin' parameter to in_ifinit()

Reviewed-by: dillon@
15 years agoFix typos and cosmetic changes.
Michael Neumann [Mon, 26 May 2008 13:24:59 +0000 (13:24 +0000)]
Fix typos and cosmetic changes.