Configuration and Tuning
Written by Chern Lee. Based on a tutorial written by Mike Smith. Also based on tuning(7) written by Matt Dillon.
Synopsis
One of the important aspects of &os; is system configuration.
Correct system configuration will help prevent headaches during future upgrades.
This chapter will explain much of the &os; configuration process,
including some of the parameters which
can be set to tune a &os; system.
After reading this chapter, you will know:
How to efficiently work with file systems and swap partitions.
The basics of rc.conf configuration and rc.d startup systems.
How to configure and test a network card.
How to configure virtual hosts on your network devices.
How to use the various configuration files in /etc.
How to tune &os; using sysctl variables.
How to tune disk performance and modify kernel limitations.
Before reading this chapter, you should:
Understand &unix; and &os; basics ().
Be familiar with the basics of kernel configuration/compilation ().
Initial Configuration
Partition Layout
Base Partitions
When laying out file systems with &man.disklabel.8;, remember that hard drives transfer data faster from the outer tracks than from the inner tracks. Thus smaller and more heavily accessed file systems should be closer to the outside of the drive, while larger partitions like /usr should be placed toward the inner tracks. It is a good idea to create partitions in this order: root, swap, /var, /usr.
The size of /var
reflects the intended machine usage.
/var is used to hold
mailboxes, log files, and printer spools. Mailboxes and log
files can grow to unexpected sizes depending
on how many users exist and how long log
files are kept. Most users would never require a gigabyte,
but remember that /var/tmp
must be large enough to contain packages.
The /usr partition holds many of the files required to support the system, the &pkgsrc; collection (recommended) and the source code (optional). At least 2 gigabytes is recommended for this partition.
When selecting partition sizes, keep the space requirements in mind. Running out of space in one partition while barely using another can be a hassle.
Swap Partition
As a rule of thumb, the swap partition should be about double the size of system memory (RAM). For example, if the machine has 128 megabytes of memory, the swap partition should be 256 megabytes. Systems with less memory may perform better with more swap. Less than 256 megabytes of swap is not recommended and memory expansion should be considered.
The kernel's VM paging algorithms are tuned to
perform best when the swap partition is at least two times the
size of main memory. Configuring too little swap can lead to
inefficiencies in the VM page scanning code and might create
issues later if more memory is added.
On larger systems with multiple SCSI disks (or multiple IDE disks operating on different controllers), it is recommended that swap be configured on each drive (up to four drives). The swap partitions should be approximately the same size. The kernel can handle arbitrary sizes, but internal data structures scale to 4 times the largest swap partition. Keeping the swap partitions near the same size will allow the kernel to optimally stripe swap space across disks.
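The sizing rules of thumb above can be turned into a little arithmetic. The following is only a sketch: the function names are made up for illustration, and treating 256 megabytes as a hard floor and the doubled figure as a total to be split across drives are assumptions drawn from the text, not behavior of any system tool.

```shell
#!/bin/sh
# Sketch of the swap sizing rules of thumb; names are illustrative only.

# Suggested total swap in MB: twice RAM, but (by assumption) never
# below the 256 MB minimum mentioned in the text.
recommended_swap_mb() {
    ram_mb=$1
    swap_mb=$((ram_mb * 2))
    if [ "$swap_mb" -lt 256 ]; then
        swap_mb=256
    fi
    echo "$swap_mb"
}

# Split a total evenly across the drives, using at most four of them,
# rounding up so the combined size does not fall short of the total.
swap_per_drive_mb() {
    total_mb=$1
    drives=$2
    if [ "$drives" -gt 4 ]; then
        drives=4
    fi
    echo $(( (total_mb + drives - 1) / drives ))
}
```

For a machine with 512 megabytes of memory and four disks, recommended_swap_mb 512 suggests 1024 megabytes in total, and swap_per_drive_mb 1024 4 suggests four equal partitions of 256 megabytes each.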
Large swap sizes are fine, even if swap is not used much. Ample swap can make it easier to recover from a runaway program before being forced to reboot.
Why Partition?
Some users think a single large partition will be fine, but there are several reasons why this is a bad idea. First, each partition has different operational characteristics, and separating them allows the file system to be tuned accordingly. For example, the root and /usr partitions are read-mostly, with little writing, while a lot of reading and writing can occur in /var and /var/tmp.
By properly partitioning a system, fragmentation introduced in the smaller, write-heavy partitions will not bleed over into the mostly-read partitions. Keeping the write-loaded partitions closer to the disk's edge will increase I/O performance in the partitions where it matters most. While I/O performance in the larger partitions is also important, shifting them toward the edge of the disk will not lead to a significant performance improvement over moving /var to the edge. Finally, there are safety concerns. A smaller, neater root partition which is mostly read-only has a greater chance of surviving a bad crash.
Core Configuration
The principal location for system configuration information
is within /etc/rc.conf. This file
contains a wide range of configuration information, principally
used at system startup to configure the system. Its name
directly implies this; it is configuration information for the
rc* files.
An administrator should make entries in the rc.conf file to override the default settings from /etc/defaults/rc.conf. The defaults file should not be copied verbatim to /etc - it contains default values, not examples. All system-specific changes should be made in the rc.conf file itself.
A number of strategies may be applied in clustered
applications to separate site-wide configuration from
system-specific configuration in order to keep administration
overhead down. The recommended approach is to place site-wide
configuration into another file,
such as /etc/rc.conf.site, and then include
this file into /etc/rc.conf, which will
contain only system-specific information.
As rc.conf is read by &man.sh.1;, it is trivial to achieve this. For example:
rc.conf:
. /etc/rc.conf.site
hostname="node15.example.com"
network_interfaces="fxp0 lo0"
ifconfig_fxp0="inet 10.1.1.1"
rc.conf.site:
defaultrouter="10.1.1.254"
saver="daemon"
blanktime="100"
The rc.conf.site file can then be distributed to every system using rsync or a similar program, while the rc.conf file remains unique.
Upgrading the system using make world will not overwrite the rc.conf file, so system configuration information will not be lost.
Application Configuration
Typically, installed applications have their own
configuration files, with their own syntax, etc. It is
important that these files be kept separate from the base
system, so that they may be easily located and managed by the
package management tools.
Typically, these files are installed in /usr/pkg/etc. In the case where an application has a large number of configuration files, a subdirectory will be created to hold them.
Normally, when a port or package is installed, sample configuration files are also installed. These are usually in /usr/pkg/share/examples/PACKAGENAME. If there are no existing configuration files for the application, they will be created by copying the .default files.
For example, consider the contents of the directory /usr/pkg/etc/httpd:
-rw-r--r-- 1 root wheel 43570 Aug 20 15:26 httpd.conf
-rw-r--r-- 1 root wheel 12965 Aug 20 15:26 magic
-rw-r--r-- 1 root wheel 15020 Aug 20 15:26 mime.types
If you modify any of these files, for example httpd.conf, a later update of the Apache port will not overwrite the changed file.
Starting Services
It is common for a system to host a number of services.
These may be started in several different fashions, each having
different advantages.
Software installed from a port or the packages collection will often place a script in /usr/pkg/etc/rc.d which is invoked at system startup with a start argument, and at system shutdown with a stop argument.
This is the recommended way for
starting system-wide services that are to be run as
root, or that
expect to be started as root.
These scripts are registered as
part of the installation of the package, and will be removed
when the package is removed.
A generic startup script in /usr/pkg/etc/rc.d looks like:
#!/bin/sh
echo -n ' FooBar'
case "$1" in
start)
/usr/pkg/bin/foobar
;;
stop)
kill -9 `cat /var/run/foobar.pid`
;;
*)
echo "Usage: `basename $0` {start|stop}" >&2
exit 64
;;
esac
exit 0
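The stop branch of the generic script sends SIGKILL straight away, which gives the daemon no chance to clean up. A gentler variation is sketched below. It assumes the daemon writes its process ID to a PID file, and the function name stop_by_pidfile and the default five-second grace period are hypothetical choices, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: stop a daemon identified by a PID file, trying
# SIGTERM first and falling back to SIGKILL after a grace period.
stop_by_pidfile() {
    pidfile=$1
    grace=${2:-5}                  # seconds to wait before SIGKILL
    [ -r "$pidfile" ] || return 1  # no readable PID file: nothing to do
    pid=$(cat "$pidfile")
    # Ask politely; if the signal cannot be sent, the process is gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    while [ "$grace" -gt 0 ]; do
        # kill -0 only tests whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    # Still there after the grace period: force it.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

In the sample script, the stop branch could then call stop_by_pidfile /var/run/foobar.pid instead of running kill -9 directly.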
The startup scripts of &os; will look in /usr/pkg/etc/rc.d for scripts that have an .sh extension and are executable by root. Those scripts that are found are called with a start option at startup, and a stop option at shutdown, to allow them to carry out their purpose. So if you wanted the above sample script to be picked up and run at the proper time during system startup, you should save it to a file called FooBar.sh in /usr/pkg/etc/rc.d and make sure it is executable. You can make a shell script executable with &man.chmod.1; as shown below:
&prompt.root; chmod 755 FooBar.sh
Some services expect to be invoked by &man.inetd.8; when a
connection is received on a suitable port. This is common for
mail reader servers (POP and IMAP, etc.). These services are
enabled by editing the file /etc/inetd.conf.
See &man.inetd.8; for details on editing this file.
Some additional system services may not be covered by the toggles in /etc/rc.conf. These are traditionally enabled by placing the command(s) to invoke them in /etc/rc.local (which does not exist by default). Note that rc.local is generally regarded as the location of last resort; if there is a better place to start a service, do it there.
Do not place any commands in /etc/rc.conf. To start daemons, or run any commands at boot time, place a script in /usr/pkg/etc/rc.d instead.
It is also possible to use the &man.cron.8; daemon to start system services. This approach has a number of advantages, not least being that because &man.cron.8; runs these processes as the owner of the crontab, services may be started and maintained by non-root users.
This takes advantage of a feature of &man.cron.8;: the time specification may be replaced by @reboot, which will cause the job to be run when &man.cron.8; is started shortly after system boot.
Configuring the cron Utility
Contributed by Tom Rhodes.
One of the most useful utilities in &os; is &man.cron.8;. The
cron utility runs in the background and constantly
checks the /etc/crontab file. The cron
utility also checks the /var/cron/tabs directory, in
search of new crontab files. These
crontab files store information about specific
functions which cron is supposed to perform at
certain times.
The cron utility uses two different types of configuration files, the system crontab and user crontabs. The only difference between these two formats is the sixth field. In the system crontab, the sixth field is the name of a user for the command to run as. This gives the system crontab the ability to run commands as any user. In a user crontab, the sixth field is the command to run, and all commands run as the user who created the crontab; this is an important security feature.
User crontabs allow individual users to schedule tasks without the need for root privileges. Commands in a user's crontab run with the permissions of the user who owns the crontab.
The root user can have a user crontab just like any other user. This one is different from /etc/crontab (the system crontab). Because of the system crontab, there is usually no need to create a user crontab for root.
Let us take a look at the /etc/crontab file (the system crontab):
# /etc/crontab - root's crontab for &os;
#
#
#
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
HOME=/var/log
#
#
#minute hour mday month wday who command
#
#
*/5 * * * * root /usr/libexec/atrun
Like most &os; configuration files, the # character represents a comment. A comment can be placed in the file as a reminder of what a desired action does and why it is performed. Comments cannot be on the same line as a command or else they will be interpreted as part of the command; they must be on a new line. Blank lines are ignored.
First, the environment must be defined. The equals (=) character is used to define any environment settings, as with this example where it is used for the SHELL, PATH, and HOME options. If the SHELL line is omitted, cron will use the default, which is sh. If the PATH variable is omitted, no default will be used and file locations will need to be absolute. If HOME is omitted, cron will use the invoking user's home directory.
The commented #minute line names a total of seven fields: minute, hour, mday, month, wday, who, and command. These are almost all self-explanatory. minute is the time in minutes at which the command will be run. hour is similar to the minute option, just in hours. mday stands for day of the month. month is similar to hour and minute, as it designates the month. The wday option stands for day of the week. All of these time fields must be numeric values, and hours follow the twenty-four hour clock. The who field is special, and only exists in the /etc/crontab file. This field specifies which user the command should be run as. When a user installs his or her crontab file, they will not have this option. Finally, the command option is listed. This is the last field, so naturally it should designate the command to be executed.
The last line of the file defines the values discussed above. Notice here
we have a */5 listing, followed by several more
* characters. These * characters
mean first-last, and can be interpreted as
every time. So, judging by this line,
it is apparent that the atrun command is to be invoked by
root every five minutes regardless of what
day or month it is. For more information on the atrun command,
see the &man.atrun.8; manual page.
Commands can have any number of flags passed to them; however, commands which extend to multiple lines need to be broken with the backslash \ continuation character.
This is the basic setup for every crontab file, although there is one thing different about this one. Field number six, where we specified the username, only exists in the system /etc/crontab file. This field should be omitted for individual user crontab files.
Installing a Crontab
You must not use the procedure described here to
edit/install the system crontab. Simply use your favorite
editor: the cron utility will notice that the file
has changed and immediately begin using the updated version.
If you use crontab to load the
/etc/crontab file you may get an error
like root: not found because of the
system crontab's additional user field.
To install a freshly written user crontab, first use your favorite editor to create a file in the proper format, and then use the crontab utility. The most common usage is:
&prompt.user; crontab crontab-file
In this example, crontab-file is the filename of a crontab that was previously created.
There is also an option to list installed crontab files: just pass the -l option to crontab and look over the output.
For users who wish to begin their own crontab file from scratch, without the use of a template, the crontab -e option is available. This will invoke the selected editor with an empty file. When the file is saved, it will be automatically installed by the crontab command.
If you later want to remove your user crontab completely, use crontab with the -r option.
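The difference between the system crontab and a user crontab described earlier, namely the extra who field, can be illustrated with a small conversion. This is only a sketch: the function name system_to_user_entry is made up, and it assumes a plain schedule line with whitespace-separated fields (runs of spaces in the command are collapsed to single spaces).

```shell
#!/bin/sh
# Illustrative only: turn a system crontab entry (five time fields,
# "who", command) into the user crontab form (five time fields, command).
system_to_user_entry() {
    # Keep fields 1-5 (the schedule), drop field 6 (the user name),
    # and keep everything from field 7 onward (the command).
    echo "$1" | awk '{
        out = $1
        for (i = 2; i <= 5; i++) out = out " " $i
        for (i = 7; i <= NF; i++) out = out " " $i
        print out
    }'
}

# The atrun entry from the system crontab shown earlier in this section:
system_to_user_entry "*/5 * * * * root /usr/libexec/atrun"
```

The call prints the schedule and command without the root field, which is the form a user crontab installed with the crontab utility would use.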
Using rc under &os;
Contributed by Tom Rhodes.
&os; uses the &netbsd; rc.d system for system initialization. Users should notice the files listed in the /etc/rc.d directory. Many of these files are for basic services which can be controlled with the start, stop, and restart options. For instance, &man.sshd.8; can be restarted with the following command:
&prompt.root; /etc/rc.d/sshd restart
This procedure is similar for other services. Of course, services are usually started automatically as specified in &man.rc.conf.5;. For example, enabling the Network Address Translation daemon at startup is as simple as adding the following line to /etc/rc.conf:
natd_enable="YES"
If a natd_enable="NO" line is already present, then simply change the NO to YES. The rc scripts will automatically load any other dependent services during the next reboot, as described below.
Since the rc.d system is primarily
intended to start/stop services at system startup/shutdown time,
the standard start, stop and restart options will only perform their action if the appropriate /etc/rc.conf variables are set. For instance the above sshd restart command will only work if sshd_enable is set to YES in /etc/rc.conf. To start, stop or restart a service regardless of the settings in /etc/rc.conf, the commands should be prefixed with force. For instance to restart sshd regardless of the current /etc/rc.conf setting, execute the following command:
&prompt.root; /etc/rc.d/sshd forcerestart
It is easy to check if a service is enabled in /etc/rc.conf by running the appropriate rc.d script with the option rcvar. Thus, an administrator can check that sshd is in fact enabled in /etc/rc.conf by running:
&prompt.root; /etc/rc.d/sshd rcvar
# sshd
$sshd_enable=YES
The second line (# sshd) is the output from the rc.d script, not a root prompt.
To determine if a service is running, a status option is available. For instance to verify that sshd is actually started:
&prompt.root; /etc/rc.d/sshd status
sshd is running as pid 433.
It is also possible to reload a service. This will attempt to send a signal to an individual service, forcing the service to reload its configuration files. In most cases this means sending the service a SIGHUP signal.
The rcNG structure is used both for network services and system initialization. Some services are run only at boot, and the rcNG system is what triggers them.
Many system services depend on other services to function
properly. For example, NIS and other RPC-based services may
fail to start until after the rpcbind
(portmapper) service has started. To resolve this issue,
information about dependencies and other meta-data is included
in the comments at the top of each startup script. The
&man.rcorder.8; program is then used to parse these comments
during system initialization to determine the order in which
system services should be invoked to satisfy the dependencies.
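To make the comment-based metadata more concrete, the sketch below writes a made-up startup script header to a temporary file and pulls the dependency words back out of it. The script contents, the service name foobar, and the helper rc_meta are all hypothetical; this only mimics the kind of comment lines &man.rcorder.8; reads and is not how rcorder itself is implemented.

```shell
#!/bin/sh
# Sketch: read rcorder-style dependency comments from a startup script.

rc_meta() {
    # Print whatever follows "# KEY:" on each matching comment line.
    key=$1
    file=$2
    sed -n "s/^#[[:space:]]*${key}:[[:space:]]*//p" "$file"
}

# A made-up script header, written to a temporary file for the demo.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/sh
#
# PROVIDE: foobar
# REQUIRE: NETWORKING rpcbind
# KEYWORD: shutdown

echo "foobar would start here"
EOF

rc_meta PROVIDE "$demo"    # what this script provides
rc_meta REQUIRE "$demo"    # what must be started before it
rm -f "$demo"
```

Because the metadata lives in ordinary comments, the shell ignores it when the script runs, and only the ordering tool pays attention to it.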
The following words may be included at the top of each startup file:
PROVIDE: Specifies the services this file provides.
REQUIRE: Lists services which are required for this service. This file will run after the specified services.
BEFORE: Lists services which depend on this service. This file will run before the specified services.
KEYWORD: When &man.rcorder.8; uses the -k option, then only the rc.d files matching this keyword are used. Previously this was used to define *BSD dependent features. For example, when using -k shutdown, only the rc.d scripts defining the shutdown keyword are used. With the -s option, &man.rcorder.8; will skip any rc.d script defining the corresponding keyword to skip. For example, scripts defining the nostart keyword are skipped at boot time.
By using this method, an administrator can easily control system services without the hassle of runlevels like some other &unix; operating systems.
Additional information about the &os; rc.d system can be found in the &man.rc.8;, &man.rc.conf.5;, and &man.rc.subr.8; manual pages.
Setting Up Network Interface Cards
Contributed by Marc Fonvieille.
Nowadays we cannot think
about a network connection. Adding and configuring a network
card is a common task for any &os; administrator.
Locating the Correct Driver
Before you begin, you should know the model of the card you have, the chip it uses, and whether it is a PCI or ISA card. &os; supports a wide variety of both PCI and ISA cards. Check the Hardware Compatibility List for your release to see if your card is supported.
Once you are sure your card is supported, you need to determine the proper driver for the card. The file /usr/src/sys/i386/conf/LINT will give you the list of network interface drivers with some information about the supported chipsets/cards. If you have doubts about which driver is the correct one, read the manual page of the driver. The manual page will give you more information about the supported hardware and even the possible problems that could occur.
If you own a common card, most of the time you will not have to look very hard for a driver. Drivers for common network cards are present in the GENERIC kernel, so your card should show up during boot, like so:
dc0: <82c169 PNIC 10/100BaseTX> port 0xa000-0xa0ff mem 0xd3800000-0xd38
000ff irq 15 at device 11.0 on pci0
dc0: Ethernet address: 00:a0:cc:da:da:da
miibus0: <MII bus> on dc0
ukphy0: <Generic IEEE 802.3u media interface> on miibus0
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
dc1: <82c169 PNIC 10/100BaseTX> port 0x9800-0x98ff mem 0xd3000000-0xd30
000ff irq 11 at device 12.0 on pci0
dc1: Ethernet address: 00:a0:cc:da:da:db
miibus1: <MII bus> on dc1
ukphy1: <Generic IEEE 802.3u media interface> on miibus1
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
In this example, we see that two cards using the &man.dc.4; driver are present on the system.
To use your network card, you will need to load the proper
driver. This may be accomplished in one of two ways. The
easiest way is to simply load a kernel module for your network
card with &man.kldload.8;. A module is not available for all
network card drivers (ISA cards and cards using the &man.ed.4;
driver, for example). Alternatively, you may statically compile
the support for your card into your kernel. Check
/usr/src/sys/i386/conf/LINT and the
manual page of the driver to know what to add in your kernel
configuration file. For more information about recompiling your
kernel, please see . If your card
was detected at boot by your kernel (GENERIC)
you do not have to build a new kernel.
Configuring the Network Card
Once the right driver is loaded for the network card, the card needs to be configured. As with many other things, the network card may have been configured at installation time.
To display the configuration for the network interfaces on your system, enter the following command:
&prompt.user; ifconfig
dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255
ether 00:a0:cc:da:da:da
media: Ethernet autoselect (100baseTX <full-duplex>)
status: active
dc1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
ether 00:a0:cc:da:da:db
media: Ethernet 10baseT/UTP
status: no carrier
lp0: flags=8810<POINTOPOINT,SIMPLEX,MULTICAST> mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
inet 127.0.0.1 netmask 0xff000000
tun0: flags=8010<POINTOPOINT,MULTICAST> mtu 1500
Note that entries concerning IPv6 (inet6 etc.) were omitted in this example.
In this example, the following devices were displayed:
dc0: The first Ethernet interface
dc1: The second Ethernet interface
lp0: The parallel port interface
lo0: The loopback device
tun0: The tunnel device used by ppp
&os; uses the driver name followed by the order in which the card is detected at kernel boot to name the network card, starting the count at zero. For example, sis2 would be the third network card on the system using the &man.sis.4; driver.
In this example, the dc0 device is up and running. The key indicators are:
UP means that the card is configured and ready.
The card has an Internet (inet) address (in this case 192.168.1.3).
It has a valid subnet mask (netmask; 0xffffff00 is the same as 255.255.255.0).
It has a valid broadcast address (in this case, 192.168.1.255).
The MAC address of the card (ether) is 00:a0:cc:da:da:da.
The physical media selection is on autoselection mode (media: Ethernet autoselect (100baseTX <full-duplex>)). We see that dc1 was configured to run with 10baseT/UTP media. For more information on available media types for a driver, please refer to its manual page.
The status of the link (status) is active, i.e. the carrier is detected. For dc1, we see status: no carrier. This is normal when an Ethernet cable is not plugged into the card.
If the &man.ifconfig.8; output had shown something similar to:
dc0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
ether 00:a0:cc:da:da:da
it would indicate the card has not been configured.
To configure your card, you need root privileges. The network card configuration can be done from the command line with &man.ifconfig.8; as root.
&prompt.root; ifconfig dc0 inet 192.168.1.3 netmask 255.255.255.0
Manually configuring the card has the disadvantage that you
would have to do it after each reboot of the system. The file
/etc/rc.conf is where to add the network
card's configuration.
Open /etc/rc.conf in your favorite editor. You need to add a line for each network card present on the system; in our case, we added these lines:
ifconfig_dc0="inet 192.168.1.3 netmask 255.255.255.0"
ifconfig_dc1="inet 10.0.0.1 netmask 255.255.255.0 media 10baseT/UTP"
You have to replace dc0,
dc1, and so on, with
the correct device for your cards, and the addresses with the
proper ones. You should read the card driver and
&man.ifconfig.8; manual pages for more details about the allowed
options and also &man.rc.conf.5; manual page for more
information on the syntax of
/etc/rc.conf.
If you configured the network during installation, some lines about the network card(s) may be already present. Double check /etc/rc.conf before adding any lines.
You will also have to edit the file /etc/hosts to add the names and the IP addresses of various machines of the LAN, if they are not already there. For more information please refer to &man.hosts.5; and to /usr/share/examples/etc/hosts.
Testing and Troubleshooting
Once you have made the necessary changes in /etc/rc.conf, you should reboot your system. This will allow the change(s) to the interface(s) to be applied, and verify that the system restarts without any configuration errors.
Once the system has been rebooted, you should test the network interfaces.
Testing the Ethernet Card
To verify that an Ethernet card is configured correctly, you have to try two things. First, ping the interface itself, and then ping another machine on the LAN.
First test the local interface:
&prompt.user; ping -c5 192.168.1.3
PING 192.168.1.3 (192.168.1.3): 56 data bytes
64 bytes from 192.168.1.3: icmp_seq=0 ttl=64 time=0.082 ms
64 bytes from 192.168.1.3: icmp_seq=1 ttl=64 time=0.074 ms
64 bytes from 192.168.1.3: icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from 192.168.1.3: icmp_seq=3 ttl=64 time=0.108 ms
64 bytes from 192.168.1.3: icmp_seq=4 ttl=64 time=0.076 ms
--- 192.168.1.3 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.074/0.083/0.108/0.013 ms
Now we have to ping another machine on the LAN:
&prompt.user; ping -c5 192.168.1.2
PING 192.168.1.2 (192.168.1.2): 56 data bytes
64 bytes from 192.168.1.2: icmp_seq=0 ttl=64 time=0.726 ms
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.766 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.700 ms
64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.747 ms
64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.704 ms
--- 192.168.1.2 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.700/0.729/0.766/0.025 ms
You could also use the machine name instead of 192.168.1.2 if you have set up the /etc/hosts file.
Troubleshooting
Troubleshooting hardware and software configurations is always
a pain, and a pain which can be alleviated by checking the simple
things first. Is your network cable plugged in? Have you properly
configured the network services? Did you configure the firewall
correctly? Is the card you are using supported by &os;? Always
check the hardware notes before sending off a bug report. Update
your version of &os; to the latest PREVIEW version. Check the
mailing list archives, or perhaps search the Internet.
If the card works, yet performance is poor, it would be worthwhile to read over the &man.tuning.7; manual page. You can also check the network configuration as incorrect network settings can cause slow connections.
Some users experience one or two device timeouts, which is normal for some cards. If they continue, or are bothersome, you may wish to be sure the device is not conflicting with another device. Double check the cable connections. Perhaps you may just need to get another card.
At times, users see a few watchdog timeout errors. The first thing to do here is to check your network cable. Many cards require a PCI slot which supports Bus Mastering. On some old motherboards, only one PCI slot allows it (usually slot 0). Check the network card and the motherboard documentation to determine if that may be the problem.
No route to host messages occur if the system is unable to route a packet to the destination host. This can happen if no default route is specified, or if a cable is unplugged. Check the output of netstat -rn and make sure there is a valid route to the host you are trying to reach. If there is not, read on to .
ping: sendto: Permission denied error messages are often caused by a misconfigured firewall. If ipfw is enabled in the kernel but no rules have been defined, then the default policy is to deny all traffic, even ping requests! Read on to for more information.
Sometimes performance of the card is poor, or below average. In these cases it is best to set the media selection mode from autoselect to the correct media selection. While this usually works for most hardware, it may not resolve this issue for everyone. Again, check all the network settings, and read over the &man.tuning.7; manual page.
Virtual Hosts
A very common use of &os; is virtual site hosting, where
one server appears to the network as many servers. This is
achieved by assigning multiple network addresses to a single
interface.
A given network interface has one real address, and may have any number of alias addresses. These aliases are normally added by placing alias entries in /etc/rc.conf.
An alias entry for the interface fxp0 looks like:
ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx"
Note that alias entries must start with alias0 and proceed upwards in order (for example, _alias1, _alias2, and so on). The configuration process will stop at the first missing number.
The calculation of alias netmasks is important, but
fortunately quite simple. For a given interface, there must be
one address which correctly represents the network's netmask.
Any other addresses which fall within this network must have a
netmask of all 1s (expressed as either
255.255.255.255 or
0xffffffff).
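Because the alias numbers must be consecutive and every additional address in an already-configured network takes an all-ones netmask, the entries lend themselves to being generated. The helper below is a sketch under those two rules; the name gen_aliases and its argument order are invented for illustration, and it only prints lines suitable for pasting into /etc/rc.conf rather than changing any configuration.

```shell
#!/bin/sh
# Sketch: print consecutive ifconfig_<if>_aliasN lines for additional
# addresses in a network whose real netmask is set on another address.
gen_aliases() {
    ifname=$1
    n=$2          # first alias number to use (0 if none exist yet)
    shift 2
    for addr in "$@"; do
        # Each extra address in the same network gets a netmask of all 1s.
        echo "ifconfig_${ifname}_alias${n}=\"inet ${addr} netmask 255.255.255.255\""
        n=$((n + 1))
    done
}

# Two extra addresses on fxp0, numbered from alias0 with no gaps:
gen_aliases fxp0 0 10.1.1.2 10.1.1.3
```

Passing the next free number as the second argument keeps the sequence unbroken when more aliases are appended later, which matters because configuration stops at the first missing number.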
For example, consider the case where the
fxp0 interface is
connected to two networks, the 10.1.1.0
network with a netmask of 255.255.255.0
and the 202.0.75.16 network with
a netmask of 255.255.255.240.
We want the system to appear at 10.1.1.1
through 10.1.1.5 and at
202.0.75.17 through
202.0.75.20. As noted above, only the
first address in a given network range (in this case, 10.1.1.1 and 202.0.75.17) should have a real netmask; all the rest (10.1.1.2 through 10.1.1.5 and 202.0.75.18 through 202.0.75.20) must be configured with a netmask of 255.255.255.255.
The following entries configure the adapter correctly for this arrangement:
ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0"
ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255"
ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255"
ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255"
ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255"
ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240"
ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255"
ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255"
ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255"
Configuration Files
/etc Layout
There are a number of directories in which configuration information is kept. These include:
/etc: Generic system configuration information; data here is system-specific.
/etc/defaults: Default versions of system configuration files.
/etc/mail: Extra &man.sendmail.8; configuration, other MTA configuration files.
/etc/ppp: Configuration for both user- and kernel-ppp programs.
/etc/namedb: Default location for &man.named.8; data. Normally named.conf and zone files are stored here.
/usr/pkg/etc: Configuration files for installed applications. May contain per-application subdirectories.
/usr/pkg/etc/rc.d: Start/stop scripts for installed applications.
/var/db: Automatically generated system-specific database files, such as the package database, the locate database, and so on.
Hostnames
/etc/resolv.conf
/etc/resolv.conf dictates how &os;'s resolver accesses the Internet Domain Name System (DNS).
The most common entries to resolv.conf are:
nameserver
The IP address of a name server the resolver should query. The servers are queried in the order listed, with a maximum of three.

search
Search list for hostname lookup. This is normally determined by the domain of the local hostname.

domain
The local domain name.

A typical resolv.conf:

search example.com
nameserver 147.11.1.11
nameserver 147.11.100.30

Only one of the search and
domain options should be used.

If you are using DHCP, &man.dhclient.8; usually rewrites
resolv.conf with information received from the
DHCP server.

/etc/hosts

/etc/hosts is a simple text
database reminiscent of the old Internet. It works in
conjunction with DNS and NIS providing name to IP address
mappings. Local computers connected via a LAN can be placed
in here for simplistic naming purposes instead of setting up
a &man.named.8; server. Additionally,
/etc/hosts can be used to provide a
local record of Internet names, reducing the need to query
externally for commonly accessed names.

#
#
# Host Database
# This file should contain the addresses and aliases
# for local hosts that share this file.
# In the presence of the domain name service or NIS, this file may
# not be consulted at all; see /etc/nsswitch.conf for the resolution order.
#
#
::1 localhost localhost.my.domain myname.my.domain
127.0.0.1 localhost localhost.my.domain myname.my.domain
#
# Imaginary network.
#10.0.0.2 myname.my.domain myname
#10.0.0.3 myfriend.my.domain myfriend
#
# According to RFC 1918, you can use the following IP networks for
# private nets which will never be connected to the Internet:
#
# 10.0.0.0 - 10.255.255.255
# 172.16.0.0 - 172.31.255.255
# 192.168.0.0 - 192.168.255.255
#
# In case you want to be able to connect to the Internet, you need
# real official assigned numbers. PLEASE PLEASE PLEASE do not try
# to invent your own network numbers but instead get one from your
# network provider (if any) or from the Internet Registry (ftp to
# rs.internic.net, directory `/templates').
#

/etc/hosts takes on the simple format of:

[Internet address] [official hostname] [alias1] [alias2] ...

For example:

10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2

Consult &man.hosts.5; for more information.

Log File Configuration

syslog.conf

syslog.conf is the configuration file
for the &man.syslogd.8; program. It indicates which types
of syslog messages are logged to particular
log files.

#
#
# Spaces ARE valid field separators in this file. However,
# other *nix-like systems still insist on using tabs as field
# separators. If you are sharing this file between systems, you
# may want to use only tabs as field separators here.
# Consult the syslog.conf(5) manual page.
*.err;kern.debug;auth.notice;mail.crit /dev/console
*.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages
security.* /var/log/security
mail.info /var/log/maillog
lpr.info /var/log/lpd-errs
cron.* /var/log/cron
*.err root
*.notice;news.err root
*.alert root
*.emerg *
# uncomment this to log all writes to /dev/console to /var/log/console.log
#console.info /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
#*.* /var/log/all.log
# uncomment this to enable logging to a remote log host named loghost
#*.* @loghost
# uncomment these if you're running inn
# news.crit /var/log/news/news.crit
# news.err /var/log/news/news.err
# news.notice /var/log/news/news.notice
!startslip
*.* /var/log/slip.log
!ppp
*.* /var/log/ppp.log

Consult the &man.syslog.conf.5; manual page for more
information.

newsyslog.conf

newsyslog.conf is the configuration
file for &man.newsyslog.8;, a program that is normally scheduled
to run by &man.cron.8;. &man.newsyslog.8; determines when log
files require archiving or rearranging.
logfile is moved to
logfile.0, logfile.0
is moved to logfile.1, and so on.
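The renaming scheme can be sketched in a few lines of shell. This is only an illustration of the numbering, not how &man.newsyslog.8; is implemented; rotate_log is a hypothetical function, and COUNT is assumed to be the number of archived copies to keep.

```sh
# rotate_log FILE COUNT
# Hypothetical helper: keeps FILE.0 .. FILE.(COUNT-1), newest first.
rotate_log() {
    file=$1
    n=$(($2 - 1))
    # Drop the oldest archive, then shift each remaining one up by one.
    rm -f "$file.$n"
    while [ "$n" -gt 0 ]; do
        prev=$(($n - 1))
        if [ -f "$file.$prev" ]; then
            mv "$file.$prev" "$file.$n"
        fi
        n=$prev
    done
    # Archive the live log as FILE.0 and start a fresh, empty log.
    if [ -f "$file" ]; then
        mv "$file" "$file.0"
    fi
    : > "$file"
}
```

After two calls on the same file, the most recent contents are in FILE.0 and the older contents in FILE.1.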
Alternatively, the log files may be archived in &man.gzip.1; format,
causing them to be named logfile.0.gz,
logfile.1.gz, and so on.

newsyslog.conf indicates which log
files are to be managed, how many are to be kept, and when
they are to be touched. Log files can be rearranged and/or
archived either when they have reached a certain size or at a
certain periodic time/date.

# configuration file for newsyslog
#
#
# filename [owner:group] mode count size when [ZB] [/pid_file] [sig_num]
/var/log/cron 600 3 100 * Z
/var/log/amd.log 644 7 100 * Z
/var/log/kerberos.log 644 7 100 * Z
/var/log/lpd-errs 644 7 100 * Z
/var/log/maillog 644 7 * @T00 Z
/var/log/sendmail.st 644 10 * 168 B
/var/log/messages 644 5 100 * Z
/var/log/all.log 600 7 * @T00 Z
/var/log/slip.log 600 3 100 * Z
/var/log/ppp.log 600 3 100 * Z
/var/log/security 600 10 100 * Z
/var/log/wtmp 644 3 * @01T05 B
/var/log/daily.log 640 7 * @T00 Z
/var/log/weekly.log 640 5 1 $W6D0 Z
/var/log/monthly.log 640 12 * $M1D0 Z
/var/log/console.log 640 5 100 * Z

Consult the &man.newsyslog.8; manual page for more
information.

sysctl.conf

sysctl.conf looks much like
rc.conf. Values are set in a
variable=value
form. The specified values are set after the system goes into
multi-user mode. Not all variables are settable in this mode.

A sample sysctl.conf turning off logging
of fatal signal exits and setting the operating system name and release
that the Linux compatibility layer reports to Linux programs:

kern.logsigexit=0 # Do not log fatal signal exits (e.g. sig 11)
compat.linux.osname=Linux
compat.linux.osrelease=2.4.2

Tuning with sysctl

&man.sysctl.8; is an interface that allows you to make changes
to a running &os; system. This includes many advanced
options of the TCP/IP stack and virtual memory system that can
dramatically improve performance for an experienced system
administrator. Over five hundred system variables can be read
and set using &man.sysctl.8;.

At its core, &man.sysctl.8; serves two functions: to read and
to modify system settings.

To view all readable variables:

&prompt.user; sysctl -a

To read a particular variable, for example,
kern.maxproc:

&prompt.user; sysctl kern.maxproc
kern.maxproc: 1044

To set a particular variable, use the intuitive
variable=value
syntax:

&prompt.root; sysctl kern.maxfiles=5000
kern.maxfiles: 2088 -> 5000

Settings of sysctl variables are usually either strings,
numbers, or booleans (a boolean being 1 for yes
or 0 for no).

To set some variables automatically each time
the machine boots, add them to the
/etc/sysctl.conf file. For more information
see the &man.sysctl.conf.5; manual page and the
.

&man.sysctl.8; Read-only

Contributed by Tom Rhodes.

In some cases it may be desirable to modify read-only &man.sysctl.8;
values. While this is not recommended, it is sometimes unavoidable.

For instance, on some laptop models the &man.cardbus.4; device will
not probe memory ranges, and will fail with errors which look similar to:

cbb0: Could not map register memory
device_probe_and_attach: cbb0 attach returned 12

Cases like the one above usually require the modification of some
default &man.sysctl.8; settings which are set read-only. To overcome
these situations a user can put &man.sysctl.8; OIDs
in their local /boot/loader.conf. Default
settings are located in the /boot/defaults/loader.conf
file.

Fixing the problem mentioned above would require a user to set
in the aforementioned
file. Now &man.cardbus.4; will work properly.

Tuning Disks

Sysctl Variables

vfs.write_behind

The vfs.write_behind sysctl variable
defaults to 1 (on). This tells the file system
to issue media writes as full clusters are collected, which
typically occurs when writing large sequential files. The idea
is to avoid saturating the buffer cache with dirty buffers when
it would not benefit I/O performance. However, this may stall
processes and under certain circumstances you may wish to turn it
off.

vfs.hirunningspace

The vfs.hirunningspace sysctl variable
determines how much outstanding write I/O may be queued to disk
controllers system-wide at any given time. The default is
usually sufficient but on machines with lots of disks you may
want to bump it up to four or five megabytes.
Note that setting too high a value (exceeding the buffer cache's
write threshold) can lead to extremely bad clustering
performance. Do not set this value arbitrarily high! Higher
write values may add latency to reads occurring at the same time.
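As a rough sketch, the line to place in /etc/sysctl.conf can be generated from a size in megabytes. The helper name hirunningspace_line is hypothetical, and the example assumes that vfs.hirunningspace is expressed in bytes; check the current value with sysctl vfs.hirunningspace before changing it.

```sh
# hirunningspace_line MEGABYTES
# Hypothetical helper: prints a sysctl.conf line for the given size.
# Assumption: the variable is measured in bytes.
hirunningspace_line() {
    echo "vfs.hirunningspace=$(($1 * 1024 * 1024))"
}
```

For four megabytes, hirunningspace_line 4 prints vfs.hirunningspace=4194304.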
There are various other buffer-cache and VM page cache
related sysctls. We do not recommend modifying these values.
The VM system does an extremely good job of
automatically tuning itself.

vm.swap_idle_enabled

The vm.swap_idle_enabled sysctl variable
is useful in large multi-user systems where you have lots of
users entering and leaving the system and lots of idle processes.
Such systems tend to generate a great deal of continuous pressure
on free memory reserves. Turning this feature on and tweaking
the swapout hysteresis (in idle seconds) via
vm.swap_idle_threshold1 and
vm.swap_idle_threshold2 allows you to depress
the priority of memory pages associated with idle processes more
quickly than the normal pageout algorithm. This gives a helping
hand to the pageout daemon. Do not turn this option on unless
you need it, because the tradeoff you are making is essentially
to pre-page memory sooner rather than later, which eats more swap
and disk bandwidth. In a small system this option will have a
determinable effect, but in a large system that is already doing
moderate paging this option allows the VM system to stage whole
processes into and out of memory easily.

hw.ata.wc

IDE drives lie about when a write completes. With IDE write
caching turned on, IDE hard drives not only write data
to disk out of order, but will sometimes delay writing some
blocks indefinitely when under heavy disk loads. A crash or
power failure may cause serious file system corruption. Turning
off write caching will remove the danger of this data loss, but
will also cause disk operations to proceed
very slowly. Change this only if prepared
to suffer with the disk slowdown.

Changing this variable must be done from the
boot loader at boot time. Attempting to do it after the
kernel boots will have no effect.

For more information, please see the &man.ata.4; manual page.

Soft Updates

The &man.tunefs.8; program can be used to fine-tune a
file system. This program has many different options, but for
now we are only concerned with toggling Soft Updates on and
off, which is done by:

&prompt.root; tunefs -n enable /filesystem
&prompt.root; tunefs -n disable /filesystem

A filesystem cannot be modified with &man.tunefs.8; while
it is mounted. A good time to enable Soft Updates is before any
partitions have been mounted, in single-user mode.

It is possible to enable Soft Updates
at filesystem creation time, through use of the -U
option to &man.newfs.8;.

Soft Updates drastically improves meta-data performance, mainly
file creation and deletion, through the use of a memory cache. We
recommend using Soft Updates on all of your file systems except
/. There
are two downsides to Soft Updates that you should be aware of. First,
Soft Updates guarantees filesystem consistency in the case of a crash
but could very easily be several seconds (even a minute!) behind
updating the physical disk. If your system crashes you may lose more
work than otherwise. Second, Soft Updates delays the freeing of
filesystem blocks. If you have a filesystem (such as the root
filesystem) which is almost full, performing a major update, such as
make installworld, can cause the filesystem to run
out of space and the update to fail.

More Details about Soft Updates

There are two traditional approaches to writing a file
system's meta-data back to disk. (Meta-data updates are
updates to non-content data like inodes or
directories.)

Historically, the default behavior was to write out
meta-data updates synchronously. If a directory had been
changed, the system waited until the change was actually
written to disk. The file data buffers (file contents) were
passed through the buffer cache and backed up
to disk later on asynchronously. The advantage of this
implementation is that it operates safely. If there is
a failure during an update, the meta-data are always in a
consistent state. A file is either created completely
or not at all. If the data blocks of a file did not find
their way out of the buffer cache onto the disk by the time
of the crash, &man.fsck.8; is able to recognize this and
repair the filesystem by setting the file length to
0. Additionally, the implementation is clear and simple.
The disadvantage is that meta-data changes are slow. An
rm -r, for instance, touches all the files
in a directory sequentially, but each directory
change (deletion of a file) will be written synchronously
to the disk. This includes updates to the directory itself,
to the inode table, and possibly to indirect blocks
allocated by the file. Similar considerations apply for
unrolling large hierarchies (tar -x).

The second case is asynchronous meta-data updates. This
is the default for Linux/ext2fs and
mount -o async for *BSD ufs. All
meta-data updates are simply being passed through the buffer
cache too, that is, they will be intermixed with the updates
of the file content data. The advantage of this
implementation is there is no need to wait until each
meta-data update has been written to disk, so all operations
which cause huge amounts of meta-data updates work much
faster than in the synchronous case. Also, the
implementation is still clear and simple, so there is a low
risk for bugs creeping into the code. The disadvantage is
that there is no guarantee at all for a consistent state of
the filesystem. If there is a failure during an operation
that updated large amounts of meta-data (like a power
failure, or someone pressing the reset button),
the filesystem
will be left in an unpredictable state. There is no opportunity
to examine the state of the filesystem when the system
comes up again; the data blocks of a file could already have
been written to the disk while the updates of the inode
table or the associated directory were not. It is actually
impossible to implement a fsck which is
able to clean up the resulting chaos (because the necessary
information is not available on the disk). If the
filesystem has been damaged beyond repair, the only choice
is to use &man.newfs.8; on it and restore it from backup.
The usual solution for this problem was to implement
dirty region logging, which is also
referred to as journaling, although that
term is not used consistently and is occasionally applied
to other forms of transaction logging as well. Meta-data
updates are still written synchronously, but only into a
small region of the disk. Later on they will be moved
to their proper location. Because the logging
area is a small, contiguous region on the disk, there
are no long distances for the disk heads to move, even
during heavy operations, so these operations are quicker
than synchronous updates.
Additionally the complexity of the implementation is fairly
limited, so the risk of bugs being present is low. A disadvantage
is that all meta-data are written twice (once into the
logging region and once to the proper location) so for
normal work, a performance pessimization
might result. On the other hand, in case of a crash, all
pending meta-data operations can be quickly either rolled-back
or completed from the logging area after the system comes
up again, resulting in a fast filesystem startup.

Kirk McKusick, the developer of Berkeley FFS,
solved this problem with Soft Updates: all pending
meta-data updates are kept in memory and written out to disk
in a sorted sequence (ordered meta-data
updates). This has the effect that, in case of
heavy meta-data operations, later updates to an item
catch the earlier ones if the earlier ones are still in
memory and have not already been written to disk. So all
operations on, say, a directory are generally performed in
memory before the update is written to disk (the data
blocks are sorted according to their position so
that they will not be on the disk ahead of their meta-data).
If the system crashes, this causes an implicit log
rewind: all operations which did not find their way
to the disk appear as if they had never happened. A
consistent filesystem state is maintained that appears to
be the one of 30 to 60 seconds earlier. The
algorithm used guarantees that all resources in use
are marked as such in their appropriate bitmaps: blocks and inodes.
After a crash, the only resource allocation error
that occurs is that resources are
marked as used which are actually free.
&man.fsck.8; recognizes this situation,
and frees the resources that are no longer used. It is safe to
ignore the dirty state of the filesystem after a crash by
forcibly mounting it with mount -f. In
order to free resources that may be unused, &man.fsck.8;
needs to be run at a later time.

The advantage is that meta-data operations are nearly as
fast as asynchronous updates (i.e. faster than with
logging, which has to write the
meta-data twice). The disadvantages are the complexity of
the code (implying a higher risk for bugs in an area that
is highly sensitive regarding loss of user data), and a
higher memory consumption. Additionally there are some
idiosyncrasies one has to get used to.
After a crash, the state of the filesystem appears to be
somewhat older. In situations where
the standard synchronous approach would have caused some
zero-length files to remain after the
fsck, these files do not exist at all
with a Soft Updates filesystem because neither the meta-data
nor the file contents have ever been written to disk.
Disk space is not released until the updates have been
written to disk, which may take place some time after
running rm. This may cause problems
when installing large amounts of data on a filesystem
that does not have enough free space to hold all the files
twice.

Tuning Kernel Limits

File/Process Limits

kern.maxfiles

kern.maxfiles can be raised or
lowered based upon your system requirements. This variable
indicates the maximum number of file descriptors on your
system. When the file descriptor table is full,
file: table is full will show up repeatedly
in the system message buffer, which can be viewed with the
dmesg command.

Each open file, socket, or fifo uses one file
descriptor. A large-scale production server may easily
require many thousands of file descriptors, depending on the
kind and number of services running concurrently.

The default value of kern.maxfiles is
dictated by the option in your
kernel configuration file. kern.maxfiles grows
proportionally to the value of . When
compiling a custom kernel, it is a good idea to set this kernel
configuration option according to the uses of your system. From
this number, the kernel is given most of its pre-defined limits.
Even though a production machine may not actually have 256 users
connected at once, the resources needed may be similar to a
high-scale web server.

Setting to
0 in your kernel configuration file will choose
a reasonable default value based on the amount of RAM present in
your system. It is set to 0 in the default GENERIC kernel.

kern.ipc.somaxconn

The kern.ipc.somaxconn sysctl variable
limits the size of the listen queue for accepting new TCP
connections. The default value of 128 is
typically too low for robust handling of new connections in a
heavily loaded web server environment. For such environments, it
is recommended to increase this value to 1024 or
higher. The service daemon may itself limit the listen queue size
(e.g. &man.sendmail.8;, or Apache) but
will often have a directive in its configuration file to adjust
the queue size. Large listen queues also do a better job of
avoiding Denial of Service (DoS) attacks.

Network Limits

The NMBCLUSTERS kernel configuration
option dictates the amount of network Mbufs available to the
system. A heavily-trafficked server with a low number of Mbufs
will hinder &os;'s ability to handle network traffic. Each cluster represents
approximately 2 KB of memory, so a value of 1024 represents 2
megabytes of kernel memory reserved for network buffers. A
simple calculation can be done to figure out how many are
needed. If you have a web server which maxes out at 1000
simultaneous connections, and each connection eats a 16 KB receive
and 16 KB send buffer, you need approximately 32 MB worth of
network buffers to cover the web server. A good rule of thumb is
to multiply by 2, so 2 x 32 MB / 2 KB =
64 MB / 2 KB = 32768. We recommend
values between 4096 and 32768 for machines with greater amounts
of memory. Under no circumstances should you specify an
arbitrarily high value for this parameter as it could lead to a
boot time crash. The option to
&man.netstat.1; may be used to observe network cluster
use. The kern.ipc.nmbclusters loader tunable should
be used to tune this at boot time.

For busy servers that make extensive use of the
&man.sendfile.2; system call, it may be necessary to increase
the number of &man.sendfile.2; buffers via the
NSFBUFS kernel configuration option or by
setting its value in /boot/loader.conf
(see &man.loader.8; for details). A common indicator that
this parameter needs to be adjusted is when processes are seen
in the sfbufa state. The sysctl
variable kern.ipc.nsfbufs is a read-only
glimpse at the kernel configured variable. This parameter
nominally scales with kern.maxusers,
however it may be necessary to tune accordingly.

Even though a socket has been marked as non-blocking,
calling &man.sendfile.2; on the non-blocking socket may
result in the &man.sendfile.2; call blocking until enough
struct sf_buf's are made
available.

net.inet.ip.portrange.*

The net.inet.ip.portrange.* sysctl
variables control the port number ranges automatically bound to TCP
and UDP sockets. There are three ranges: a low range, a default
range, and a high range. Most network programs use the default
range, which is controlled by
net.inet.ip.portrange.first and
net.inet.ip.portrange.last, which default to
1024 and 5000, respectively. Bound port ranges are used for
outgoing connections, and it is possible to run the system out of
ports under certain circumstances. This most commonly occurs
when you are running a heavily loaded web proxy. The port range
is not an issue when running servers which handle mainly incoming
connections, such as a normal web server, or which have a limited number
of outgoing connections, such as a mail relay. For situations
where you may run yourself out of ports, it is recommended to
increase net.inet.ip.portrange.last modestly.
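The number of ports available for outgoing connections is simply the size of the range. A minimal sketch, where port_count is a hypothetical helper and both bounds are assumed to be inclusive:

```sh
# port_count FIRST LAST
# Hypothetical helper: size of an inclusive port range.
port_count() {
    echo $(($2 - $1 + 1))
}
```

With the defaults quoted above, port_count 1024 5000 gives 3977; raising the upper bound to 20000 gives 18977.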
A value of 10000, 20000 or
30000 may be reasonable. You should also
consider firewall effects when changing the port range. Some
firewalls may block large ranges of ports (usually low-numbered
ports) and expect systems to use higher ranges of ports for
outgoing connections. For this reason it is not recommended that
net.inet.ip.portrange.first be lowered.

TCP Bandwidth Delay Product

The TCP Bandwidth Delay Product Limiting is similar to
TCP/Vegas in NetBSD.
It can be
enabled by setting the net.inet.tcp.inflight_enable
sysctl variable to 1. The system will attempt
to calculate the bandwidth delay product for each connection and
limit the amount of data queued to the network to just the amount
required to maintain optimum throughput.

This feature is useful if you are serving data over modems,
Gigabit Ethernet, or even high speed WAN links (or any other link
with a high bandwidth delay product), especially if you are also
using window scaling or have configured a large send window. If
you enable this option, you should also be sure to set
net.inet.tcp.inflight_debug to
0 (disable debugging), and for production use
setting net.inet.tcp.inflight_min to at least
6144 may be beneficial. However, note that
setting high minimums may effectively disable bandwidth limiting
depending on the link. The limiting feature reduces the amount of
data built up in intermediate router and switch packet queues as
well as the amount of data built up in the local host's
interface queue. With fewer packets queued up, interactive
connections, especially over slow modems, will also be able to
operate with lower Round Trip Times. However,
note that this feature only affects data transmission (uploading
/ server side). It has no effect on data reception (downloading).
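The bandwidth delay product is the link bandwidth multiplied by the round trip time. The following sketch computes it in bytes with integer shell arithmetic; bdp_bytes is a hypothetical helper, not a system command, and the link figures in the usage note are illustrative only.

```sh
# bdp_bytes BITS_PER_SECOND RTT_MILLISECONDS
# Hypothetical helper: bandwidth delay product in bytes.
bdp_bytes() {
    # bits/s * ms / 1000 gives bits in flight; / 8 converts to bytes.
    echo $(($1 * $2 / 8000))
}
```

For example, bdp_bytes 56000 200 (a 56 kbit/s link with a 200 ms round trip) yields 1400, and bdp_bytes 100000000 50 yields 625000.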
Adjusting net.inet.tcp.inflight_stab is
not recommended. This parameter defaults to
20, representing 2 maximal packets added to the bandwidth delay
product window calculation. The additional window is required to
stabilize the algorithm and improve responsiveness to changing
conditions, but it can also result in higher ping times over slow
links (though still much lower than you would get without the
inflight algorithm). In such cases, you may wish to try reducing
this parameter to 15, 10, or 5; and may also have to reduce
net.inet.tcp.inflight_min (for example, to
3500) to get the desired effect. Reducing these parameters
should be done as a last resort only.

Adding Swap Space

No matter how well you plan, sometimes a system does not run
as you expect. If you find you need more swap space, it is
simple enough to add. You have three ways to increase swap
space: adding a new hard drive, enabling swap over NFS, and
creating a swap file on an existing partition.

Swap on a New Hard Drive

The best way to add swap, of course, is to use this as an
excuse to add another hard drive. You can always use another
hard drive, after all. If you can do this, go reread the
discussion about swap space in
for some suggestions on how to best arrange your swap.

Swapping over NFS

Swapping over NFS is only recommended if you do not have a
local hard disk to swap to. Even though &os; has an excellent
NFS implementation, NFS swapping will be limited
by the available network bandwidth and puts an additional
burden on the NFS server.

Swapfiles

You can create a file of a specified size to use as a swap
file. In our example here we will use a 64 MB file called
/usr/swap0. You can use any name you
want, of course.

Creating a Swapfile

Be certain that your kernel configuration includes
the vnode driver. It is not in recent versions of
GENERIC.

pseudo-device vn 1 #Vnode driver (turns a file into a device)

Create a vn-device:

&prompt.root; cd /dev
&prompt.root; sh MAKEDEV vn0

Create a swapfile (/usr/swap0):

&prompt.root; dd if=/dev/zero of=/usr/swap0 bs=1024k count=64

Set proper permissions on /usr/swap0:

&prompt.root; chmod 0600 /usr/swap0

Enable the swap file in /etc/rc.conf:

swapfile="/usr/swap0" # Set to name of swapfile if aux swapfile desired.

Reboot the machine or, to enable the swap file immediately,
type:

&prompt.root; vnconfig -e /dev/vn0b /usr/swap0 swap

Power and Resource Management

Written by Hiten Pandya and Tom Rhodes.

It is very important to utilize hardware resources in an
efficient manner. Before ACPI was introduced,
it was very difficult and inflexible for operating systems to manage
the power usage and thermal properties of a system. The hardware was
controlled by some sort of BIOS embedded
interface, such as Plug and Play BIOS (PNPBIOS), or
Advanced Power Management (APM) and so on.
Power and Resource Management is one of the key components of a modern
operating system. For example, you may want an operating system to
monitor system limits (and possibly alert you) in case your system
temperature increases unexpectedly.

In this section, we will provide
comprehensive information about ACPI. References
will be provided for further reading at the end. Please be aware
that ACPI is available on &os; systems as a
default kernel module.

What Is ACPI?

Advanced Configuration and Power Interface
(ACPI) is a standard written by
an alliance of vendors to provide a standard interface for
hardware resources and power management (hence the name).
It is a key element in Operating System-directed
configuration and Power Management, i.e.: it provides
more control and flexibility to the operating system
(OS).
Before the introduction of ACPI, modern systems had
stretched the limits of the then-current Plug and Play interfaces
(such as APM). ACPI is the direct
successor to APM
(Advanced Power Management).

Shortcomings of Advanced Power Management (APM)

The Advanced Power Management (APM)
facility controls the power usage of a system based on its
activity. The APM BIOS is supplied by the (system) vendor and
it is specific to the hardware platform. An APM driver in the
OS mediates access to the APM Software Interface,
which allows management of power levels.

There are four major problems in APM. Firstly, power
management is done by the (vendor-specific) BIOS, and the OS
does not have any knowledge of it. One example of this is when
the user sets idle-time values for a hard drive in the APM BIOS;
when they are exceeded, the BIOS spins down the hard drive
without the consent of the OS. Secondly, the APM logic is
embedded in the BIOS, and it operates outside the scope of the
OS. This means users can only fix problems in their APM BIOS by
flashing a new one into the ROM, which is a very dangerous
procedure: if it fails, it could leave the system in an
unrecoverable state. Thirdly, APM is a vendor-specific
technology, which means that there is a lot of
duplication of effort, and bugs found in one vendor's BIOS
may not be solved in others. Last but not least, the APM
BIOS did not have enough room to implement a sophisticated power
policy, or one that can adapt very well to the purpose of the
machine.

Plug and Play BIOS (PNPBIOS) was
unreliable in many situations. PNPBIOS is 16-bit technology,
so the OS has to use 16-bit emulation in order to
interface with PNPBIOS methods.

The &os; APM driver is documented in
the &man.apm.4; manual page.

Configuring ACPI

The acpi.ko driver is loaded by default
at start up by the &man.loader.8; and should not
be compiled into the kernel. The reasoning behind this is that modules
are easier to work with, say if switching to another acpi.ko
without doing a kernel rebuild. This has the advantage of making testing easier.
Another reason is that starting ACPI after a system has been
brought up is not too useful, and in some cases can be fatal. If in doubt, just
disable ACPI altogether. This driver should not and cannot
be unloaded because the system bus uses it for various hardware interactions.
ACPI can be disabled with the &man.acpiconf.8; utility.
In fact, most of the interaction with ACPI can be done via
&man.acpiconf.8;. Basically this means that if anything about ACPI
is in the &man.dmesg.8; output, then most likely it is already running.

ACPI and APM cannot coexist and
should be used separately. The last one to load will terminate if the driver
notices the other running.

In the simplest form, ACPI can be used to put the
system into a sleep mode with &man.acpiconf.8;, the
flag, and a 1-5 option. Most users will only need
1. Option 5 will do a soft-off
which is the same action as:

&prompt.root; halt -p

The other options are available. Check out the &man.acpiconf.8;
manual page for more information.

Using and Debugging &os; ACPI

Written by Nate Lawson. With contributions from Peter Schultz and Tom Rhodes.

ACPI is a fundamentally new way of
discovering devices, managing power usage, and providing
standardized access to various hardware previously managed
by the BIOS. Progress is being made toward
ACPI working on all systems, but bugs in some
motherboards' ACPI Machine
Language (AML) bytecode,
incompleteness in &os;'s kernel subsystems, and bugs in the Intel
ACPI-CA interpreter continue to appear.

This document is intended to help you assist the &os;
ACPI maintainers in identifying the root cause
of problems you observe, and in debugging and developing a solution.
Thanks for reading this and we hope we can solve your system's
problems.

Submitting Debugging Information

Before submitting a problem, be sure you are running the latest
BIOS version and, if available, embedded
controller firmware version.

For those of you who want to submit a problem right away,
please send the following information to
&a.bugs.name;:

Description of the buggy behavior, including system type
and model and anything that causes the bug to appear. Also,
please note as accurately as possible when the bug began
occurring if it is new for you.

The dmesg output after boot, including any error messages
generated by you exercising the bug.

dmesg output from boot with ACPI
disabled, if disabling it helps fix the problem.

Output from sysctl hw.acpi. This is also
a good way of figuring out what features your system
offers.

URL where your ACPI Source Language (ASL)
can be found. Do not send the
ASL directly to the list as it can be
very large. Generate a copy of your ASL
by running this command:

&prompt.root; acpidump -t -d > name-system.asl

(Substitute your login name for
name and manufacturer/model for
system. Example:
njl-FooCo6000.asl)

Background

ACPI is present in all modern computers
that conform to the ia32 (x86), ia64 (Itanium), and amd64 (AMD)
architectures. The full standard has many features including
CPU performance management, power planes
control, thermal zones, various battery systems, embedded
controllers, and bus enumeration. Most systems implement less
than the full standard. For instance, a desktop system usually
only implements the bus enumeration parts while a laptop might
have cooling and battery management support as well. Laptops
also have suspend and resume, with their own associated
complexity.

An ACPI-compliant system has various
components. The BIOS and chipset vendors
provide various fixed tables (e.g., FADT)
in memory that specify things like the APIC
map (used for SMP), config registers, and
simple configuration values. Additionally, a table of bytecode
(the Differentiated System Description Table,
DSDT) is provided that specifies a
tree-like name space of devices and methods.

The ACPI driver must parse the fixed
tables, implement an interpreter for the bytecode, and modify
device drivers and the kernel to accept information from the
ACPI subsystem. For &os;, Intel has
provided an interpreter (ACPI-CA) that is
shared with Linux and &netbsd;.
The path to the
ACPI-CA source code is
src/sys/contrib/dev/acpica-unix-YYYYMMDD,
where YYYYMMDD is the release date of the ACPI-CA source code. The
glue code that allows ACPI-CA to work on
&os; is in src/sys/dev/acpica5/Osd. Finally,
drivers that implement various ACPI devices
are found in src/sys/dev/acpica5,
and architecture-dependent code resides in
/sys/arch/acpica5.

Common Problems

For ACPI to work correctly, all the parts
have to work correctly. Here are some common problems, in order
of frequency of appearance, and some possible workarounds or
fixes.

Suspend/Resume

ACPI has three suspend to
RAM (STR) states,
S1-S3, and one suspend
to disk state (STD), called
S4. S5 is
soft off and is the normal state your system
is in when plugged in but not powered up.
S4 can actually be implemented in two separate
ways. S4BIOS is a
BIOS-assisted suspend to disk.
S4OS is implemented
entirely by the operating system.

Start by checking sysctl hw.acpi
for the suspend-related items. Here
are the results for my Thinkpad:

hw.acpi.supported_sleep_state: S3 S4 S5
hw.acpi.s4bios: 0

This means that I can use acpiconf -s
to test S3,
S4OS, and
S5. If hw.acpi.s4bios were one
(1), I would have
S4BIOS
support instead of S4OS.

When testing suspend/resume, start with
S1, if supported. This state is most
likely to work since it doesn't require much driver support.
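
The two sysctl values shown earlier can be turned into a test order mechanically. The following is only a minimal sketch: the function name suspend_test_order is hypothetical, the values are passed in as arguments so the logic can be tried on any machine, and leaving S5 out of the list (because it is soft off rather than a suspend state) is a choice made here, not something the ACPI driver does.

```shell
#!/bin/sh
# Sketch: print the suspend states worth testing, shallowest first.
# $1 is the value of hw.acpi.supported_sleep_state, e.g. "S3 S4 S5".
# $2 is the value of hw.acpi.s4bios (0 or 1).
# suspend_test_order is a hypothetical helper name, not a system utility.
suspend_test_order() {
    states=$1
    s4bios=$2
    for want in S1 S2 S3 S4; do
        # Pad with spaces so "S3" only matches a whole word.
        case " $states " in
            *" $want "*)
                if [ "$want" = "S4" ]; then
                    # S4 is BIOS-assisted only when hw.acpi.s4bios is 1.
                    if [ "$s4bios" = "1" ]; then
                        echo "S4BIOS"
                    else
                        echo "S4OS"
                    fi
                else
                    echo "$want"
                fi
                ;;
        esac
    done
}

# The Thinkpad values quoted in the text.
suspend_test_order "S3 S4 S5" 0
```

With the Thinkpad values this prints S3 and then S4OS. On a live system the arguments could come from sysctl -n hw.acpi.supported_sleep_state and sysctl -n hw.acpi.s4bios.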
No one has implemented S2 but if you have
it, it's similar to S1. The next thing
to try is S3. This is the deepest
STR state and requires a lot of driver
support to properly reinitialize your hardware. If you have
problems resuming, feel free to email the &a.bugs.name; list but
do not expect the problem to be resolved since there are a lot
of drivers/hardware that need more testing and work.

To help isolate the problem, remove as many drivers from
your kernel as possible. If it works, you can narrow down
which driver is the problem by loading drivers until it fails
again. Typically binary drivers like
nvidia.ko, X11
display drivers, and USB will have the most
problems while Ethernet interfaces usually work fine. If you
can load/unload the drivers OK, you can automate this by
putting the appropriate commands in
/etc/rc.suspend and
/etc/rc.resume. There is a
commented-out example for unloading and loading a driver. Try
setting to zero (0) if
your display is messed up after resume. Try setting longer or
shorter values for to see
if that helps.

Another thing to try is to load a recent Linux distribution
with ACPI support and test its
suspend/resume support on the same hardware. If it works
on Linux, it's likely a &os; driver problem and narrowing down
which driver causes the problems will help us fix the problem.
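
Narrowing down the driver can be organized as a simple loop that loads one candidate module at a time and tests suspend after each. The sketch below is an illustration under stated assumptions: bisect_drivers, LOADER, TESTER, and LOGFILE are hypothetical names introduced here, the defaults assume kldload and acpiconf -s 3 are the right commands for your system, and the module names you pass in are up to you. Because a bad driver often hangs the machine on resume rather than returning an error, the loop records each module in a log file before testing it.

```shell
#!/bin/sh
# Sketch: load drivers one at a time and test suspend after each load.
# LOADER, TESTER and LOGFILE are hypothetical hooks so the loop can be
# dry-run; the defaults assume kldload and "acpiconf -s 3" apply.
: "${LOADER:=kldload}"
: "${TESTER:=acpiconf -s 3}"
: "${LOGFILE:=/var/tmp/suspend-bisect.log}"

bisect_drivers() {
    for module in "$@"; do
        # Log first and flush to disk: if resume hangs, the last line
        # of the log names the most recently loaded (suspect) module.
        echo "loading $module" >> "$LOGFILE"
        sync
        $LOADER "$module" || { echo "could not load $module" >&2; return 2; }
        # A non-zero exit from the test command also counts as a failure.
        if ! $TESTER; then
            echo "$module"
            return 1
        fi
    done
    return 0
}
```

A call such as bisect_drivers followed by the module names you removed from the kernel prints the first module after which the test command failed. If the machine hangs instead, check the last line of the log file after rebooting.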
Note that the ACPI maintainers do not
usually maintain other drivers (e.g., sound,
ATA, etc.) so any work done on tracking
down a driver problem should probably eventually be posted
to the &a.bugs.name; list and mailed to the driver
maintainer. If you are feeling adventurous, go ahead and
start putting some debugging &man.printf.3;s in a problematic
driver to track down where in its resume function it
hangs.

Finally, try disabling ACPI and
enabling APM instead. If suspend/resume
works with APM, you may be better off
sticking with APM, especially on older
hardware (pre-2000). It took vendors a while to get
ACPI support correct and older hardware is
more likely to have BIOS problems with
ACPI.

System Hangs (temporary or permanent)

Most system hangs are a result of lost interrupts or an
interrupt storm. Chipsets have a lot of problems based on how
the BIOS configures interrupts before boot,
correctness of the APIC
(MADT) table, and routing of the
System Control Interrupt
(SCI).

Interrupt storms can be distinguished from lost interrupts
by checking the output of vmstat -i
and looking at the line that has
acpi0. If the counter is increasing at more
than a couple per second, you have an interrupt storm. If the
system appears hung, try breaking to DDB
(CTRL+ALT+ESC on
console) and type .Your best hope when dealing with interrupt problems is to
try disabling APIC support with
hint.apic.0.disabled="1" in
loader.conf.

Panics

Panics are relatively rare for ACPI and
are the top priority to be fixed. The first step is to
isolate the steps to reproduce the panic (if possible)
and get a backtrace. Follow the advice for enabling
and setting up a serial console
(see )
or setting up a &man.dump.8; partition. You can get a
backtrace in DDB with
. If you have to handwrite the
backtrace, be sure to at least get the lowest five (5) and top
five (5) lines in the trace.

Then, try to isolate the problem by booting with
ACPI disabled. If that works, you can
isolate the ACPI subsystem by using various
values of . See the
&man.acpi.4; manual page for some examples.

System Powers Up After Suspend or Shutdown

First, try setting
0
in &man.loader.conf.5;. This keeps ACPI
from disabling various events during the shutdown process.
Some systems need this value set to 1 (the
default) for the same reason. This usually fixes
the problem of a system powering up spontaneously after a
suspend or poweroff.

Other Problems

If you have other problems with ACPI
(working with a docking station, devices not detected, etc.),
please email a description to the mailing list as well;
however, some of these issues may be related to unfinished
parts of the ACPI subsystem so they might
take a while to be implemented. Please be patient and
prepared to test patches we may send you.

ASL, acpidump, and IASL

The most common problem is the BIOS
vendors providing incorrect (or outright buggy!) bytecode. This
is usually manifested by kernel console messages like
this:

ACPI-1287: *** Error: Method execution failed [\\_SB_.PCI0.LPC0.FIGD._STA] \\
(Node 0xc3f6d160), AE_NOT_FOUND

Often, you can resolve these problems by updating your
BIOS to the latest revision. Most console
messages are harmless but if you have other problems like
battery status not working, they're a good place to start
looking for problems in the AML. The
bytecode, known as AML, is compiled from a
source language called ASL. The
AML is found in the table known as the
DSDT. To get a copy of your
ASL, use &man.acpidump.8;. You should use
both the -t (show contents of the fixed tables)
and -d (disassemble AML to
ASL) options. See the
Submitting Debugging
Information section for an example syntax.

The simplest first check you can do is to recompile your
ASL to check for errors. Warnings can
usually be ignored but errors are bugs that will usually prevent
ACPI from working correctly. To recompile
your ASL, issue the following command:

&prompt.root; iasl your.asl

Fixing Your ASL

In the long run, our goal is for almost everyone to have
ACPI work without any user intervention. At
this point, however, we are still developing workarounds for
common mistakes made by the BIOS vendors.
The Microsoft interpreter (acpi.sys and
acpiec.sys) does not strictly check for
adherence to the standard, and thus many BIOS
vendors who only test ACPI under Windows
never fix their ASL. We hope to continue to
identify and document exactly what non-standard behavior is
allowed by Microsoft's interpreter and replicate it so &os; can
work without forcing users to fix the ASL.
As a workaround and to help us identify behavior, you can fix
the ASL manually. If this works for you,
please send a &man.diff.1; of the old and new
ASL so we can possibly work around the buggy
behavior in ACPI-CA and thus make your fix
unnecessary.

Here is a list of common error messages, their cause, and
how to fix them:

_OS dependencies

Some AML assumes the world consists of
various Windows versions. You can tell &os; to claim it is
any OS to see if this fixes problems you
may have. An easy way to override this is to set
=Windows 2001
in /boot/loader.conf or other similar
strings you find in the ASL.

Missing Return statements

Some methods do not explicitly return a value as the
standard requires. While ACPI-CA
does not handle this, &os; has a workaround that allows it to
return the value implicitly. You can also add explicit
Return statements where required if you know what value should
be returned. To force iasl to compile the
ASL, use the
flag.

Overriding the Default AML

After you customize your.asl, you
will want to compile it. Run:

&prompt.root; iasl your.asl

You can add the flag to force creation
of the AML, even if there are errors during
compilation. Remember that some errors (e.g., missing Return
statements) are automatically worked around by the
interpreter.

DSDT.aml is the default output
filename for iasl. You can load this
instead of your BIOS's buggy copy (which
is still present in flash memory) by editing
/boot/loader.conf as
follows:

acpi_dsdt_load="YES"
acpi_dsdt_name="/boot/DSDT.aml"

Be sure to copy your DSDT.aml to the
/boot directory.

Getting Debugging Output From ACPI

The ACPI driver has a very flexible
debugging facility. It allows you to specify a set of subsystems
as well as the level of verbosity. The subsystems you wish to
debug are specified as layers and are broken down
into ACPI-CA components (ACPI_ALL_COMPONENTS)
and ACPI hardware support (ACPI_ALL_DRIVERS).
The verbosity of debugging output is specified as the
level and ranges from ACPI_LV_ERROR (just report
errors) to ACPI_LV_VERBOSE (everything). The
level is a bitmask so multiple options can be set
at once, separated by spaces. In practice, you will want to use
a serial console to log the output if it is so long
it flushes the console message buffer.

Debugging output is not enabled by default. To enable it,
add to your kernel config
if ACPI is compiled into the kernel. You can
add ACPI_DEBUG=1 to your
/etc/make.conf to enable it globally. If
it is a module, you can recompile just your
acpi.ko module as follows:

&prompt.root; cd /sys/dev/acpica5 && make clean && make ACPI_DEBUG=1

Install acpi.ko in
/boot/kernel and add your
desired level and layer to loader.conf.
This example enables debug messages for all
ACPI-CA components and all
ACPI hardware drivers
(CPU, LID, etc.). It will
only output error messages, the least verbose level.

debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
debug.acpi.level="ACPI_LV_ERROR"

If the information you want is triggered by a specific event
(say, a suspend and then resume), you can leave out changes to
loader.conf and instead use
sysctl to specify the layer and level after
booting and preparing your system for the specific event. The
sysctls are named the same as the tunables
in loader.conf.

References

More information about ACPI may be found
in the following locations:

The &a.freebsd.acpi; (This is FreeBSD-specific; posting
&os; questions here may not generate much of an answer.)
The ACPI Mailing List Archives (FreeBSD)
The old ACPI Mailing List Archives (FreeBSD)
The ACPI 2.0 Specification
&os; Manual pages:
&man.acpidump.8;,
&man.acpiconf.8;,
&man.acpidb.8;
DSDT debugging resource.
(Uses Compaq as an example but generally useful.)