.\" Copyright (c) 2001 Matthew Dillon.  Terms and conditions are those of
.\" the BSD Copyright as specified in the file "/usr/src/COPYRIGHT" in
.\" the source tree.
.\"
.Dd April 27, 2017
.Dt TUNING 7
.Os
.Sh NAME
.Nm tuning
.Nd performance tuning under DragonFly
.Sh SYSTEM SETUP
Modern
.Dx
systems typically have just three partitions on the main drive.
In order, a UFS
.Pa /boot ,
.Pa swap ,
and a HAMMER
.Pa root .
The installer used to create separate PFSs for half a dozen directories,
but now it just puts (almost) everything in the root.
It will separate stuff that doesn't need to be backed up into a /build
subdirectory and create null-mounts for things like /usr/obj, but it
no longer creates separate PFSs for these.
If desired, you can make /build its own mount to separate-out the
components of the filesystem which do not need to be persistent.
.Pp
Generally speaking the
.Pa /boot
partition should be 1GB in size. This is the minimum recommended
size, giving you room for backup kernels and alternative boot schemes.
.Dx
always installs debug-enabled kernels and modules and these can take
up quite a bit of disk space (but will not take up any extra ram).
.Pp
In the old days we recommended that swap be sized to at least 2x main
memory. These days swap is often used for other activities, including
.Xr tmpfs 5
and
.Xr swapcache 8 .
We recommend that swap be sized to the larger of 2x main memory or
1GB if you have a fairly small disk and 16GB or more if you have a
modestly endowed system.
If you have a modest SSD + large HDD combination, we recommend
a large dedicated swap partition on the SSD. For example, if
you have a 128GB SSD and 2TB or more of HDD storage, dedicating
upwards of 64GB of the SSD to swap and using
.Xr swapcache 8
and
.Xr tmpfs 5
will significantly improve your HDD's performance.
.Pp
In an all-SSD or mostly-SSD system,
.Xr swapcache 8
is not normally used but you may still want to have a large swap
partition to support
.Xr tmpfs 5
use.
Our synth/poudriere build machines run with a 200GB
swap partition and use tmpfs for all the builder jails.
Between 50 and 100GB is swapped out at the peak of the build.
As a result, actual
system storage bandwidth is minimized and performance increased.
.Pp
If you are on a minimally configured machine you may, of course,
configure far less swap or no swap at all but we recommend at least
some swap.
The kernel's VM paging algorithms are tuned to perform best when there is
swap space configured.
Configuring too little swap can lead to inefficiencies in the VM
page scanning code as well as create issues later on if you add
more memory to your machine, so don't be shy about it.
Swap is a good idea even if you don't think you will ever need it as it
allows the
machine to page out completely unused data and idle programs (like getty),
maximizing the ram available for your activities.
.Pp
If you intend to use the
.Xr swapcache 8
facility with a SSD + HDD combination we recommend configuring as much
swap space as you can on the SSD.
However, keep in mind that each 1GByte of swapcache requires around
1MByte of ram, so don't scale your swap beyond the equivalent ram
that you reasonably want to eat to support it.
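.Pp
As a rough worked example of that ratio, using the 64GB and 200GB swap
sizes mentioned earlier (the ram figures are simple multiplication, not
measurements):
.Bd -literal -offset indent
 64GB swap x ~1MB ram per 1GB  =  ~64MB ram
200GB swap x ~1MB ram per 1GB  = ~200MB ram
.Ed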
.Pp
Finally, on larger systems with multiple drives, if the use
of SSD swap is not in the cards or if it is and you need higher-than-normal
swapcache bandwidth, you can configure swap on up to four drives and
the kernel will interleave the storage.
The swap partitions on the drives should be approximately the same size.
The kernel can handle arbitrary sizes but
internal data structures scale to 4 times the largest swap partition.
Keeping
the swap partitions near the same size will allow the kernel to optimally
stripe swap space across the N disks.
Do not worry about overdoing it a
little; swap space is the saving grace of
.Ux
and even if you do not normally use much swap, it can give you more time to
recover from a runaway program before being forced to reboot.
However, keep in mind that any sort of swap space failure can lock the
system up.
Most machines are set up with only one or two swap partitions.
.Pp
Most
.Dx
systems have a single HAMMER root.
PFSs can be used to administratively separate domains for backup purposes
but tend to be a hassle otherwise so if you don't need the administrative
separation you don't really need to use multiple HAMMER PFSs.
All the PFSs share the same allocation layer so there is no longer a need
to size each individual mount.
Instead you should review the
.Xr hammer 8
manual page and use the 'hammer viconfig' facility to adjust snapshot
retention and other parameters.
By default
HAMMER keeps 60 days worth of snapshots.
Usually snapshots are not desired on PFSs such as
.Pa /usr/obj
or
.Pa /tmp
since data on these partitions cycles a lot.
.Pp
If a very large work area is desired it is often beneficial to
configure it as a separate HAMMER mount. If it is integrated into
the root mount it should at least be its own HAMMER PFS.
We recommend naming the large work area
.Pa /build .
Similarly if a machine is going to have a large number of users
you might want to separate your
.Pa /home
out as well.
.Pp
A number of run-time
.Xr mount 8
options exist that can help you tune the system.
The most obvious and most dangerous one is
.Cm async .
Do not ever use it; it is far too dangerous.
A less dangerous and more
useful
.Xr mount 8
option is called
.Cm noatime .
.Ux
filesystems normally update the last-accessed time of a file or
directory whenever it is accessed.
However, this creates a massive burden on copy-on-write filesystems like
HAMMER, particularly when scanning the filesystem.
.Dx
currently defaults to disabling atime updates on HAMMER mounts.
They can be re-enabled by setting the
.Va vfs.hammer.noatime
tunable to 0 in
.Xr loader.conf 5
but we recommend leaving them disabled.
The lack of atime updates can create issues with certain programs
such as when detecting whether unread mail is present, but
applications for the most part no longer depend on it.
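.Pp
If some program on your system really does depend on atime, a
.Xr loader.conf 5
entry along the following lines should turn updates back on.
This is only a sketch of the setting described above; the default
remains the recommended choice:
.Bd -literal -offset indent
# /boot/loader.conf
# 0 = allow atime updates on HAMMER mounts (not recommended)
vfs.hammer.noatime="0"
.Ed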
.Sh SSD SWAP
The single most important thing you can do is have at least one
solid-state drive in your system, and configure your swap space
on that drive.
If you are using a combination of a smaller SSD and a very large HDD,
you can use
.Xr swapcache 8
to automatically cache data from your HDD.
But even if you do not, having swap space configured on your SSD will
significantly improve performance under even modest paging loads.
It is particularly useful to configure a significant amount of swap
on a workstation (32GB or more is not uncommon) to handle bloated,
leaky applications such as browsers.
.Sh SYSCTL TUNING
.Xr sysctl 8
variables permit system behavior to be monitored and controlled at
run-time.
Some sysctls simply report on the behavior of the system; others allow
the system behavior to be modified;
some may be set at boot time using
.Xr rc.conf 5 ,
but most will be set via
.Xr sysctl.conf 5 .
There are several hundred sysctls in the system, including many that appear
to be candidates for tuning but actually are not.
In this document we will only cover the ones that have the greatest effect
on the system.
.Pp
The
.Va kern.ipc.shm_use_phys
sysctl defaults to 1 (on) and may be set to 0 (off) or 1 (on).
Setting
this parameter to 1 will cause all System V shared memory segments to be
mapped to unpageable physical RAM.
This feature only has an effect if you
are either (A) mapping small amounts of shared memory across many (hundreds)
of processes, or (B) mapping large amounts of shared memory across any
number of processes.
This feature allows the kernel to remove a great deal
of internal memory management page-tracking overhead at the cost of wiring
the shared memory into core, making it unswappable.
.Pp
The
.Va vfs.write_behind
sysctl defaults to 1 (on). This tells the filesystem to issue media
writes as full clusters are collected, which typically occurs when writing
large sequential files. The idea is to avoid saturating the buffer
cache with dirty buffers when it would not benefit I/O performance. However,
this may stall processes and under certain circumstances you may wish to turn
it off.
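.Pp
A minimal
.Xr sysctl.conf 5
sketch for turning write-behind off, assuming you have actually observed
processes stalling on large sequential writes:
.Bd -literal -offset indent
# /etc/sysctl.conf
# assumption: write-behind stalls were observed on this machine
vfs.write_behind=0
.Ed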
.Pp
The
.Va vfs.hirunningspace
sysctl determines how much outstanding write I/O may be queued to
disk controllers system wide at any given instant. The default is
usually sufficient but on machines with lots of disks you may want to bump
it up to four or five megabytes. Note that setting too high a value
(exceeding the buffer cache's write threshold) can lead to extremely
bad clustering performance. Do not set this value arbitrarily high! Also,
higher write queueing values may add latency to reads occurring at the same
time.
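.Pp
For illustration, four megabytes would be entered in
.Xr sysctl.conf 5
roughly as follows.
We are assuming here that the value is expressed in bytes, and the default
on your machine may already be larger, so check the current setting with
.Ql sysctl vfs.hirunningspace
before changing it:
.Bd -literal -offset indent
# /etc/sysctl.conf
# assumption: value is in bytes; 4 * 1024 * 1024 = 4194304
vfs.hirunningspace=4194304
.Ed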
.Pp
The
.Va vfs.bufcache_bw
sysctl controls data cycling within the buffer cache. I/O bandwidth less than
this specification (per second) will cycle into the much larger general
VM page cache while I/O bandwidth in excess of this specification will
be recycled within the buffer cache, reducing the load on the rest of
the VM system.
The default value is 200 megabytes (209715200), which means that the
system will try harder to cache data coming off a slower hard drive
and less hard to cache data coming off a fast SSD.
This parameter is particularly important if you have NVMe drives in
your system as these storage devices are capable of transferring
well over 2GBytes/sec into the system.
.Pp
There are various other buffer-cache and VM page cache related sysctls.
We do not recommend modifying their values.
.Pp
The
.Va net.inet.tcp.sendspace
and
.Va net.inet.tcp.recvspace
sysctls are of particular interest if you are running network intensive
applications.
They control the amount of send and receive buffer space
allowed for any given TCP connection.
However,
.Dx
now auto-tunes these parameters using a number of other related
sysctls (run 'sysctl net.inet.tcp' to get a list) and they usually
no longer need to be tuned manually.
We do not recommend
increasing or decreasing the defaults if you are managing a very large
number of connections.
Note that the routing table (see
.Xr route 8 )
can be used to introduce route-specific send and receive buffer size
defaults.
.Pp
As an additional management tool you can use pipes in your
firewall rules (see
.Xr ipfw 8 )
to limit the bandwidth going to or from particular IP blocks or ports.
For example, if you have a T1 you might want to limit your web traffic
to 70% of the T1's bandwidth in order to leave the remainder available
for mail and interactive use.
Normally a heavily loaded web server
will not introduce significant latencies into other services even if
the network link is maxed out, but enforcing a limit can smooth things
out and lead to longer term stability.
Many people also enforce artificial
bandwidth limitations in order to ensure that they are not charged for
using too much bandwidth.
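.Pp
A hedged sketch of the T1 example, using the traditional
.Xr ipfw 8
pipe syntax.
The pipe number, the port, and the 1080Kbit/s figure (roughly 70% of a
1.544Mbit/s T1) are illustrative assumptions; check
.Xr ipfw 8
and
.Xr dummynet 4
for the exact syntax supported by your release:
.Bd -literal -offset indent
# assumption: pipe 1 is unused and the web server listens on port 80
# limit outgoing web traffic to about 70% of a T1
ipfw pipe 1 config bw 1080Kbit/s
ipfw add pipe 1 tcp from any 80 to any out
.Ed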
.Pp
Setting the send or receive TCP buffer to values larger than 65535 will result
in only a marginal performance improvement unless both hosts support the window
scaling extension of the TCP protocol, which is controlled by the
.Va net.inet.tcp.rfc1323
sysctl.
These extensions should be enabled and the TCP buffer size should be set
to a value larger than 65536 in order to obtain good performance from
certain types of network links; specifically, gigabit WAN links and
high-latency satellite links.
RFC 1323 support is enabled by default.
.Pp
The
.Va net.inet.tcp.always_keepalive
sysctl determines whether or not the TCP implementation should attempt
to detect dead TCP connections by intermittently delivering
.Dq keepalives
on the connection.
By default, this is now enabled for all applications.
We do not recommend turning it off.
The extra network bandwidth is minimal and this feature will clean up
stalled and long-dead connections that might not otherwise be cleaned
up.
In the past people using dialup connections often did not want to
use this feature in order to be able to retain connections across
long disconnections, but these days the only default that makes
sense is for the feature to be turned on.
.Pp
The
.Va net.inet.tcp.delayed_ack
TCP feature is largely misunderstood. Historically speaking this feature
was designed to allow the acknowledgement to transmitted data to be returned
along with the response. For example, when you type over a remote shell
the acknowledgement to the character you send can be returned along with the
data representing the echo of the character. With delayed acks turned off
the acknowledgement may be sent in its own packet before the remote service
has a chance to echo the data it just received. This same concept also
applies to any interactive protocol (e.g. SMTP, WWW, POP3) and can cut the
number of tiny packets flowing across the network in half. The
.Dx
delayed-ack implementation also follows the TCP protocol rule that
at least every other packet be acknowledged even if the standard 100ms
timeout has not yet passed. Normally the worst a delayed ack can do is
slightly delay the teardown of a connection, or slightly delay the ramp-up
of a slow-start TCP connection. While we aren't sure, we believe that
the several FAQs related to packages such as SAMBA and SQUID which advise
turning off delayed acks may be referring to the slow-start issue.
.Pp
The
.Va net.inet.tcp.inflight_enable
sysctl turns on bandwidth delay product limiting for all TCP connections.
This feature is now turned on by default and we recommend that it be
left on.
It will slightly reduce the maximum bandwidth of a connection but the
benefits of the feature in reducing packet backlogs at router constriction
points are enormous.
These benefits make it a whole lot easier for router algorithms to manage
QOS for multiple connections.
The limiting feature reduces the amount of data built up in intermediate
router and switch packet queues as well as the amount of data built
up in the local host's interface queue. With fewer packets queued up,
interactive connections, especially over slow modems, will also be able
to operate with lower round trip times. However, note that this feature
only affects data transmission (uploading / server-side). It does not
affect data reception (downloading).
.Pp
The system will attempt to calculate the bandwidth delay product for each
connection and limit the amount of data queued to the network to just the
amount required to maintain optimum throughput. This feature is useful
if you are serving data over modems, GigE, or high speed WAN links (or
any other link with a high bandwidth*delay product), especially if you are
also using window scaling or have configured a large send window.
.Pp
For production use setting
.Va net.inet.tcp.inflight_min
to at least 6144 may be beneficial. Note, however, that setting high
minimums may effectively disable bandwidth limiting depending on the link.
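.Pp
In
.Xr sysctl.conf 5
form, the minimum suggested above would look something like this
(6144 is simply the figure from the text, not a value tuned for any
particular link):
.Bd -literal -offset indent
# /etc/sysctl.conf
# assumption: 6144 taken from the text above, untuned
net.inet.tcp.inflight_min=6144
.Ed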
.Pp
Adjusting
.Va net.inet.tcp.inflight_stab
is not recommended.
This parameter defaults to 50, representing +5% fudge when calculating the
bwnd from the bw. This fudge is on top of an additional fixed +2*maxseg
added to bwnd. The fudge factor is required to stabilize the algorithm
at very high speeds while the fixed 2*maxseg stabilizes the algorithm at
low speeds. If you increase this value, excessive packet buffering may occur.
.Pp
The
.Va net.inet.ip.portrange.*
sysctls control the port number ranges automatically bound to TCP and UDP
sockets. There are three ranges: a low range, a default range, and a
high range, selectable via an IP_PORTRANGE
.Fn setsockopt
call.
Most network programs use the default range which is controlled by
.Va net.inet.ip.portrange.first
and
.Va net.inet.ip.portrange.last ,
which default to 1024 and 5000 respectively. Bound port ranges are
used for outgoing connections and it is possible to run the system out
of ports under certain circumstances. This most commonly occurs when you are
running a heavily loaded web proxy. The port range is not an issue
when running servers which handle mainly incoming connections such as a
normal web server, or have a limited number of outgoing connections such
as a mail relay. For situations where you may run yourself out of
ports we recommend increasing
.Va net.inet.ip.portrange.last
modestly. A value of 10000 or 20000 or 30000 may be reasonable. You should
also consider firewall effects when changing the port range. Some firewalls
may block large ranges of ports (usually low-numbered ports) and expect systems
to use higher ranges of ports for outgoing connections. For this reason
we do not recommend that
.Va net.inet.ip.portrange.first
be lowered.
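.Pp
For a busy proxy, a modest increase might be expressed as follows;
20000 is one of the example values above and is an illustration only:
.Bd -literal -offset indent
# /etc/sysctl.conf
# widen the default outgoing port range from 1024-5000 to 1024-20000
net.inet.ip.portrange.last=20000
.Ed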
.Pp
The
.Va kern.ipc.somaxconn
sysctl limits the size of the listen queue for accepting new TCP connections.
The default value of 128 is typically too low for robust handling of new
connections in a heavily loaded web server environment.
For such environments,
we recommend increasing this value to 1024 or higher.
The service daemon
may itself limit the listen queue size (e.g.\&
.Xr sendmail 8 ,
apache) but will
often have a directive in its configuration file to adjust the queue size up.
Larger listen queues also do a better job of fending off denial of service
attacks.
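.Pp
A sketch of the corresponding
.Xr sysctl.conf 5
entry for a heavily loaded web server, using the 1024 figure above.
Remember that the daemon's own listen backlog directive, if it has one,
may also need to be raised:
.Bd -literal -offset indent
# /etc/sysctl.conf
# assumption: heavily loaded web server; 1024 is the figure from the text
kern.ipc.somaxconn=1024
.Ed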
.Pp
The
.Va kern.maxvnodes
sysctl specifies how many vnodes and related file structures the kernel will
cache.
The kernel uses a very generous default for this parameter based on
available physical memory.
You generally do not want to mess with this parameter as it directly
affects how well the kernel can cache not only file structures but also
the underlying file data.
But you can lower it if kernel memory use is higher than you would like.
.Pp
The
.Va kern.maxfiles
sysctl determines how many open files the system supports.
The default is
typically based on available physical memory but you may need to bump
it up if you are running databases or large descriptor-heavy daemons.
The read-only
.Va kern.openfiles
sysctl may be interrogated to determine the current number of open files
on the system.
.Pp
The
.Va vm.swap_idle_enabled
sysctl is useful in large multi-user systems where you have lots of users
entering and leaving the system and lots of idle processes.
Such systems
tend to generate a great deal of continuous pressure on free memory reserves.
Turning this feature on and adjusting the swapout hysteresis (in idle
seconds) via
.Va vm.swap_idle_threshold1
and
.Va vm.swap_idle_threshold2
allows you to depress the priority of pages associated with idle processes
more quickly than the normal pageout algorithm.
This gives a helping hand
to the pageout daemon.
Do not turn this option on unless you need it,
because the tradeoff you are making is to essentially pre-page memory sooner
rather than later, eating more swap and disk bandwidth.
In a small system
this option will have a detrimental effect but in a large system that is
already doing moderate paging this option allows the VM system to stage
whole processes into and out of memory more easily.
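.Pp
On a large multi-user machine that is already paging moderately, enabling
the feature might look like this.
The thresholds are deliberately left at their defaults here; inspect them
with
.Ql sysctl vm.swap_idle_threshold1 vm.swap_idle_threshold2
before adjusting them:
.Bd -literal -offset indent
# /etc/sysctl.conf
# assumption: a large, idle-heavy multi-user system
vm.swap_idle_enabled=1
.Ed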
.Sh LOADER TUNABLES
Some aspects of the system behavior may not be tunable at runtime because
memory allocations they perform must occur early in the boot process.
To change loader tunables, you must set their values in
.Xr loader.conf 5
and reboot the system.
.Pp
.Va kern.maxusers
controls the scaling of a number of static system tables, including defaults
for the maximum number of open files, sizing of network memory resources, etc.
On
.Dx ,
.Va kern.maxusers
is automatically sized at boot based on the amount of memory available in
the system, and may be determined at run-time by inspecting the value of the
read-only
.Va kern.maxusers
sysctl.
Some sites will require larger or smaller values of
.Va kern.maxusers
and may set it as a loader tunable; values of 64, 128, and 256 are not
uncommon.
We do not recommend going above 256 unless you need a huge number
of file descriptors; many of the tunable values set to their defaults by
.Va kern.maxusers
may be individually overridden at boot-time or run-time as described
elsewhere in this document.
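.Pp
As an illustration, a site that needs a larger value could place an entry
like the following in
.Xr loader.conf 5 ;
256 is the upper end of the range discussed above, not a recommendation
for any particular machine:
.Bd -literal -offset indent
# /boot/loader.conf
# assumption: site needs more than the auto-sized value
kern.maxusers="256"
.Ed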
.Pp
.Va kern.nbuf
sets how many filesystem buffers the kernel should cache.
Filesystem buffers can be up to 128KB each. UFS typically uses an 8KB
blocksize while HAMMER typically uses 64KB.
The defaults usually suffice.
The cached buffers represent wired physical memory so specifying a value
that is too large can result in excessive kernel memory use, and is also
not entirely necessary since the pages backing the buffers are also
cached by the VM page cache (which does not use wired memory).
The buffer cache significantly improves the hot path for cached file
accesses.
.Pp
The
.Va kern.dfldsiz
and
.Va kern.dflssiz
tunables set the default soft limits for process data and stack size
respectively.
Processes may increase these up to the hard limits by calling
.Xr setrlimit 2 .
The
.Va kern.maxdsiz ,
.Va kern.maxssiz ,
and
.Va kern.maxtsiz
tunables set the hard limits for process data, stack, and text size
respectively; processes may not exceed these limits.
The
.Va kern.sgrowsiz
tunable controls how much the stack segment will grow when a process
needs to allocate more stack.
.Pp
.Va kern.ipc.nmbclusters
and
.Va kern.ipc.nmbjclusters
may be adjusted to increase the number of network mbufs the system is
willing to allocate.
Each normal cluster represents approximately 2K of memory,
so a value of 1024 represents 2M of kernel memory reserved for network
buffers.
Each 'j' cluster is typically 4KB, so a value of 1024 represents 4M of
kernel memory.
You can do a simple calculation to figure out how many you need but
keep in mind that tcp buffer sizing is now more dynamic than it used to
be.
.Pp
The defaults usually suffice but you may want to bump them up on service-heavy
machines.
Modern machines often need a large number of mbufs to operate services
efficiently; values of 65536, and even upwards of 262144 or more, are common.
If you are running a server, it is better to be generous than to be frugal.
Remember the memory calculation though.
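.Pp
The memory calculation works out roughly as follows, using the
approximate 2KB and 4KB per-cluster figures given above:
.Bd -literal -offset indent
 65536 clusters  x ~2KB = ~128MB
262144 clusters  x ~2KB = ~512MB
  1024 jclusters x ~4KB =   ~4MB
.Ed
.Pp
A corresponding
.Xr loader.conf 5
entry might look like this; the value is one of the examples above and
should be chosen to fit the kernel memory you can spare:
.Bd -literal -offset indent
# /boot/loader.conf
# assumption: server with ~128MB of kernel memory to spare for mbufs
kern.ipc.nmbclusters="65536"
.Ed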
.Pp
Under no circumstances
should you specify an arbitrarily high value for these parameters, as it could
lead to a boot-time crash.
The
.Fl m
option to
.Xr netstat 1
may be used to observe network cluster use.
.Sh KERNEL CONFIG TUNING
There are a number of kernel options that you may have to fiddle with in
a large-scale system.
In order to change these options you need to be
able to compile a new kernel from source.
The
.Xr config 8
manual page and the handbook are good starting points for learning how to
do this.
Generally the first thing you do when creating your own custom
kernel is to strip out all the drivers and services you do not use.
Removing things like
.Dv INET6
and drivers you do not have will reduce the size of your kernel, sometimes
by a megabyte or more, leaving more memory available for applications.
.Pp
If your motherboard is AHCI-capable then we strongly recommend turning
on AHCI mode in the BIOS if it is not the default.
.Sh CPU, MEMORY, DISK, NETWORK
The type of tuning you do depends heavily on where your system begins to
bottleneck as load increases.
If your system runs out of CPU (idle times
are perpetually 0%) then you need to consider upgrading the CPU or moving to
an SMP motherboard (multiple CPUs), or perhaps you need to revisit the
programs that are causing the load and try to optimize them.
If your system
is paging to swap a lot you need to consider adding more memory.
If your
system is saturating the disk you typically see high CPU idle times and
total disk saturation.
.Xr systat 1
can be used to monitor this.
There are many solutions to saturated disks:
increasing memory for caching, mirroring disks, distributing operations across
several machines, and so forth.
.Pp
Finally, you might run out of network suds.
Optimize the network path
as much as possible.
If you are operating a machine as a router you may need to
set up a
.Xr pf 4
firewall (also see
.Xr firewall 7 ) .
.Dx
has a very good fair-share queueing algorithm for QOS in
.Xr pf 4 .
.Sh SOURCE OF KERNEL MEMORY USAGE
The primary sources of kernel memory usage are:
.Bl -tag -width ".Va kern.maxvnodes"
.It Va kern.maxvnodes
The maximum number of cached vnodes in the system.
These can eat quite a bit of kernel memory, primarily due to auxiliary
structures tracked by the HAMMER filesystem.
It is relatively easy to configure a smaller value, but we do not
recommend reducing this parameter below 100000.
Smaller values directly impact the number of discrete files the
kernel can cache data for at once.
.It Va kern.ipc.nmbclusters , Va kern.ipc.nmbjclusters
Calculate approximately 2KB per normal cluster and 4KB per jumbo
cluster.
Do not make these values too low or you risk deadlocking the network
stack.
.It Va kern.nbuf
The number of filesystem buffers managed by the kernel.
The kernel wires the underlying cached VM pages, typically 8KB (UFS) or
64KB (HAMMER) per buffer.
.It swap/swapcache
Swap memory requires approximately 1MB of physical ram for each 1GB
of swap space.
When swapcache is used, additional memory may be required to keep
VM objects around longer (only really reducible by reducing the
value of
.Va kern.maxvnodes
which you can do post-boot if you desire).
.It tmpfs
Tmpfs is very useful but keep in mind that while the file data itself
is backed by swap, the meta-data (the directory topology) requires
wired kernel memory.
.It mmu page tables
Even though the underlying data pages themselves can be paged to swap,
the page tables are usually wired into memory.
This can create problems when a large number of processes are mmap()ing
very large files.
Sometimes turning on
.Va machdep.pmap_mmu_optimize
suffices to reduce overhead.
Page table kernel memory use can be observed by using 'vmstat -z'.
.It Va kern.ipc.shm_use_phys
It is sometimes necessary to force shared memory to use physical memory
when running a large database which uses shared memory to implement its
own data caching.
The use of sysv shared memory in this regard allows the database to
distinguish between data which it knows it can access instantly (i.e.\&
without even having to page-in from swap) versus data which might require
an I/O to fetch.
.Pp
If you use this feature be very careful with regards to the database's
shared memory configuration as you will be wiring the memory.
.El
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr systat 1 ,
.Xr dm 4 ,
.Xr dummynet 4 ,
.Xr nata 4 ,
.Xr pf 4 ,
.Xr login.conf 5 ,
.Xr pf.conf 5 ,
.Xr rc.conf 5 ,
.Xr sysctl.conf 5 ,
.Xr firewall 7 ,
.Xr hier 7 ,
.Xr boot 8 ,
.Xr ccdconfig 8 ,
.Xr config 8 ,
.Xr disklabel 8 ,
.Xr fsck 8 ,
.Xr ifconfig 8 ,
.Xr ipfw 8 ,
.Xr loader 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr route 8 ,
.Xr sysctl 8 ,
.Xr tunefs 8
.Sh HISTORY
The
.Nm
manual page was inherited from
.Fx
and first appeared in
.Fx 4.3 ,
May 2001.
.Sh AUTHORS
The
.Nm
manual page was originally written by
.An Matthew Dillon .