.\" Copyright (c) 2015-2019 The DragonFly Project.  All rights reserved.
.\"
.\" This code is derived from software contributed to The DragonFly Project
.\" by Matthew Dillon <dillon@backplane.com>
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in
.\"    the documentation and/or other materials provided with the
.\"    distribution.
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\"    contributors may be used to endorse or promote products derived
.\"    from this software without specific, prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 8, 2020
.Dt HAMMER2 8
.Os
.Sh NAME
.Nm hammer2
.Nd hammer2 file system utility
.Sh SYNOPSIS
.Nm
.Fl h
.Nm
.Op Fl s Ar path
.Op Fl t Ar type
.Op Fl u Ar uuid
.Op Fl m Ar mem
.Ar command
.Op Ar argument ...
.Sh DESCRIPTION
The
.Nm
utility provides miscellaneous support functions for a
HAMMER2 file system.
.Pp
The options are as follows:
.Bl -tag -width indent
.It Fl s Ar path
Specify the path to a mounted HAMMER2 filesystem.
At least one PFS on a HAMMER2 filesystem must be mounted for the system
to act on all PFSs managed by it.
Every HAMMER2 filesystem typically has a PFS called "LOCAL" for this purpose.
.It Fl t Ar type
Specify the type when creating, upgrading, or downgrading a PFS.
Supported types are MASTER, SLAVE, SOFT_MASTER, SOFT_SLAVE, CACHE, and DUMMY.
If not specified, the pfs-create directive will default to MASTER if no
UUID is specified, and SLAVE if a UUID is specified.
.It Fl u Ar uuid
Specify the cluster UUID when creating a PFS.
If not specified, a unique, random UUID will be generated.
Note that every PFS also has a unique pfs_id which is always generated
and cannot be overridden with an option.
The { pfs_clid, pfs_fsid } tuple uniquely identifies a component of a cluster.
.It Fl m Ar mem
Specify how much tracking memory to use for certain directives.
At the moment, this option is only applicable to the
.Cm bulkfree
directive, allowing it to operate in fewer passes when given more memory.
A nominal value for a 4TB drive holding a large amount of data would be
around a gigabyte (-m 1g).
.El
.Pp
.Nm
directives are as shown below.
Note that most directives require you to either be CD'd into a hammer2
filesystem, specify a path to a mounted hammer2 filesystem via the
.Fl s
option, or specify a path after the directive.
It depends on the directive.
All hammer2 filesystems have a PFS called "LOCAL" which is typically mounted
locally on the host in order to be able to issue commands for other PFSs
on the filesystem.
The mount also enables PFS configuration scanning for that filesystem.
.Bl -tag -width indent
.\" ==== cleanup ====
.It Cm cleanup Op path
Perform manual cleanup passes on paths or all mounted partitions.
.\" ==== connect ====
.It Cm connect Ar target
Add a cluster link entry to the volume header.
The volume header can support up to 255 link entries.
This feature is not currently used.
.\" ==== destroy ====
.It Cm destroy Ar path...
Destroy the specified directory entry in a hammer2 filesystem.
This bypasses
all normal checks and will unconditionally destroy the directory entry.
The underlying inode is not checked and, if it does exist, its nlinks count
is not decremented.
This directive should only be used to destroy a corrupted directory entry
which no longer has a working inode.
.Pp
Note that this command may desynchronize the system namecache for the
specified entry.
If this happens, you may have to unmount and remount the filesystem.
.\" ==== destroy-inum ====
.It Cm destroy-inum Ar path...
Destroy the specified inode in a hammer2 filesystem.
.\" ==== disconnect ====
.It Cm disconnect Ar target
Delete a cluster link entry from the volume header.
This feature is not currently used.
.\" ==== emergency-mode-enable ===
.It Cm emergency-mode-enable Ar target
Flag emergency operations mode in the filesystem.
This mode may be used
as a last resort to delete files and directories from a full filesystem.
Inode creation, file writes, and certain meta-data cleanups are disallowed
while emergency mode is active.
File and directory removal and mode/attr setting are still allowed.
This mode is extremely dangerous and should only be used as a last resort.
.Pp
This mode allows the filesystem to modify blocks in-place when it is unable
to allocate a copy.
Thus it is possible to chflags and remove files and
directories even when the filesystem is completely full.
However, there is a price.
This mode of operation WILL LIKELY CORRUPT ANY SNAPSHOTS related
to this filesystem.
The filesystem will report this condition if it encounters
it, but if you are forced to use this mode to fix a filesystem full condition
your snapshots may be damaged.
It is usually safest to delete any related snapshots when using this mode.
.Pp
You can detect whether related snapshots have been corrupted by running
a bulkfree pass and checking the console output for reported CRC errors.
If no errors are reported, your snapshots are fine.
If errors are reported
you should delete related snapshots until bulkfree reports no further errors.
.Pp
The emergency mode will also make meta-data updates unsafe due to the lack of
copy-on-write, causing potential harm if the system unexpectedly panics or
loses power.
GREAT CARE MUST BE TAKEN WHILE THIS MODE IS ACTIVE.
.Bl -enum
.It
Determine that you are unable to recover space with normal file and directory
removal commands due to
.Er ENOSPC
errors being returned by 'rm', or through the
removal of snapshots (if any).
The 'bulkfree' directive must be issued to
scan the filesystem and free up the actual space, then check with 'df'.
Continue if you still have insufficient space and are unable to remove items
normally.
.It
If you need any related snapshots, this is a good time to copy them elsewhere.
.It
Idle or kill any processes trying to use the filesystem.
.It
Issue the emergency-mode-enable directive on the filesystem.
Once enabled, run 'sync' to update any dirty inodes which may still
be dirty due to not being able to flush.
Please remember that this
directive is a LAST RESORT, is dangerous, and will likely corrupt any
other snapshots you have based on the filesystem you are removing files
from.
.It
Remove file trees as necessary with 'rm -rf' to free space, being cognizant
of any warnings issued by the kernel on the console (via 'dmesg') while
doing so.
.It
Issue the 'bulkfree' directive to actually free the space and check that
sufficient space has been freed with 'df'.
.It
If bulkfree reports CHECK errors, or if you have snapshots and insufficient
space has been freed, you will need to delete snapshots.
Re-run bulkfree and delete snapshots until no errors are reported.
.It
Issue the emergency-mode-disable directive when done.
It might also be a
good idea to reboot after using this mode, but theoretically you should not
have to.
.It
Restore services using the filesystem.
.El
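.Pp
The command portion of the steps above can be sketched as a dry-run planner.
This is only an illustration, not part of
hammer2: the function name, the mount point and the paths are hypothetical,
passing the mount point as the
target argument is an assumption, and the two bulkfree passes follow the note
under the bulkfree directive that two passes are needed to free space.
Nothing is executed; the function only returns the command lines in order.
The manual steps (checking for ENOSPC, copying or deleting snapshots, idling
processes, restoring services) are deliberately left out.

```python
def emergency_cleanup_plan(mount, doomed_paths, bulkfree_passes=2):
    """Return, in order, the argv lists for an emergency-mode cleanup.

    Nothing is run here; the caller decides whether to execute the plan.
    `mount` and `doomed_paths` are hypothetical inputs for illustration.
    """
    plan = [
        ["hammer2", "emergency-mode-enable", mount],
        ["sync"],  # flush inodes that could not be flushed earlier
    ]
    # Remove file trees to free space.
    plan += [["rm", "-rf", path] for path in doomed_paths]
    # Review kernel warnings issued during the removals.
    plan.append(["dmesg"])
    # Actually free the space, then verify with df.
    plan += [["hammer2", "bulkfree", mount] for _ in range(bulkfree_passes)]
    plan.append(["df", mount])
    plan.append(["hammer2", "emergency-mode-disable", mount])
    return plan


if __name__ == "__main__":
    for argv in emergency_cleanup_plan("/mnt/data", ["/mnt/data/tmp/old"]):
        print(" ".join(argv))
```

Printing the plan first and reviewing it by hand fits the warning that this
mode is a last resort.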
.\" ==== emergency-mode-disable ===
.It Cm emergency-mode-disable Ar target
Turn off the emergency operations mode on a filesystem, restoring normal
operation.
.\" ==== info ====
.It Cm info Op devpath...
Access and print the status and super-root entries for all HAMMER2
partitions found in /dev/serno or the specified device path(s).
The partitions do not have to be mounted.
Note that only mounted partitions will be under active management.
This is accomplished by mounting at least one PFS within the partition.
Typically at least the @LOCAL PFS is mounted.
.\" ==== mountall ====
.It Cm mountall Op devpath...
This directive mounts the @LOCAL PFS on all HAMMER2 partitions found
in /dev/serno, or the specified device path(s).
The partitions are mounted as /var/hammer2/LOCAL.<id>.
Mounts are executed in the background and this command will wait a
limited amount of time for the mounts to complete before returning.
.\" ==== status ====
.It Cm status Op path...
Dump a list of all cluster link entries configured in the volume header.
.\" ==== hash ====
.It Cm hash Op filename...
Compute and print the directory hash for any number of filenames.
.\" ==== dhash ====
.It Cm dhash Op filename...
Compute and print the data hash of the long directory entry for any number
of filenames.
.\" ==== pfs-list ====
.It Cm pfs-list Op path...
List all PFSs associated with all mounted hammer2 storage devices.
The list may be restricted to a particular filesystem using
.Fl s Ar mount .
.Pp
Note that hammer2 PFSs associated with storage devices which have not been
mounted in any fashion will not be listed.
At least one hammer2 label must
be mounted for the PFSs on that device to be visible.
.\" ==== pfs-clid ====
.It Cm pfs-clid Ar label
Print the cluster id for a PFS specified by name.
.\" ==== pfs-fsid ====
.It Cm pfs-fsid Ar label
Print the unique filesystem id for a PFS specified by name.
.\" ==== pfs-create ====
.It Cm pfs-create Ar label
Create a local PFS on the mounted HAMMER2 filesystem represented
by the current directory, or specified via
.Fl s Ar mount .
If no UUID is specified the pfs-type defaults to MASTER.
If a UUID is specified via the
.Fl u
option the pfs-type defaults to SLAVE.
Other types can be specified with the
.Fl t
option.
.Pp
If you wish to add a MASTER to an existing cluster, you must first add it as
a SLAVE and then upgrade it to MASTER to properly synchronize it.
.Pp
The DUMMY pfs-type is used to tie network-accessible clusters into the local
machine when no local storage is desired.
This type should be used on minimal H2 partitions or entirely in ram for
netboot-centric systems to provide a tie-in point for the mount command,
or on more complex systems where you need to also access network-centric
clusters.
.Pp
The CACHE or SLAVE pfs-type is typically used when the main store is on
the network but local storage is desired to improve performance.
SLAVE is also used when a backup is desired.
.Pp
Generally speaking, you can mount any PFS element of a cluster in order to
access the cluster via the full cluster protocol.
There are two exceptions.
If you mount a SOFT_SLAVE or a SOFT_MASTER then soft quorum semantics are
employed... the soft slave or soft master's current state will always be used
and the quorum protocol will not be used.
The soft PFS will still be
synchronized to masters in the background when available.
Also, you can use
.Sq mount -o local
to mount ONLY a local HAMMER2 PFS and
not run any network or quorum protocols for the mount.
All such mounts except for a SOFT_MASTER mount will be read-only.
Other than that, you will be mounting the whole cluster when you mount any
PFS within the cluster.
.Pp
DUMMY - Create a PFS skeleton intended to be the mount point for a
more complex cluster, probably one that is entirely network based.
No data will be synchronized to this PFS so it is suitable for use
in a network boot image or memory filesystem.
This allows you to create placeholders for mount points on your local
disk, SSD, or memory disk.
.Pp
CACHE - Create a PFS for caching portions of the cluster piecemeal.
This is similar to a SLAVE but does not synchronize the entire contents of
the cluster to the PFS.
Elements found in the CACHE PFS which are validated against the cluster
will be read, presumably a faster access than having to go to the cluster.
Only local CACHEs will be updated.
Network-accessible CACHE PFSs might be read but will not be written to.
If you have a large hard-drive-based cluster you can set up localized
SSD CACHE PFSs to improve performance.
.Pp
SLAVE - Create a PFS which maintains synchronization with and provides a
read-only copy of the cluster.
HAMMER2 will prioritize local SLAVEs for data retrieval after validating
their transaction id against the cluster.
The difference between a CACHE and a SLAVE is that the SLAVE is synchronized
to a full copy of the cluster and thus can serve as a backup or be staged
for use as a MASTER later on.
.Pp
SOFT_SLAVE - Create a PFS which maintains synchronization with and provides
a read-only copy of the cluster.
This is one of the special mount cases.
A SOFT_SLAVE will synchronize with
the cluster when the cluster is available, but can still be accessed when
the cluster is not available.
.Pp
MASTER - Create a PFS which will hold a master copy of the cluster.
If you create several MASTER PFSs with the same cluster id you are
effectively creating a multi-master cluster and causing a quorum and
cache coherency protocol to be used to validate operations.
The total number of masters is stored in each PFS making up the cluster.
Filesystem operations will stall for normal mounts if a quorum cannot be
obtained to validate the operation.
MASTER nodes which go offline and return later will synchronize in the
background.
Note that when adding a MASTER to an existing cluster you must add the
new PFS as a SLAVE and then upgrade it to a MASTER.
.Pp
SOFT_MASTER - Create a PFS which maintains synchronization with and provides
a read-write copy of the cluster.
This is one of the special mount cases.
A SOFT_MASTER will synchronize with
the cluster when the cluster is available, but can still be read AND written
to even when the cluster is not available.
Modifications made to a SOFT_MASTER will be automatically flushed to the
cluster when it becomes accessible again, and vice versa.
Manual intervention may be required if a conflict occurs during
synchronization.
.\" ==== pfs-delete ====
.It Cm pfs-delete Op label...
Destroy a PFS by name.
All hammer2 mount points will be checked, however
this directive will refuse to delete a PFS whose name is duplicated on
multiple mount points.
A specific mount point may be specified to restrict
the deletion via the
.Fl s Ar mount
option.
.\" ==== snapshot ====
.It Cm snapshot Ar path Op label
Create a snapshot of a directory.
The snapshot will be created on the same hammer2 storage device as the
directory.
This can only be used on a local PFS, and is only really useful if the PFS
contains a complete copy of what you desire to snapshot.
That typically
means a local MASTER, SOFT_MASTER, SLAVE, or SOFT_SLAVE must be present.
Snapshots are created simply by flushing a PFS mount to disk and then copying
the directory inode to the PFS.
The topology is snapshotted without having to be copied or scanned and
takes no additional space.
However, bulkfree scans may take longer.
Snapshots are effectively separate from the cluster they came from
and can be used as a starting point for a new cluster.
So unless you build a new cluster from the snapshot, it will stay local
to the machine it was made on.
.Pp
Snapshots can be maintained automatically with
.Xr periodic 8 .
See
.Xr periodic.conf 5
for details of enabling and configuring the functionality.
.\" ==== snapshot-debug ====
.It Cm snapshot-debug Ar path Op label
Snapshot without filesystem sync.
.\" ==== service ====
.It Cm service
Start the
.Nm
service daemon.
This daemon is also automatically started when you run
.Xr mount_hammer2 8 .
The hammer2 service daemon handles incoming TCP connections and maintains
outgoing TCP connections.
It will interconnect available services on the
machine (e.g. hammer2 mounts and xdisks) to the network.
.\" ==== stat ====
.It Cm stat Op path...
Print the inode statistics, compression, and other meta-data associated
with a list of paths.
.\" ==== leaf ====
.It Cm leaf
XXX
.\" ==== shell ====
.It Cm shell Op host
Start a debug shell to the local hammer2 service daemon via the DMSG protocol.
.\" ==== debugspan ====
.It Cm debugspan Ar target
(do not use)
.\" ==== rsainit ====
.It Cm rsainit Op path
Create the
.Pa /etc/hammer2
directory and initialize a public/private keypair in that directory for
use by the network cluster protocols.
.\" ==== show ====
.It Cm show Ar devpath
Dump the radix tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== freemap ====
.It Cm freemap Ar devpath
Dump the freemap tree for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== volhdr ====
.It Cm volhdr Ar devpath
Dump the volume header for the HAMMER2 filesystem by scanning a
block device directly.
No mount is required.
.\" ==== setcomp ====
.It Cm setcomp Ar mode[:level] Ar path...
Set the compression mode as specified for any newly created elements at or
under the path if not overridden by deeper elements.
Available modes are none, autozero, lz4, or zlib.
When zlib is used the compression level can be set.
The default will be 6 which is the best trade-off between compression ratio
and time.
.Pp
newfs_hammer2 will set the default compression to lz4 which prioritizes
speed over compression ratio.
Also note that HAMMER2 contains a heuristic and will not attempt to
compress every block if it detects a sufficient amount of incompressible
data.
.Pp
Hammer2 compression is only effective when it can reduce the size of a dataset
(typically a 64KB block) by one or more powers of 2.
A 64K block which
only compresses to 40K will not yield any storage improvement.
.Pp
Generally speaking you do not want to set the compression mode to
.Sq none ,
as this will cause blocks of all-zeros to be written as all-zero blocks,
instead of holes.
The
.Sq autozero
compression mode detects blocks of all-zeros
and writes them as holes.
.\" ==== setcheck ====
.It Cm setcheck Ar check Ar path...
Set the check code as specified for any newly created elements at or under
the path if not overridden by deeper elements.
Available codes are default, disabled, crc32, xxhash64, or sha192.
.Pp
Normally HAMMER2 does not overwrite data blocks on the media in order
to ensure snapshot integrity.
Replacement data blocks will be reallocated.
However, if the compression mode is set to
.Sq none
and the check code is set to
.Sq disabled
HAMMER2 will overwrite data on the media in-place.
In this mode of operation,
snapshots will not be able to snapshot the data against later changes
made to the file, and de-duplication will no longer function on any
data related to the file.
However, you can still recover the most recent data from previously
taken snapshots if you accidentally remove the file.
.\" ==== clrcheck ====
.It Cm clrcheck Op path...
Clear the check code override for the specified paths.
Overrides may still be present in deeper elements.
.\" ==== setcrc32 ====
.It Cm setcrc32 Op path...
Set the check code to the iSCSI 32-bit CRC for any newly created elements
at or under the path if not overridden by deeper elements.
.\" ==== setxxhash64 ====
.It Cm setxxhash64 Op path...
Set the check code to XXHASH64, a fast 64-bit hash, for any newly created
elements at or under the path if not overridden by deeper elements.
.\" ==== setsha192 ====
.It Cm setsha192 Op path...
Set the check code to SHA192 for any newly created elements at or under
the path if not overridden by deeper elements.
.\" ==== bulkfree ====
.It Cm bulkfree Ar path
Run a bulkfree pass on a HAMMER2 mount.
You can specify any PFS for the mount; the bulkfree pass is run on the
entire partition.
Note that it takes two passes to actually free space.
By default this directive will use up to 1/16 of physical memory to track
the freemap.
The amount of memory used may be overridden with the
.Op Fl m Ar mem
option.
.\" ==== printinode ====
.It Cm printinode Ar path
Dump inode.
.\" ==== dumpchain ====
.It Cm dumpchain Op path Op chnflags
Dump in-memory chain topology.
.El
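.Pp
The power-of-2 rule described under the setcomp directive can be illustrated
with a small calculation.
This is a minimal sketch, not HAMMER2 code: the function names are
hypothetical, and the 1KB minimum allocation is an assumption that is not
stated in this manual page.
It rounds a compressed size up to the next power-of-2 allocation and reports
how much of a 64KB block is actually saved.

```python
def allocated_size(compressed_bytes, min_alloc=1024, max_alloc=65536):
    """Round a compressed size up to the next power-of-2 allocation.

    min_alloc (1KB) is an assumed lower bound, used for illustration only.
    """
    if not 0 < compressed_bytes <= max_alloc:
        raise ValueError("compressed size must be in 1..max_alloc bytes")
    size = min_alloc
    while size < compressed_bytes:
        size *= 2
    return size


def space_saved(compressed_bytes, logical_bytes=65536):
    """Bytes of media saved for one logical block after rounding."""
    return logical_bytes - allocated_size(compressed_bytes,
                                          max_alloc=logical_bytes)


if __name__ == "__main__":
    for kb in (40, 32, 20):
        print(kb, "KB compressed ->", space_saved(kb * 1024), "bytes saved")
```

Under these assumptions a block that compresses to 40KB still occupies a 64KB
allocation, while one that compresses to 32KB or less saves at least half of
the block.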
.Sh SYSCTLS
.Bl -tag -width indent
.It Va vfs.hammer2.dedup_enable (default on)
Enables live de-duplication.
Any recently read data that is on-media
(already synchronized to media) is tested against pending writes for
compatibility.
If a match is found, the write will reference the
existing on-media data instead of writing new data.
.It Va vfs.hammer2.always_compress (default off)
This disables the H2 compression heuristic and forces H2 to always
try to compress data blocks, even if they look incompressible.
Enabling this option reduces performance but has higher de-duplication
repeatability.
.It Va vfs.hammer2.cluster_data_read (default 4)
.It Va vfs.hammer2.cluster_meta_read (default 1)
Set the amount of read-ahead clustering to perform on data and meta-data
blocks.
.It Va vfs.hammer2.cluster_write (default 4)
Set the amount of write-behind clustering to perform in buffers.
Each buffer represents 64KB.
The default is 4 and higher values typically do not improve performance.
A value of 0 disables clustered writes.
This variable applies to the underlying media device, not to logical
file writes, so it should not interfere with temporary file optimization.
Generally speaking you want this enabled to generate smoothly pipelined
writes to the media.
.It Va vfs.hammer2.bulkfree_tps (default 5000)
Set bulkfree's maximum scan rate.
This is primarily intended to limit
I/O utilization on SSDs and CPU utilization when the meta-data is mostly
cached in memory.
.El
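.Pp
Because vfs.hammer2.cluster_write is expressed in 64KB buffers, it can help to
see the setting as a byte count.
The helper below is a hypothetical illustration of that arithmetic only; it
does not read or set the sysctl.

```python
BUFFER_BYTES = 64 * 1024  # each buffer represents 64KB


def cluster_write_window(cluster_write=4):
    """Write-behind clustering window in bytes for a given sysctl value.

    A value of 0 means clustered writes are disabled (window of 0 bytes).
    """
    if cluster_write < 0:
        raise ValueError("cluster_write cannot be negative")
    return cluster_write * BUFFER_BYTES


if __name__ == "__main__":
    print(cluster_write_window())   # default of 4 buffers
    print(cluster_write_window(0))  # clustered writes disabled
```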
.Sh SETTING UP /etc/hammer2
The
.Sq rsainit
directive will create the
.Pa /etc/hammer2
directory with appropriate permissions and also generate a public/private key
pair in this directory for the machine.
These files will be
.Pa rsa.pub
and
.Pa rsa.prv
and needless to say, the private key shouldn't leave the host.
.Pp
The service daemon will also scan the
.Pa /etc/hammer2/autoconn
file which contains a list of hosts to which it will automatically maintain
connections in order to form your cluster.
The service daemon will automatically reconnect on any failure and will
also monitor the file for changes.
.Pp
When the service daemon receives a connection it expects to find a
public key for that connection in a file in
.Pa /etc/hammer2/remote/
called
.Pa <IPADDR>.pub .
You normally copy the
.Pa rsa.pub
key from the host in question to this file.
The IP address must match exactly or the connection will not be allowed.
.Pp
If you want to use an unencrypted connection you can create empty,
dummy files in the remote directory in the form
.Pa <IPADDR>.none .
We do not recommend using unencrypted connections.
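.Pp
The lookup rule for incoming connections can be modelled as follows.
This is a sketch of the rule as described above, not the service daemon's
actual code: the function name is hypothetical, and preferring the
.pub file when both files exist is an assumption.

```python
import os


def classify_peer(ip, remote_dir="/etc/hammer2/remote"):
    """Classify an incoming peer by the files present for its IP address.

    Returns ("encrypted", path) if <ip>.pub exists, ("unencrypted", path)
    if only <ip>.none exists, and ("rejected", None) otherwise.
    The IP string must match the file name exactly.
    """
    pub = os.path.join(remote_dir, ip + ".pub")
    none = os.path.join(remote_dir, ip + ".none")
    if os.path.isfile(pub):       # assumed to take precedence over .none
        return ("encrypted", pub)
    if os.path.isfile(none):
        return ("unencrypted", none)
    return ("rejected", None)


if __name__ == "__main__":
    print(classify_peer("10.0.0.5"))
```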
.Sh CLUSTER SERVICES
Currently there are two services which use the cluster network infrastructure,
HAMMER2 mounts and XDISK.
Any HAMMER2 mount will make all PFSs for that filesystem available to the
cluster.
And if the XDISK kernel module is loaded, the hammer2 service daemon will make
your machine's block devices available to the cluster (you must load the
xdisk.ko kernel module before starting the hammer2 service).
They will show up as
.Pa /dev/xa*
and
.Pa /dev/serno/*
devices on the remote machines making up the cluster.
Remote block devices are just what they appear to be... direct access to a
block device on a remote machine.
If the link goes down remote accesses
will stall until it comes back up again, then automatically requeue any
pending I/O and resume as if nothing happened.
However, if the server hosting the physical disks crashes or is rebooted,
any remote opens to its devices will see a permanent I/O failure requiring a
close and open sequence to re-establish.
The latter is necessary because the server's drives might not have committed
the data before the crash, but had already acknowledged the transfer.
.Pp
Data commits work exactly the same as they do for real block devices.
The originator must issue a BUF_CMD_FLUSH.
.Sh ADDING A NEW MASTER TO A CLUSTER
When you
.Xr newfs_hammer2 8
a HAMMER2 filesystem or use the
.Sq pfs-create
directive on one already mounted
to create a new PFS, with no special options, you wind up with a PFS
typed as a MASTER and a unique cluster UUID, but because there is only one
PFS for that cluster (for each PFS you create via pfs-create), it will
act just like a normal filesystem would act and does not require any special
protocols to operate.
.Pp
If you use the
.Sq pfs-create
directive along with the
.Fl u
option to specify a cluster UUID that already exists in the cluster,
you are adding a PFS to an existing cluster and this can trigger a whole
series of events in the background.
When you specify the
.Fl u
option in a
.Sq pfs-create ,
.Nm
will by default create a SLAVE PFS.
In fact, this is what must be created first even if you want to add a new
MASTER to your cluster.
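.Pp
The defaulting rule for pfs-create can be written down as a small model.
This is only an illustration: the function names, mount point, label and UUID
are hypothetical, and the argument order simply follows the SYNOPSIS (options,
then the directive, then its argument).
Nothing is executed.

```python
PFS_TYPES = ("MASTER", "SLAVE", "SOFT_MASTER", "SOFT_SLAVE", "CACHE", "DUMMY")


def pfs_create_type(uuid=None, explicit_type=None):
    """Type a new PFS ends up with, per the defaults described above."""
    if explicit_type is not None:
        if explicit_type not in PFS_TYPES:
            raise ValueError("unknown pfs-type: %s" % explicit_type)
        return explicit_type  # -t overrides the default
    # No -t given: SLAVE when joining an existing cluster UUID, else MASTER.
    return "SLAVE" if uuid else "MASTER"


def pfs_create_argv(label, mount=None, uuid=None, explicit_type=None):
    """Build (but do not run) a pfs-create command line."""
    argv = ["hammer2"]
    if mount:
        argv += ["-s", mount]
    if explicit_type:
        argv += ["-t", explicit_type]
    if uuid:
        argv += ["-u", uuid]
    return argv + ["pfs-create", label]


if __name__ == "__main__":
    print(pfs_create_type())  # no UUID, no -t
    print(" ".join(pfs_create_argv("backup", mount="/mnt/data",
                                   uuid="<cluster-uuid>")))
```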
.Pp
The most common action a system admin will want to take is to upgrade or
downgrade a PFS.
A new MASTER can be added to the cluster by upgrading an existing SLAVE
to MASTER.
A MASTER can be removed from the cluster by downgrading it to a SLAVE.
Upgrades and downgrades will put nodes in the cluster in a transition state
until the operation is complete.
For downgrades the transition state is fleeting unless one or more other
masters have not acknowledged the change.
For upgrades a background synchronization process must complete before the
transition can be said to be complete, and the node remains (really) a SLAVE
until that transition is complete.
.Sh USE CASES FOR A SOFT_MASTER
The SOFT_MASTER PFS type is a special type which must be specifically
mounted by a machine.
It is a R/W mount which does not use the quorum protocol and is not
cache coherent with the cluster, but which synchronizes from the cluster
and allows modifying operations which will synchronize to the cluster.
The most common case is to use a SOFT_MASTER PFS in a laptop allowing you
to work on your laptop when you are on the road and not connected to
your main servers, and for the laptop to synchronize when a connection is
available.
.Sh USE CASES FOR A SOFT_SLAVE
A SOFT_SLAVE PFS type is a special type which must be specifically mounted
by a machine.
It is a RO mount which does not use the quorum protocol and is not
cache coherent with the cluster.
It will receive synchronization from
the cluster when network connectivity is available but will not stall if
network connectivity is lost.
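.Pp
The properties of the two soft types, as described in the two sections above,
can be summarized in a small table.
The class and field names below are hypothetical and only restate what this
manual page says; the last helper encodes the rule that a
.Sq mount -o local
mount is read-only for every type except SOFT_MASTER.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftMount:
    writable: bool          # R/W vs RO mount
    uses_quorum: bool       # quorum protocol in use
    cache_coherent: bool    # cache coherent with the cluster
    syncs_to_cluster: bool  # local modifications flow back to the cluster


SOFT_TYPES = {
    "SOFT_MASTER": SoftMount(writable=True, uses_quorum=False,
                             cache_coherent=False, syncs_to_cluster=True),
    "SOFT_SLAVE": SoftMount(writable=False, uses_quorum=False,
                            cache_coherent=False, syncs_to_cluster=False),
}


def local_mount_read_only(pfs_type):
    """True if a 'mount -o local' of this PFS type is read-only."""
    return pfs_type != "SOFT_MASTER"


if __name__ == "__main__":
    for name, props in SOFT_TYPES.items():
        print(name, props)
```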
.Sh FSYNC FLUSH MODES
TODO.
.Sh RESTORING FROM A SNAPSHOT BACKUP
TODO.
.Sh PERFORMANCE TUNING
Because HAMMER2 implements compression, decompression, and dedup natively,
it always double-buffers file data.
This means that the file data is
cached via the device vnode (in compressed / dedupped-form) and the same
data is also cached by the file vnode (in decompressed / non-dedupped form).
.Pp
While HAMMER2 will try to age the logical file buffers on its own, some
additional performance tuning may be necessary for optimal operation
whether swapcache is used or not.
Our recommendation is to reduce the
number of vnodes (and thus also the logical buffer cache behind the
vnodes) that the system caches via the
.Va kern.maxvnodes
sysctl.
.Pp
Too-large a value will result in excessive double-caching and can cause
unnecessary read disk I/O.
We recommend a number between 25000 and 250000 vnodes, depending on your
use case.
Keep in mind that even though the vnode cache is smaller, this will make
room for a great deal more device-level buffer caching which can encompass
far more data and meta-data than the vnode-level caching.
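.Pp
One way to apply the recommendation is to clamp the current value into the
suggested range and print the resulting
.Xr sysctl 8
command for review.
This is a hedged sketch: the helper names are hypothetical, clamping is just
one reading of the recommendation, and the right value still depends on your
use case.
Nothing is executed.

```python
RECOMMENDED_MIN = 25000
RECOMMENDED_MAX = 250000


def suggest_maxvnodes(current):
    """Clamp a kern.maxvnodes value into the recommended range."""
    return max(RECOMMENDED_MIN, min(current, RECOMMENDED_MAX))


def sysctl_argv(value):
    """Build (but do not run) the sysctl command that would set the value."""
    return ["sysctl", "kern.maxvnodes=%d" % value]


if __name__ == "__main__":
    print(" ".join(sysctl_argv(suggest_maxvnodes(1000000))))
```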
.Sh ENVIRONMENT
TODO.
.Sh FILES
.Bl -tag -width ".It Pa <fs>/abc/defghi/<name>" -compact
.It Pa /etc/hammer2/
.It Pa /etc/hammer2/rsa.pub
.It Pa /etc/hammer2/rsa.prv
.It Pa /etc/hammer2/autoconn
.It Pa /etc/hammer2/remote/<IP>.pub
.It Pa /etc/hammer2/remote/<IP>.none
.El
.Sh EXIT STATUS
.Ex -std
.Sh SEE ALSO
.Xr periodic.conf 5 ,
.Xr mount_hammer2 8 ,
.Xr mount_null 8 ,
.Xr newfs_hammer2 8 ,
.Xr swapcache 8 ,
.Xr sysctl 8
.Sh HISTORY
The
.Nm
utility first appeared in
.Dx 4.1 .
.Sh AUTHORS
.An Matthew Dillon Aq Mt dillon@backplane.com