.\" .\" Copyright (c) 2008 .\" The DragonFly Project. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in .\" the documentation and/or other materials provided with the .\" distribution. .\" 3. Neither the name of The DragonFly Project nor the names of its .\" contributors may be used to endorse or promote products derived .\" from this software without specific, prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS .\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE .\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, .\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING, .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .Dd April 8, 2010 .Os .Dt HAMMER 5 .Sh NAME .Nm HAMMER .Nd HAMMER file system .Sh SYNOPSIS To compile this driver into the kernel, place the following line in your kernel configuration file: .Bd -ragged -offset indent .Cd options HAMMER .Ed .Pp Alternatively, to load the driver as a module at boot time, place the following line in .Xr loader.conf 5 : .Bd -literal -offset indent hammer_load="YES" .Ed .Pp To mount via .Xr fstab 5 : .Bd -literal -offset indent /dev/ad0s1d[:/dev/ad1s1d:...] /mnt hammer rw 2 0 .Ed .Sh DESCRIPTION The .Nm file system provides facilities to store file system data onto disk devices and is intended to replace .Xr ffs 5 as the default file system for .Dx . Among its features are instant crash recovery, large file systems spanning multiple volumes, data integrity checking, fine grained history retention, mirroring capability, and pseudo file systems. .Pp All functions related to managing .Nm file systems are provided by the .Xr newfs_hammer 8 , .Xr mount_hammer 8 , .Xr hammer 8 , .Xr chflags 1 , and .Xr undo 1 utilities. .Pp For a more detailed introduction refer to the paper and slides listed in the .Sx SEE ALSO section. For some common usages of .Nm see the .Sx EXAMPLES section below. .Ss Instant Crash Recovery After a non-graceful system shutdown, .Nm file systems will be brought back into a fully coherent state when mounting the file system, usually within a few seconds. .Ss Large File Systems & Multi Volume A .Nm file system can be up to 1 Exabyte in size. It can span up to 256 volumes, each volume occupies a .Dx disk slice or partition, or another special file, and can be up to 4096 TB in size. Minimum recommended .Nm file system size is 50 GB. For volumes over 2 TB in size .Xr gpt 8 and .Xr disklabel64 8 normally need to be used. .Ss Data Integrity Checking .Nm has high focus on data integrity, CRC checks are made for all major structures and data. .Nm snapshots implements features to make data integrity checking easier: The atime and mtime fields are locked to the ctime for files accessed via a snapshot. The .Fa st_dev field is based on the PFS .Ar shared-uuid and not on any real device. This means that archiving the contents of a snapshot with e.g.\& .Xr tar 1 and piping it to something like .Xr md5 1 will yield a consistent result. The consistency is also retained on mirroring targets. .Ss Transaction IDs The .Nm file system uses 64 bit, hexadecimal transaction IDs to refer to historical file or directory data. An ID has the .Xr printf 3 format .Li %#016llx , such as .Li 0x00000001061a8ba6 . .Pp Related .Xr hammer 8 commands: .Ar snapshot , .Ar synctid .Ss History & Snapshots History metadata on the media is written with every sync operation, so that by default the resolution of a file's history is 30-60 seconds until the next prune operation. Prior versions of files or directories are generally accessible by appending .Li @@ and a transaction ID to the name. The common way of accessing history, however, is by taking snapshots. .Pp Snapshots are softlinks to prior versions of directories and their files. Their data will be retained across prune operations for as long as the softlink exists. Removing the softlink enables the file system to reclaim the space again upon the next prune & reblock operations. .Pp Related .Xr hammer 8 commands: .Ar cleanup , .Ar history , .Ar snapshot ; see also .Xr undo 1 .Ss Pruning & Reblocking Pruning is the act of deleting file system history. By default only history used by the given snapshots and history from after the latest snapshot will be retained. By setting the per PFS parameter .Cm prune-min , history is guaranteed to be saved at least this time interval. All other history is deleted. Reblocking will reorder all elements and thus defragment the file system and free space for reuse. After pruning a file system must be reblocked to recover all available space. Reblocking is needed even when using the .Ar nohistory .Xr mount_hammer 8 option or .Xr chflags 1 flag. .Pp Related .Xr hammer 8 commands: .Ar cleanup , .Ar snapshot , .Ar prune , .Ar prune-everything , .Ar rebalance , .Ar reblock , .Ar reblock-btree , .Ar reblock-inodes , .Ar reblock-dirs , .Ar reblock-data .Ss Mirroring & Pseudo File Systems In order to allow inode numbers to be duplicated on the slaves .Nm Ap s mirroring feature uses .Dq Pseudo File Systems (PFSs). A .Nm file system supports up to 65535 PFSs. Multiple slaves per master are supported, but multiple masters per slave are not. Slaves are always read-only. Upgrading slaves to masters and downgrading masters to slaves are supported. .Pp It is recommended to use a .Nm null mount to access a PFS; this way no tools are confused by the PFS root being a symlink and inodes not being unique across a .Nm file system. .Pp Related .Xr hammer 8 commands: .Ar pfs-master , .Ar pfs-slave , .Ar pfs-cleanup , .Ar pfs-status , .Ar pfs-update , .Ar pfs-destroy , .Ar pfs-upgrade , .Ar pfs-downgrade , .Ar mirror-copy , .Ar mirror-stream , .Ar mirror-read , .Ar mirror-read-stream , .Ar mirror-write , .Ar mirror-dump .Ss NFS Export .Nm file systems support NFS export. NFS export of PFSs is done using .Nm null mounts. For example, to export the PFS .Pa /hammer/pfs/data , create a .Nm null mount, e.g.\& to .Pa /hammer/data and export the latter path. .Pp Don't export a directory containing a PFS (e.g.\& .Pa /hammer/pfs above). Only .Nm null mount for PFS root (e.g.\& .Pa /hammer/data above) should be exported (subdirectory may be escaped if exported). .Sh EXAMPLES .Ss Preparing the File System To create and mount a .Nm file system use the .Xr newfs_hammer 8 and .Xr mount_hammer 8 commands. Note that all .Nm file systems must have a unique name on a per-machine basis. .Bd -literal -offset indent newfs_hammer -L HOME /dev/ad0s1d mount_hammer /dev/ad0s1d /home .Ed .Pp Similarly, multi volume file systems can be created and mounted by specifying additional arguments. .Bd -literal -offset indent newfs_hammer -L MULTIHOME /dev/ad0s1d /dev/ad1s1d mount_hammer /dev/ad0s1d /dev/ad1s1d /home .Ed .Pp Once created and mounted, .Nm file systems need periodic clean up making snapshots, pruning and reblocking, in order to have access to history and file system not to fill up. For this it is recommended to use the .Xr hammer 8 .Ar cleanup metacommand. .Pp By default, .Dx is set up to run .Nm hammer Ar cleanup nightly via .Xr periodic 8 . .Pp It is also possible to perform these operations individually via .Xr crontab 5 . For example, to reblock the .Pa /home file system every night at 2:15 for up to 5 minutes: .Bd -literal -offset indent 15 2 * * * hammer -c /var/run/HOME.reblock -t 300 reblock /home \e >/dev/null 2>&1 .Ed .Ss Snapshots The .Xr hammer 8 utility's .Ar snapshot command provides several ways of taking snapshots. They all assume a directory where snapshots are kept. .Bd -literal -offset indent mkdir /snaps hammer snapshot /home /snaps/snap1 (...after some changes in /home...) hammer snapshot /home /snaps/snap2 .Ed .Pp The softlinks in .Pa /snaps point to the state of the .Pa /home directory at the time each snapshot was taken, and could now be used to copy the data somewhere else for backup purposes. .Pp By default, .Dx is set up to create nightly snapshots of all .Nm file systems via .Xr periodic 8 and to keep them for 60 days. .Ss Pruning A snapshot directory is also the argument to the .Xr hammer 8 Ap s .Ar prune command which frees historical data from the file system that is not pointed to by any snapshot link and is not from after the latest snapshot. .Bd -literal -offset indent rm /snaps/snap1 hammer prune /snaps .Ed .Ss Mirroring Mirroring can be set up using .Nm Ap s pseudo file systems. To associate the slave with the master its shared UUID should be set to the master's shared UUID as output by the .Nm hammer Ar pfs-master command. .Bd -literal -offset indent hammer pfs-master /home/pfs/master hammer pfs-slave /home/pfs/slave shared-uuid= .Ed .Pp The .Pa /home/pfs/slave link is unusable for as long as no mirroring operation has taken place. .Pp To mirror the master's data, either pipe a .Fa mirror-read command into a .Fa mirror-write or, as a short-cut, use the .Fa mirror-copy command (which works across a .Xr ssh 1 connection as well). Initial mirroring operation has to be done to the PFS path (as .Xr mount_null 8 can't access it yet). .Bd -literal -offset indent hammer mirror-copy /home/pfs/master /home/pfs/slave .Ed .Pp After this initial step .Nm null mount can be setup for .Pa /home/pfs/slave . Further operations can use .Nm null mounts. .Bd -literal -offset indent mount_null /home/pfs/master /home/master mount_null /home/pfs/slave /home/slave hammer mirror-copy /home/master /home/slave .Ed .Ss NFS Export To NFS export from the .Nm file system .Pa /hammer the directory .Pa /hammer/non-pfs without PFSs, and the PFS .Pa /hammer/pfs/data , the latter is null mounted to .Pa /hammer/data . .Pp Add to .Pa /etc/fstab (see .Xr fstab 5 ) : .Bd -literal -offset indent /hammer/pfs/data /hammer/data null rw .Ed .Pp Add to .Pa /etc/exports (see .Xr exports 5 ) : .Bd -literal -offset indent /hammer/non-pfs /hammer/data .Ed .Sh SEE ALSO .Xr chflags 1 , .Xr md5 1 , .Xr tar 1 , .Xr undo 1 , .Xr exports 5 , .Xr ffs 5 , .Xr fstab 5 , .Xr disklabel64 8 , .Xr gpt 8 , .Xr hammer 8 , .Xr mount_hammer 8 , .Xr mount_null 8 , .Xr newfs_hammer 8 .Rs .%A Matthew Dillon .%D June 2008 .%O http://www.dragonflybsd.org/hammer/hammer.pdf .%T "The HAMMER Filesystem" .Re .Rs .%A Matthew Dillon .%D October 2008 .%O http://www.dragonflybsd.org/hammer/nycbsdcon/ .%T "Slideshow from NYCBSDCon 2008" .Re .Rs .%A Michael Neumann .%D January 2010 .%O http://www.ntecs.de/sysarch09/HAMMER.pdf .%T "Slideshow for a presentation held at KIT (http://www.kit.edu)." .Re .Sh FILESYSTEM PERFORMANCE The .Nm file system has a front-end which processes VNOPS and issues necessary block reads from disk, and a back-end which handles meta-data updates on-media and performs all meta-data write operations. Bulk file write operations are handled by the front-end. Because .Nm defers meta-data updates virtually no meta-data read operations will be issued by the frontend while writing large amounts of data to the file system or even when creating new files or directories, and even though the kernel prioritizes reads over writes the fact that writes are cached by the drive itself tends to lead to excessive priority given to writes. .Pp There are four bioq sysctls, shown below with default values, which can be adjusted to give reads a higher priority: .Bd -literal -offset indent kern.bioq_reorder_minor_bytes: 262144 kern.bioq_reorder_burst_bytes: 3000000 kern.bioq_reorder_minor_interval: 5 kern.bioq_reorder_burst_interval: 60 .Ed .Pp If a higher read priority is desired it is recommended that the .Fa kern.bioq_reorder_minor_interval be increased to 15, 30, or even 60, and the .Fa kern.bioq_reorder_burst_bytes be decreased to 262144 or 524288. .Sh HISTORY The .Nm file system first appeared in .Dx 1.11 . .Sh AUTHORS .An -nosplit The .Nm file system was designed and implemented by .An Matthew Dillon Aq dillon@backplane.com . This manual page was written by .An Sascha Wildner .