.\" Copyright (c) 2007 The DragonFly Project. All rights reserved.
.\" This code is derived from software contributed to The DragonFly Project
.\" by Matthew Dillon <dillon@backplane.com>
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in
.\" the documentation and/or other materials provided with the
.\" 3. Neither the name of The DragonFly Project nor the names of its
.\" contributors may be used to endorse or promote products derived
.\" from this software without specific, prior written permission.
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
.\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
.\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
.\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
.\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
.\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
.\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
38 .Nd HAMMER file system utility
45 .Op Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
46 .Op Fl R Ar restrictcmd
47 .Op Fl T Ar restrictpath
49 .Op Fl e Ar scoreboardfile
51 .\" .Op Fl s Ar linkpath
60 This manual page documents the
62 utility which provides miscellaneous functions related to managing a
65 For a general introduction to the
67 file system, its features, and
68 examples on how to set up and maintain one, see
71 The options are as follows:
72 .Bl -tag -width indent
74 Tell the mirror commands to use a 2-way protocol, which allows
75 automatic negotiation of transaction id ranges.
76 This option is automatically enabled by the
82 will not attempt to break-up large initial bulk transfers into smaller
84 This can save time but if the link is lost in the middle of the
85 initial bulk transfer you will have to start over from scratch.
86 For more information see the
90 Specify a bandwidth limit in bytes per second for mirroring streams.
91 This option is typically used to prevent batch mirroring operations from
92 loading down the machine.
93 The bandwidth may be suffixed with
97 to specify values in kilobytes, megabytes, and gigabytes per second.
98 If no suffix is specified, bytes per second is assumed.
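The suffix handling described above can be sketched as follows.
This is only an illustration: the suffix letters
.Cm k , m , g
and the multiplier of 1024 are assumptions made for the example and are not taken from the utility's source, and
.Fn parse_rate
is a hypothetical helper name.

```python
# Hypothetical helper: the suffix letters (k, m, g) and the 1024-based
# multipliers are assumptions made for illustration only.
_MULTIPLIERS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_rate(text: str) -> int:
    """Convert a string such as '512k' or '2000' to bytes per second."""
    text = text.strip().lower()
    if not text:
        raise ValueError("empty bandwidth value")
    suffix = text[-1]
    if suffix in _MULTIPLIERS:
        # Suffixed value: scale the numeric part by the unit.
        return int(text[:-1]) * _MULTIPLIERS[suffix]
    # No suffix: the value is already in bytes per second.
    return int(text)
```

For example, a value of 2000 is taken as 2000 bytes per second, while 512k is scaled by the kilobyte multiplier.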
100 Unfortunately this is only applicable to the pre-compression bandwidth
101 when compression is used, so a better solution would probably be to
107 .It Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
108 Set the memory cache size for any raw
115 for megabytes is allowed,
116 else the cache size is specified in bytes.
118 The read-behind/read-ahead defaults to 4
122 This option is typically only used with diagnostic commands
123 as kernel-supported commands will use the kernel's buffer cache.
124 .It Fl R Ar restrictcmd
This option is used by hammer ssh-remote to restrict the command later
on in the argument list.
Multiple commands may be specified, separated
by a comma (all one argument).
128 .It Fl T Ar restrictpath
129 This option is used by hammer ssh-remote to restrict the filesystem path
130 specified later on in the argument list.
131 .It Fl c Ar cyclefile
132 When pruning, rebalancing or reblocking you can tell the utility
133 to start at the object id stored in the specified file.
134 If the file does not exist
136 will start at the beginning.
139 is told to run for a specific period of time
141 and is unable to complete the operation it will write out
142 the current object id so the next run can pick up where it left off.
145 runs to completion it will delete
147 .It Fl e Ar scoreboardfile
148 Update scoreboard file with progress, primarily used by mirror-stream.
153 will not check that time period has elapsed if this option is given.
155 Specify the volumes making up a
159 is a colon-separated list of devices, each specifying a
165 Specify delay in seconds for
166 .Cm mirror-read-stream .
When maintaining a streaming mirror this option specifies the
168 minimum delay after a batch ends before the next batch is allowed
170 The default is five seconds.
172 Specify the maximum amount of memory
174 will allocate during a dedup pass.
175 Specify a suffix 'm', 'g', or 't' for megabytes, gigabytes, or terabytes.
will allocate up to 1G of RAM to hold CRC/SHA tables while running dedup.
179 When the limit is reached the dedup code restricts the range of CRCs to
180 keep memory use within bounds and runs multiple passes as necessary until
181 the entire filesystem has been deduped.
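The range-restricted, multi-pass behaviour described above can be sketched roughly as follows.
This is a simplified model and not the actual dedup code: the limit is expressed as a number of distinct CRC entries rather than bytes of memory, the range is simply halved when it does not fit, and
.Fn dedup_passes
is a hypothetical name.

```python
import zlib

CRC_MAX = 0xFFFFFFFF  # largest 32-bit CRC value


def dedup_passes(blocks, max_entries):
    """Find duplicate blocks while holding at most max_entries CRCs per pass.

    blocks is a list of bytes objects.  Returns (pairs, passes) where pairs
    is a sorted list of (kept_index, duplicate_index) tuples.
    """
    if max_entries < 1:
        raise ValueError("max_entries must be at least 1")
    crcs = [zlib.crc32(b) for b in blocks]
    pairs = []
    passes = 0
    lo = 0
    while lo <= CRC_MAX:
        hi = CRC_MAX
        while True:
            # Collect only the blocks whose CRC falls inside [lo, hi].
            table = {}
            for idx, crc in enumerate(crcs):
                if lo <= crc <= hi:
                    table.setdefault(crc, []).append(idx)
            if len(table) <= max_entries:
                break
            # Too many CRCs for the limit: shrink the range and retry.
            hi = lo + (hi - lo) // 2
        passes += 1
        for indexes in table.values():
            keepers = []
            for idx in indexes:
                for kept in keepers:
                    # A matching CRC is only a hint; compare the contents.
                    if blocks[kept] == blocks[idx]:
                        pairs.append((kept, idx))
                        break
                else:
                    keepers.append(idx)
        lo = hi + 1  # continue with the remainder of the CRC space
    return sorted(pairs), passes
```

A smaller limit yields the same set of duplicates but needs more passes over the data, which is the trade-off the option controls.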
188 specification for the source and/or destination.
Decrease verbosity.
191 May be specified multiple times.
193 Specify recursion for those commands which support it.
194 .It Fl S Ar splitsize
195 Specify the bulk splitup size in bytes for mirroring streams.
200 will do an initial run-through of the data to calculate good
201 transaction ids to cut up the bulk transfers, creating
202 restart points in case the stream is interrupted.
203 If we don't do this and the stream is interrupted it might
204 have to start all over again.
209 At the moment the run-through is disk-bandwidth-heavy but some
210 future version will limit the run-through to just the B-Tree
211 records and not the record data.
213 The splitsize may be suffixed with
217 to specify values in kilobytes, megabytes, or gigabytes.
218 If no suffix is specified, bytes is assumed.
220 When mirroring very large filesystems the minimum recommended
222 A small split size may wind up generating a great deal of overhead
223 but very little actual incremental data and is not recommended.
225 Specify timeout in seconds.
226 When pruning, rebalancing, reblocking or mirror-reading
227 you can tell the utility to stop after a certain period of time.
228 A value of 0 means unlimited.
229 This option is used along with the
231 option to prune, rebalance or reblock incrementally.
Increase verbosity.
234 May be specified multiple times.
236 Enable compression for any remote ssh specifications.
237 This option is typically used with the mirroring directives.
241 for interactive questions.
244 The commands are as follows:
245 .Bl -tag -width indent
246 .\" ==== synctid ====
247 .It Cm synctid Ar filesystem Op Cm quick
248 Generate a guaranteed, formal 64-bit transaction id representing the
249 current state of the specified
252 The file system will be synced to the media.
256 keyword is specified the file system will be soft-synced, meaning that a
257 crash might still undo the state of the file system as of the transaction
258 id returned but any new modifications will occur after the returned
259 transaction id as expected.
261 This operation does not create a snapshot.
262 It is meant to be used
263 to track temporary fine-grained changes to a subset of files and
264 will only remain valid for
266 access purposes for the
268 period configured for the PFS.
269 If you desire a real snapshot then the
271 directive may be what you are looking for.
273 .It Cm bstats Op Ar interval
276 B-Tree statistics until interrupted.
279 seconds between each display.
280 The default interval is one second.
281 .\" ==== iostats ====
282 .It Cm iostats Op Ar interval
286 statistics until interrupted.
289 seconds between each display.
290 The default interval is one second.
291 .\" ==== history ====
292 .It Cm history Ns Oo Cm @ Ns Ar offset Ns Oo Cm \&, Ns Ar length Oc Oc Ar path ...
293 Show the modification history for inode and data of
is given, history is shown for the data block at the given offset;
otherwise history is shown for the inode.
304 data bytes at given offset are dumped for each version,
309 this directive shows object id and sync status,
310 and for each object version it shows transaction id and time stamp.
Files have to exist for this directive to be applicable;
to track inodes which have been deleted or renamed see
314 .\" ==== blockmap ====
316 Dump the blockmap for the file system.
blockmap is a two-layer
320 blockmap representing the maximum possible file system size of 1 Exabyte.
321 Needless to say the second layer is only present for blocks which exist.
323 blockmap represents 8-Megabyte blocks, called big-blocks.
324 Each big-block has an append
325 point, a free byte count, and a typed zone id which allows content to be
326 reverse engineered to some degree.
330 allocations are essentially appended to a selected big-block using
331 the append offset and deducted from the free byte count.
332 When space is freed the free byte count is adjusted but
334 does not track holes in big-blocks for reallocation.
335 A big-block must be completely freed, either
336 through normal file system operations or through reblocking, before
339 Data blocks can be shared by deducting the space used from the free byte
count for each shared reference.
341 This means the free byte count can legally go negative.
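The accounting rules above can be modelled with a small sketch.
It is a toy model for illustration only: the class and method names are hypothetical and the real on-disk structures carry more state (for example the zone id).

```python
BIG_BLOCK_SIZE = 8 * 1024 * 1024  # 8-Megabyte big-block


class BigBlock:
    """Toy model of one big-block: an append offset and a free byte count."""

    def __init__(self):
        self.append_off = 0
        self.bytes_free = BIG_BLOCK_SIZE

    def allocate(self, nbytes):
        """Append an allocation and deduct it from the free byte count."""
        if self.append_off + nbytes > BIG_BLOCK_SIZE:
            raise ValueError("allocation does not fit in this big-block")
        offset = self.append_off
        self.append_off += nbytes
        self.bytes_free -= nbytes
        return offset

    def free(self, nbytes):
        """Freeing only adjusts the count; the hole itself is not tracked."""
        self.bytes_free += nbytes

    def add_shared_reference(self, nbytes):
        """Each extra reference to shared data deducts the space again."""
        self.bytes_free -= nbytes

    def reusable(self):
        """A big-block can be reused only once it is completely free."""
        return self.bytes_free == BIG_BLOCK_SIZE
```

In this model a completely filled big-block that gains one shared reference ends up with a negative free byte count, and it becomes reusable only after every reference has been freed.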
343 This command needs the
346 .\" ==== checkmap ====
348 Check the blockmap allocation count.
350 will scan the B-Tree, collect allocation information, and
351 construct a blockmap in-memory.
352 It will then check that blockmap against the on-disk blockmap.
354 This command needs the
358 .It Cm show Op Ar localization Ns Op Cm \&: Ns Ar object_id
360 By default this command will validate all B-Tree
361 linkages and CRCs, including data CRCs, and will report the most verbose
362 information it can dig up.
363 Any errors will show up with a
365 in column 1 along with various
371 .Ar localization Ns Cm \&: Ns Ar object_id
373 search for the key printing nodes as it recurses down, and then
374 will iterate forwards.
375 These fields are specified in HEX.
376 Note that the pfsid is the top 16 bits of the 32-bit localization
377 field so PFS #1 would be 00010000.
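The relationship between a PFS id and the localization field can be checked with a short sketch.
The function names are hypothetical; only the bit layout (PFS id in the top 16 bits of a 32-bit field, printed in hex) comes from the text above.

```python
def pfs_localization(pfs_id: int) -> str:
    """Return the 32-bit localization value for a PFS id as 8 hex digits."""
    if not 0 <= pfs_id <= 0xFFFF:
        raise ValueError("PFS id must fit in 16 bits")
    # The PFS id occupies the top 16 bits of the localization field.
    return format(pfs_id << 16, "08x")


def pfs_id_of(localization_hex: str) -> int:
    """Extract the PFS id from a hex localization value."""
    return int(localization_hex, 16) >> 16
```

With this layout PFS #1 maps to 00010000, matching the example given above.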
381 the command will report less information about the inode contents.
385 the command will not report the content of the inode or other typed
390 the command will not report volume header information, big-block fill
391 ratios, mirror transaction ids, or report or check data CRCs.
392 B-Tree CRCs and linkages are still checked.
394 This command needs the
397 .\" ==== show-undo ====
401 Dump the UNDO/REDO map.
403 This command needs the
407 .\" Dump the B-Tree, record, large-data, and small-data blockmaps, showing
408 .\" physical block assignments and free space percentages.
409 .\" ==== ssh-remote ====
410 .It Cm ssh-remote Ar command Ar targetdir
Used in an ssh authorized_keys line such as
412 command="/sbin/hammer ssh-remote mirror-read /fubarmount" ... to allow
413 mirror-read or mirror-write access to a particular subdirectory tree.
414 This way you do not have to give shell access to the remote box.
416 will obtain the original command line from the SSH_ORIGINAL_COMMAND
417 environment variable, validate it against the restriction, and then
418 re-exec hammer with the validated arguments.
420 The remote hammer command does not allow the
424 options to be passed in.
425 .\" ==== recover ====
426 .It Cm recover Ar targetdir
427 Recover data from a corrupted
430 This is a low level command which operates on the filesystem image and
431 attempts to locate and recover files from a corrupted filesystem.
432 The entire image is scanned linearly looking for B-Tree nodes.
434 found which passes its CRC test is scanned for file, inode, and directory
435 fragments and the target directory is populated with the resulting data.
Files and directories in the target directory are initially named after
437 the object id and are renamed as fragmentary information is processed.
439 This command keeps track of filename/object_id translations and may eat a
considerable amount of memory while operating.
442 This command is literally the last line of defense when it comes to
443 recovering data from a dead filesystem.
445 This command needs the
448 .\" ==== namekey1 ====
449 .It Cm namekey1 Ar filename
452 64-bit directory hash for the specified file name, using
453 the original directory hash algorithm in version 1 of the file system.
454 The low 32 bits are used as an iterator for hash collisions and will be
456 .\" ==== namekey2 ====
457 .It Cm namekey2 Ar filename
460 64-bit directory hash for the specified file name, using
461 the new directory hash algorithm in version 2 of the file system.
462 The low 32 bits are still used as an iterator but will start out containing
463 part of the hash key.
464 .\" ==== namekey32 ====
465 .It Cm namekey32 Ar filename
466 Generate the top 32 bits of a
64-bit directory hash for the specified file name.
470 .It Cm info Ar dirpath ...
471 Show extended information about all
473 file systems mounted in the system or the one mounted in
475 when this argument is specified.
477 The information is divided into sections:
478 .Bl -tag -width indent
479 .It Volume identification
480 General information, like the label of the
482 filesystem, the number of volumes it contains, the FSID, and the
485 .It Big block information
486 Big block statistics, such as total, used, reserved and free big blocks.
487 .It Space information
488 Information about space used on the filesystem.
489 Currently total size, used, reserved and free space are displayed.
491 Basic information about the PFSs currently present on a
496 is the ID of the PFS, with 0 being the root PFS.
498 is the current snapshot count on the PFS.
displays the mount point the PFS is currently mounted on (if any).
502 .\" ==== cleanup ====
503 .It Cm cleanup Op Ar filesystem ...
504 This is a meta-command which executes snapshot, prune, rebalance, dedup
505 and reblock commands on the specified
510 is specified this command will clean-up all
512 file systems in use, including PFS's.
513 To do this it will scan all
517 mounts, extract PFS id's, and clean-up each PFS found.
519 This command will access a snapshots
520 directory and a configuration file for each
522 creating them if necessary.
523 .Bl -tag -width indent
524 .It Nm HAMMER No version 2-
525 The configuration file is
527 in the snapshots directory which defaults to
528 .Pa <pfs>/snapshots .
529 .It Nm HAMMER No version 3+
530 The configuration file is saved in file system meta-data, see
533 The snapshots directory defaults to
534 .Pa /var/hammer/<pfs>
535 .Pa ( /var/hammer/root
539 The format of the configuration file is:
540 .Bd -literal -offset indent
541 snapshots <period> <retention-time> [any]
542 prune <period> <max-runtime>
543 rebalance <period> <max-runtime>
544 dedup <period> <max-runtime>
545 reblock <period> <max-runtime>
546 recopy <period> <max-runtime>
550 .Bd -literal -offset indent
551 snapshots 1d 60d # 0d 0d for PFS /tmp, /var/tmp, /usr/obj
559 Time is given with a suffix of
565 meaning day, hour, minute and second.
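As a rough illustration of how such time values translate to seconds, consider the sketch below.
It assumes the suffix letters are d, h, m and s (consistent with the
.Ql 1d
and
.Ql 60d
values in the example configuration) and it rejects unsuffixed values, which is a choice made for this example rather than documented behaviour;
.Fn parse_period
is a hypothetical name.

```python
# Assumed suffix letters: d (day), h (hour), m (minute), s (second).
_SECONDS_PER_UNIT = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_period(text: str) -> int:
    """Convert a value such as '1d' or '5m' to a number of seconds."""
    text = text.strip()
    if len(text) < 2 or text[-1] not in _SECONDS_PER_UNIT:
        raise ValueError(f"expected <number><d|h|m|s>, got {text!r}")
    return int(text[:-1]) * _SECONDS_PER_UNIT[text[-1]]
```

Under these assumptions a period of 1d is 86400 seconds and a retention time of 60d is 5184000 seconds.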
569 directive has a period of 0 and a retention time of 0
then snapshot generation is disabled, removal of old snapshots is
571 disabled, and prunes will use
572 .Cm prune-everything .
576 directive has a period of 0 but a non-zero retention time
577 then this command will not create any new snapshots but will remove old
578 snapshots it finds based on the retention time.
580 used on PFS masters where you are generating your own snapshot softlinks
581 manually and on PFS slaves when all you wish to do is prune away existing
582 snapshots inherited via the mirroring stream.
584 By default only snapshots in the form
585 .Ql snap- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
589 directive is specified as a third argument on the
591 config line then any softlink of the form
592 .Ql *- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
594 .Ql *. Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
597 A period of 0 for prune, rebalance, dedup, reblock or recopy disables the directive.
598 A max-runtime of 0 means unlimited.
If the period hasn't passed since the previous
603 For example a day has passed when midnight is passed (localtime).
606 flag is given the period is ignored.
614 The default configuration file will create a daily snapshot, do a daily
615 pruning, rebalancing, deduping and reblocking run and a monthly recopy run.
616 Reblocking is defragmentation with a level of 95%,
617 and recopy is full defragmentation.
619 By default prune, dedup and rebalance operations are time limited to 5 minutes,
620 and reblock operations to a bit over 5 minutes,
621 and recopy operations to a bit over 10 minutes.
622 Reblocking and recopy runs are each broken down into four separate functions:
623 btree, inodes, dirs and data.
624 Each function is time limited to the time given in the configuration file,
but the btree, inodes and dirs functions usually do not take very long,
and full defragmentation is always used for these three functions.
627 Also note that this directive will by default disable snapshots on
634 The defaults may be adjusted by modifying the configuration file.
635 The pruning and reblocking commands automatically maintain a cyclefile
636 for incremental operation.
637 If you interrupt (^C) the program the cyclefile will be updated,
639 may continue to run in the background for a few seconds until the
641 ioctl detects the interrupt.
644 PFS option can be set to use another location for the snapshots directory.
646 Work on this command is still in progress.
648 An ability to remove snapshots dynamically as the
649 file system becomes full.
651 .It Cm config Op Ar filesystem Op Ar configfile
654 Show or change configuration for
656 If zero or one arguments are specified this function dumps the current
657 configuration file to stdout.
658 Zero arguments specifies the PFS containing the current directory.
659 This configuration file is stored in file system meta-data.
660 If two arguments are specified this function installs a new config file.
664 versions less than 3 the configuration file is by default stored in
665 .Pa <pfs>/snapshots/config ,
666 but in all later versions the configuration file is stored in file system
668 .\" ==== viconfig ====
669 .It Cm viconfig Op Ar filesystem
672 Edit the configuration file and reinstall into file system meta-data when done.
673 Zero arguments specifies the PFS containing the current directory.
674 .\" ==== volume-add ====
675 .It Cm volume-add Ar device Ar filesystem
682 and add all of its space to
686 file system can use up to 256 volumes.
689 All existing data contained on
691 will be destroyed by this operation!
696 file system, formatting will be denied.
697 You can overcome this sanity check by using
699 to erase the beginning sectors of the device.
701 Remember that you have to specify
together with any other devices that make up the file system,
is the root file system, also remember to add
713 .Va vfs.root.mountfrom
715 .Pa /boot/loader.conf ,
718 .\" ==== volume-del ====
719 .It Cm volume-del Ar device Ar filesystem
725 Remember that you have to remove
727 from the colon-separated list in
is the root file system, also remember to remove
736 .Va vfs.root.mountfrom
738 .Pa /boot/loader.conf ,
741 .\" ==== volume-list ====
742 .It Cm volume-list Ar filesystem
743 List the volumes that make up
745 .\" ==== snapshot ====
746 .It Cm snapshot Oo Ar filesystem Oc Ar snapshot-dir
747 .It Cm snapshot Ar filesystem Ar snapshot-dir Op Ar note
748 Take a snapshot of the file system either explicitly given by
750 or implicitly derived from the
argument and create a symlink in the directory provided by
754 pointing to the snapshot.
757 is not a directory, it is assumed to be a format string passed to
759 with the current time as parameter.
762 refers to an existing directory, a default format string of
764 is assumed and used as name for the newly created symlink.
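The choice between a format string and a directory can be sketched as follows.
The default format
.Ql snap-%Y%m%d-%H%M
used here is an assumption inferred from the example symlink names in this manual page, and
.Fn snapshot_link_path
is a hypothetical helper that only computes the symlink name; it does not take a snapshot.

```python
import os
import posixpath
import time

# Assumed default, inferred from names such as snap-20080627-1210.
DEFAULT_FORMAT = "snap-%Y%m%d-%H%M"


def snapshot_link_path(snapshot_dir, when, is_dir=os.path.isdir):
    """Return the symlink path for a snapshot taken at struct_time 'when'."""
    if is_dir(snapshot_dir):
        # Existing directory: place a default-named link inside it.
        name = time.strftime(DEFAULT_FORMAT, when)
        return posixpath.join(snapshot_dir, name)
    # Not a directory: treat the whole argument as a format string.
    return time.strftime(snapshot_dir, when)
```

The
.Fa is_dir
parameter exists only so the sketch can be exercised without touching the file system.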
766 Snapshot is a per PFS operation, so each PFS in a
file system has to be snapshotted separately.
770 Example, assuming that
778 are file systems on their own, the following invocations:
779 .Bd -literal -offset indent
780 hammer snapshot /mysnapshots
782 hammer snapshot /mysnapshots/%Y-%m-%d
784 hammer snapshot /obj /mysnapshots/obj-%Y-%m-%d
786 hammer snapshot /usr /my/snaps/usr "note"
789 Would create symlinks similar to:
790 .Bd -literal -offset indent
791 /mysnapshots/snap-20080627-1210 -> /@@0x10d2cd05b7270d16
793 /mysnapshots/2008-06-27 -> /@@0x10d2cd05b7270d16
795 /mysnapshots/obj-2008-06-27 -> /obj@@0x10d2cd05b7270d16
797 /my/snaps/usr/snap-20080627-1210 -> /usr@@0x10d2cd05b7270d16
802 version 3+ file system the snapshot is also recorded in file system meta-data
803 along with the optional
809 .It Cm snap Ar path Op Ar note
812 Create a snapshot for the PFS containing
814 and create a snapshot softlink.
815 If the path specified is a
816 directory a standard snapshot softlink will be created in the directory.
817 The snapshot softlink points to the base of the mounted PFS.
818 .It Cm snaplo Ar path Op Ar note
821 Create a snapshot for the PFS containing
823 and create a snapshot softlink.
824 If the path specified is a
825 directory a standard snapshot softlink will be created in the directory.
826 The snapshot softlink points into the directory it is contained in.
827 .It Cm snapq Ar dir Op Ar note
830 Create a snapshot for the PFS containing the specified directory but do
831 not create a softlink.
832 Instead output a path which can be used to access
833 the directory via the snapshot.
835 An absolute or relative path may be specified.
836 The path will be used as-is as a prefix in the path output to stdout.
838 snap and snapshot directives the snapshot transaction id will be registered
839 in the file system meta-data.
840 .It Cm snaprm Ar path Ar ...
841 .It Cm snaprm Ar transaction_id Ar ...
842 .It Cm snaprm Ar filesystem Ar transaction_id Ar ...
845 Remove a snapshot given its softlink or transaction id.
846 If specifying a transaction id
847 the snapshot is removed from file system meta-data but you are responsible
848 for removing any related softlinks.
850 If a softlink path is specified the filesystem and transaction id
are derived from the contents of the softlink.
852 If just a transaction id is specified it is assumed to be a snapshot in the
854 filesystem you are currently chdir'd into.
855 You can also specify the filesystem and transaction id explicitly.
856 .It Cm snapls Op Ar path ...
859 Dump the snapshot meta-data for PFSs containing each
861 listing all available snapshots and their notes.
862 If no arguments are specified snapshots for the PFS containing the
863 current directory are listed.
864 This is the definitive list of snapshots for the file system.
866 .It Cm prune Ar softlink-dir
867 Prune the file system based on previously created snapshot softlinks.
868 Pruning is the act of deleting file system history.
871 command will delete file system history such that
872 the file system state is retained for the given snapshots,
873 and all history after the latest snapshot.
874 By setting the per PFS parameter
history is guaranteed to be saved for at least this time interval.
877 All other history is deleted.
879 The target directory is expected to contain softlinks pointing to
880 snapshots of the file systems you wish to retain.
881 The directory is scanned non-recursively and the mount points and
882 transaction ids stored in the softlinks are extracted and sorted.
883 The file system is then explicitly pruned according to what is found.
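The extraction and sorting step can be illustrated with a short sketch that works on softlink targets given as strings.
It assumes targets end in
.Ql @@0x
followed by 16 hex digits, as in the examples in this manual page; the function name is hypothetical and nothing here performs any pruning.

```python
import re

# Assumed target shape: <mount point>@@0x<16 hex digits>.
_SNAPSHOT_RE = re.compile(r"^(?P<mount>.*)@@(?P<tid>0x[0-9a-fA-F]{16})$")


def parse_snapshot_targets(targets):
    """Return (mount, transaction_id) pairs sorted by transaction id.

    Targets that do not carry a snapshot id extension are skipped.
    """
    found = []
    for target in targets:
        match = _SNAPSHOT_RE.match(target)
        if match is None:
            continue  # not a snapshot softlink, ignore it
        found.append((match.group("mount"), int(match.group("tid"), 16)))
    return sorted(found, key=lambda item: item[1])
```

An empty result corresponds to the case where no snapshot softlinks were found in the directory.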
884 Cleaning out portions of the file system is as simple as removing a
885 snapshot softlink and then running the
889 As a safety measure pruning only occurs if one or more softlinks are found
892 snapshot id extension.
893 Currently the scanned softlink directory must contain softlinks pointing
897 The softlinks may specify absolute or relative paths.
898 Softlinks must use 20-character
900 transaction ids, as might be returned from
901 .Nm Cm synctid Ar filesystem .
903 Pruning is a per PFS operation, so each PFS in a
file system has to be pruned separately.
907 Note that pruning a file system may not immediately free-up space,
908 though typically some space will be freed if a large number of records are
910 The file system must be reblocked to completely recover all available space.
For example, let's say that you didn't set
and the snapshot directory contains the following links:
915 .Bd -literal -offset indent
916 lrwxr-xr-x 1 root wheel 29 May 31 17:57 snap1 ->
917 /usr/obj/@@0x10d2cd05b7270d16
919 lrwxr-xr-x 1 root wheel 29 May 31 17:58 snap2 ->
920 /usr/obj/@@0x10d2cd13f3fde98f
922 lrwxr-xr-x 1 root wheel 29 May 31 17:59 snap3 ->
923 /usr/obj/@@0x10d2cd222adee364
926 If you were to run the
928 command on this directory, then the
931 mount will be pruned to retain the above three snapshots.
932 In addition, history for modifications made to the file system older than
933 the oldest snapshot will be destroyed and history for potentially fine-grained
934 modifications made to the file system more recently than the most recent
935 snapshot will be retained.
937 If you then delete the
939 softlink and rerun the
942 history for modifications pertaining to that snapshot would be destroyed.
946 file system versions 3+ this command also scans the snapshots stored
947 in the file system meta-data and includes them in the prune.
948 .\" ==== prune-everything ====
949 .It Cm prune-everything Ar filesystem
950 Remove all historical records from
952 Use this directive with caution on PFSs where you intend to use history.
954 This command does not remove snapshot softlinks but will delete all
955 snapshots recorded in file system meta-data (for file system version 3+).
956 The user is responsible for deleting any softlinks.
958 Pruning is a per PFS operation, so each PFS in a
file system has to be pruned separately.
961 .\" ==== rebalance ====
962 .It Cm rebalance Ar filesystem Op Ar saturation_percentage
Rebalance the B-Tree; nodes with a small number of
elements will be combined and element counts will be smoothed out
967 The saturation percentage is between 50% and 100%.
968 The default is 85% (the
970 suffix is not needed).
972 Rebalancing is a per PFS operation, so each PFS in a
file system has to be rebalanced separately.
976 .It Cm dedup Ar filesystem
979 Perform offline (post-process) deduplication.
Deduplication occurs at
the block level; currently only data blocks of the same size can be
deduped, metadata blocks cannot.
The hash function used for comparing
data blocks is CRC-32 (CRCs are computed anyway as part of
986 data integrity features, so there's no additional overhead).
987 Since CRC is a weak hash function a byte-by-byte comparison is done
988 before actual deduping.
989 In case of a CRC collision (two data blocks have the same CRC
990 but different contents) the checksum is upgraded to SHA-256.
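The comparison order described above (cheap checksum first, then a byte-by-byte check, then a stronger hash on collision) can be sketched as follows.
This is an illustration, not the dedup implementation: the function names are hypothetical and the
.Fa checksum
parameter exists only so a collision can be simulated.

```python
import hashlib
import zlib


def compare_blocks(a, b, checksum=zlib.crc32):
    """Classify two data blocks as 'distinct', 'duplicate' or 'collision'."""
    if len(a) != len(b) or checksum(a) != checksum(b):
        return "distinct"  # different size or checksum: cannot be deduped
    if a == b:
        return "duplicate"  # byte-by-byte comparison confirms the match
    return "collision"  # same checksum but different contents


def stronger_key(block):
    """Key to use for a block once its weak checksum has collided."""
    return hashlib.sha256(block).hexdigest()
```

Only blocks classified as duplicates are candidates for sharing; blocks that collide are told apart by their SHA-256 keys.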
994 reblocker may partially blow up (re-expand) dedup (reblocker's normal
995 operation is to reallocate every record, so it's possible for deduped
996 blocks to be re-expanded back).
998 Deduplication is a per PFS operation, so each PFS in a
file system has to be deduped separately.
means that if you have duplicated data in two different PFSs that data
won't be deduped; however, the addition of such a feature is planned.
1007 option should be used to limit memory use during the dedup run if the
1008 default 1G limit is too much for the machine.
1009 .\" ==== dedup-simulate ====
1010 .It Cm dedup-simulate Ar filesystem
1011 Shows potential space savings (simulated dedup ratio) one can get after
1015 If the estimated dedup ratio is greater than 1.00 you will see
1016 dedup space savings.
Remember that this is an estimated number; in
practice the real dedup ratio will be slightly smaller because of
1020 bigblock underflows, B-Tree locking issues and other factors.
1022 Note that deduplication currently works only on bulk data so if you
1027 commands on a PFS that contains metadata only (directory entries,
1028 softlinks) you will get a 0.00 dedup ratio.
1032 option should be used to limit memory use during the dedup run if the
1033 default 1G limit is too much for the machine.
1034 .\" ==== reblock* ====
1035 .It Cm reblock Ar filesystem Op Ar fill_percentage
1036 .It Cm reblock-btree Ar filesystem Op Ar fill_percentage
1037 .It Cm reblock-inodes Ar filesystem Op Ar fill_percentage
1038 .It Cm reblock-dirs Ar filesystem Op Ar fill_percentage
1039 .It Cm reblock-data Ar filesystem Op Ar fill_percentage
1040 Attempt to defragment and free space for reuse by reblocking a live
1043 Big-blocks cannot be reused by
1045 until they are completely free.
1046 This command also has the effect of reordering all elements, effectively
1047 defragmenting the file system.
1049 The default fill percentage is 100% and will cause the file system to be
1050 completely defragmented.
1051 All specified element types will be reallocated and rewritten.
1052 If you wish to quickly free up space instead try specifying
1053 a smaller fill percentage, such as 90% or 80% (the
1055 suffix is not needed).
1057 Since this command may rewrite the entire contents of the disk it is
1058 best to do it incrementally from a
1064 options to limit the run time.
The file system would thus be defragmented over a long period of time.
1067 It is recommended that separate invocations be used for each data type.
1068 B-Tree nodes, inodes, and directories are typically the most important
1069 elements needing defragmentation.
1070 Data can be defragmented over a longer period of time.
1072 Reblocking is a per PFS operation, so each PFS in a
file system has to be reblocked separately.
1075 .\" ==== pfs-status ====
1076 .It Cm pfs-status Ar dirpath ...
1077 Retrieve the mirroring configuration parameters for the specified
1079 file systems or pseudo-filesystems (PFS's).
1080 .\" ==== pfs-master ====
1081 .It Cm pfs-master Ar dirpath Op Ar options
1082 Create a pseudo-filesystem (PFS) inside a
1085 Up to 65536 PFSs can be created.
1086 Each PFS uses an independent inode numbering space making it suitable
1091 directive creates a PFS that you can read, write, and use as a mirroring
1094 A PFS can only be truly destroyed with the
1097 Removing the softlink will not destroy the underlying PFS.
1099 A PFS can only be created in the root PFS (PFS# 0),
1100 not in a PFS created by
1106 It is recommended that
1112 directory at root of
1116 It is recommended to use a
mount to access a PFS, except for the root PFS; for more information see
1120 .\" ==== pfs-slave ====
1121 .It Cm pfs-slave Ar dirpath Op Ar options
1122 Create a pseudo-filesystem (PFS) inside a
1125 Up to 65536 PFSs can be created.
1126 Each PFS uses an independent inode numbering space making it suitable
1131 directive creates a PFS that you can use as a mirroring source or target.
1132 You will not be able to access a slave PFS until you have completed the
1133 first mirroring operation with it as the target (its root directory will
1134 not exist until then).
1136 Access to the pfs-slave via the special softlink, as described in the
1137 .Sx PSEUDO-FILESYSTEM (PFS) NOTES
1141 dynamically modify the snapshot transaction id by returning a dynamic result
1146 A PFS can only be truly destroyed with the
1149 Removing the softlink will not destroy the underlying PFS.
1151 A PFS can only be created in the root PFS (PFS# 0),
1152 not in a PFS created by
1158 It is recommended that
1164 directory at root of
1168 It is recommended to use a
1170 mount to access a PFS other than the root PFS; for more information see
1172 .\" ==== pfs-update ====
1173 .It Cm pfs-update Ar dirpath Op Ar options
1174 Update the configuration parameters for an existing
1176 file system or pseudo-filesystem.
1177 Options that may be specified:
1178 .Bl -tag -width indent
1179 .It Cm sync-beg-tid= Ns Ar 0x16llx
1180 This is the automatic snapshot access starting transaction id for
1182 This parameter is normally updated automatically by the
1186 It is important to note that accessing a mirroring slave
1187 with a transaction id greater than the last fully synchronized transaction
1188 id can result in an unreliable snapshot since you will be accessing
1189 data that is still undergoing synchronization.
1191 Manually modifying this field is dangerous and can result in a broken mirror.
1192 .It Cm sync-end-tid= Ns Ar 0x16llx
1193 This is the current synchronization point for mirroring slaves.
1194 This parameter is normally updated automatically by the
1198 Manually modifying this field is dangerous and can result in a broken mirror.
1199 .It Cm shared-uuid= Ns Ar uuid
1200 Set the shared UUID for this file system.
1201 All mirrors must have the same shared UUID.
1202 For safety purposes the
1204 directives will refuse to operate on a target with a different shared UUID.
1206 Changing the shared UUID on an existing, non-empty mirroring target,
1207 including an empty but not completely pruned target,
1208 can lead to corruption of the mirroring target.
1209 .It Cm unique-uuid= Ns Ar uuid
1210 Set the unique UUID for this file system.
1211 This UUID should not be used anywhere else,
1212 even on exact copies of the file system.
1213 .It Cm label= Ns Ar string
1214 Set a descriptive label for this file system.
1215 .It Cm snapshots= Ns Ar string
1216 Specify the snapshots directory which
1219 will use to manage this PFS.
1220 .Bl -tag -width indent
1221 .It Nm HAMMER No version 2-
1222 The snapshots directory does not need to be configured for
1223 PFS masters and will default to
1224 .Pa <pfs>/snapshots .
1226 PFS slaves are mirroring slaves so you cannot configure a snapshots
1227 directory on the slave itself to be managed by the slave's machine.
1228 In fact, the slave will likely have a
1230 sub-directory mirrored
1231 from the master, but that directory contains the configuration the master
1232 is using for its copy of the file system, not the configuration that we
1233 want to use for our slave.
1235 It is recommended that
1236 .Pa <fs>/var/slaves/<name>
1237 be configured for a PFS slave, where
1243 is an appropriate label.
1244 .It Nm HAMMER No version 3+
1245 The snapshots directory does not need to be configured for PFS masters or
1247 The snapshots directory defaults to
1248 .Pa /var/hammer/<pfs>
1249 .Pa ( /var/hammer/root
1253 You can control snapshot retention on your slave independent of the master.
1254 .It Cm snapshots-clear
1257 directory path for this PFS.
1258 .It Cm prune-min= Ns Ar N Ns Cm d
1259 .It Cm prune-min= Ns Oo Ar N Ns Cm d/ Oc Ns \
1260 Ar hh Ns Op Cm \&: Ns Ar mm Ns Op Cm \&: Ns Ar ss
1261 Set the minimum fine-grained data retention period.
1263 always retains fine-grained history up to the most recent snapshot.
1264 You can extend the retention period further by specifying a non-zero
1266 Any snapshot softlinks within the retention period are ignored
1267 for the purposes of pruning (i.e.\& the fine-grained history is retained).
1268 Days, hours, minutes and seconds are given as
1273 Because the transaction id in the snapshot softlink cannot be used
1274 to calculate a timestamp,
1276 uses the earlier of the
1280 field of the softlink to
1281 determine which snapshots fall within the retention period.
1282 Users must be sure to retain one of these two fields when manipulating
1285 .\" ==== pfs-upgrade ====
1286 .It Cm pfs-upgrade Ar dirpath
1287 Upgrade a PFS from slave to master operation.
1288 The PFS will be rolled back to the current end synchronization transaction id
1289 (removing any partial synchronizations), and will then become writable.
1293 currently supports only single masters and using
1294 this command can easily result in file system corruption
1295 if you don't know what you are doing.
1297 This directive will refuse to run if any programs have open descriptors
1298 in the PFS, including programs chdir'd into the PFS.
1299 .\" ==== pfs-downgrade ====
1300 .It Cm pfs-downgrade Ar dirpath
1301 Downgrade a PFS from master to slave operation.
1302 The PFS becomes read-only and access will be locked to its
1305 This directive will refuse to run if any programs have open descriptors
1306 in the PFS, including programs chdir'd into the PFS.
1307 .\" ==== pfs-destroy ====
1308 .It Cm pfs-destroy Ar dirpath
1309 This permanently destroys a PFS.
1311 This directive will refuse to run if any programs have open descriptors
1312 in the PFS, including programs chdir'd into the PFS.
1313 As a safety measure the
1315 flag has no effect on this directive.
1316 .\" ==== mirror-read ====
1317 .It Cm mirror-read Ar filesystem Op Ar begin-tid
1318 Generate a mirroring stream to stdout.
1319 The stream ends when the transaction id space has been exhausted.
1321 may be a master or slave PFS.
1322 .\" ==== mirror-read-stream ====
1323 .It Cm mirror-read-stream Ar filesystem Op Ar begin-tid
1324 Generate a mirroring stream to stdout.
1325 Upon completion the stream is paused until new data is synced to the
1328 Operation continues until the pipe is broken.
1331 command for more details.
1332 .\" ==== mirror-write ====
1333 .It Cm mirror-write Ar filesystem
1334 Take a mirroring stream on stdin.
1336 must be a slave PFS.
1338 This command will fail if the
1340 configuration fields of the two file systems do not match.
1343 command for more details.
1345 If the target PFS does not exist this command will ask you whether
1346 you want to create a compatible PFS slave for the target or not.
1347 .\" ==== mirror-dump ====
1348 .It Cm mirror-dump Op Cm header
1353 to dump an ASCII representation of the mirroring stream.
1356 is specified, only the header information is shown.
1357 .\" ==== mirror-copy ====
1358 .\".It Cm mirror-copy Ar [[user@]host:]filesystem [[user@]host:]filesystem
1359 .It Cm mirror-copy \
1360 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1361 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1362 This is a shortcut which pipes a
1367 If a remote host specification is made the program forks a
1369 (or other program as specified by the
1371 environment variable) and execs the
1375 on the appropriate host.
1376 The source may be a master or slave PFS, and the target must be a slave PFS.
1378 This command also establishes full duplex communication and turns on
1379 the 2-way protocol feature
1381 which automatically negotiates transaction id
1382 ranges without having to use a cyclefile.
1383 If the operation completes successfully the target PFS's
1386 Note that you must re-chdir into the target PFS to see the updated information.
1387 If you do not, you will still be in the previous snapshot.
1389 If the target PFS does not exist this command will ask you whether
1390 you want to create a compatible PFS slave for the target or not.
1391 .\" ==== mirror-stream ====
1392 .\".It Cm mirror-stream Ar [[user@]host:]filesystem [[user@]host:]filesystem
1393 .It Cm mirror-stream \
1394 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1395 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1396 This is a shortcut which pipes a
1397 .Cm mirror-read-stream
1401 This command works similarly to
1403 but does not exit after the initial mirroring completes.
1404 The mirroring operation will resume as changes continue to be made to the
1406 The command is commonly used with
1410 options to keep the mirroring target in sync with the source on a continuing
1413 If the pipe is broken the command will automatically retry after sleeping
1415 The time slept will be 15 seconds plus the time given in the
1419 This command also detects the initial-mirroring case and spends some
1420 time scanning the B-Tree to find good break points, allowing the initial
1421 bulk mirroring operation to be broken down into 4GB pieces.
1422 This means that the user can kill and restart the operation and it will
1423 not have to start from scratch once it has gotten past the first chunk.
1426 option may be used to change the size of pieces and the
1428 option may be used to disable this feature and perform an initial bulk
1430 .\" ==== version ====
1431 .It Cm version Ar filesystem
1432 This command returns the
1434 file system version for the specified
1436 as well as the range of versions supported in the kernel.
1439 option may be used to remove the summary at the end.
1440 .\" ==== version-upgrade ====
1441 .It Cm version-upgrade Ar filesystem Ar version Op Cm force
1447 Once upgraded a file system may not be downgraded.
1448 If you wish to upgrade a file system to a version greater than or equal to the
1449 work-in-progress (WIP) version number you must specify the
1452 Use of WIP versions should be relegated to testing and may require wiping
1453 the file system as development progresses, even though the WIP version might
1457 This command operates on the entire
1459 file system and is not a per PFS operation.
1460 All PFSs will be affected.
1461 .Bl -tag -width indent
1464 default version, first
1469 New directory entry layout.
1470 This version uses a new directory hash key.
1473 New snapshot management, using file system meta-data for saving the
1474 configuration file and snapshots (transaction ids etc.).
1475 The default snapshots directory has also changed.
1479 New undo/redo/flush, giving
1481 a much faster sync and fsync.
1484 Deduplication support.
1487 Directory hash ALG1.
1488 Tends to maintain inode number / directory name entry ordering better
1489 for files after minor renaming.
1492 .Sh PSEUDO-FILESYSTEM (PFS) NOTES
1493 The root of a PFS is not hooked into the primary
1495 file system as a directory.
1498 creates a special softlink called
1500 (exactly 10 characters long) in the primary
1504 then modifies the contents of the softlink as read by
1506 and thus what you see with an
1508 command or if you were to
1511 If the PFS is a master the link reflects the current state of the PFS.
1512 If the PFS is a slave the link reflects the last completed snapshot, and the
1513 contents of the link will change when the next snapshot is completed, and
1518 utility employs numerous safeties to reduce user foot-shooting.
1521 directive requires that the target be configured as a slave and that the
1523 field of the mirroring source and target match.
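.Pp
As a minimal sketch of the directives described above, assuming a
.Nm HAMMER
file system mounted on the hypothetical path
.Pa /home
with an existing
.Pa /home/pfs
directory, a master and a slave PFS could be set up and mirrored as follows.
The PFS names and the UUID placeholder are illustrative only;
the shared UUID to use is the one reported by
.Cm pfs-status
for the master.
.Bd -literal -offset indent
# create the master and look up its shared-uuid
hammer pfs-master /home/pfs/master
hammer pfs-status /home/pfs/master

# create a slave with the same shared-uuid, then mirror into it
hammer pfs-slave /home/pfs/slave shared-uuid=<uuid-of-master>
hammer mirror-copy /home/pfs/master /home/pfs/slave

# optionally keep two days of fine-grained history on the master
hammer pfs-update /home/pfs/master prune-min=2d
.Ed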
1524 .Sh DOUBLE_BUFFER MODE
1525 There is a limit to the number of vnodes the kernel can cache, and because
1526 file buffers are associated with a vnode the related data cache can get
1527 blown away when operating on large numbers of files even if the system has
1528 sufficient memory to hold the file data.
1532 double buffer mode by setting the
1535 .Va vfs.hammer.double_buffer
1538 will cache file data via the block device and copy it into the per-file
1539 buffers as needed. The data will be double-cached at least until the
1540 buffer cache throws away the file buffer.
1541 This mode is typically used in conjunction with
1544 .Va vm.swapcache.data_enable
1545 is turned on in order to prevent unnecessary re-caching of file data
1546 due to vnode recycling.
1547 The swapcache will save the cached VM pages related to
1550 device (which doesn't recycle unless you umount the filesystem) instead
1551 of the cached VM pages backing the file vnodes.
1553 .\"Double buffering should also be turned on if live dedup is enabled via
1554 .\"Va vfs.hammer.live_dedup .
1555 .\"This is because the live dedup must validate the contents of a potential
1556 .\"duplicate file block and it must run through the block device to do that
1557 .\"and not the file vnode.
1558 .\"If double buffering is not enabled then live dedup will create extra disk
1559 .\"reads to validate potential data duplicates.
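.Pp
As a sketch only, assuming that a value of 1 enables each feature and that
the commands are run with sufficient privileges, the two settings mentioned
above could be turned on at run time with
.Xr sysctl 8 :
.Bd -literal -offset indent
sysctl vfs.hammer.double_buffer=1
sysctl vm.swapcache.data_enable=1
.Ed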
1560 .Sh UPGRADE INSTRUCTIONS HAMMER V1 TO V2
1561 This upgrade changes the way directory entries are stored.
1562 It is possible to upgrade a V1 file system to V2 in place, but
1563 directories created prior to the upgrade will continue to use
1566 Note that the slave mirroring code in the target kernel had bugs in
1567 V1 which can create an incompatible root directory on the slave.
1570 master created after the upgrade with a
1572 slave created prior to the upgrade.
1574 Any directories created after upgrading will use a new layout.
1575 .Sh UPGRADE INSTRUCTIONS HAMMER V2 TO V3
1576 This upgrade adds meta-data elements to the B-Tree.
1577 It is possible to upgrade a V2 file system to V3 in place.
1578 After issuing the upgrade be sure to run a
1581 to perform post-upgrade tasks.
1583 After making this upgrade, running a
1588 directory for each PFS mount into
1589 .Pa /var/hammer/<pfs> .
1592 root mount will migrate
1595 .Pa /var/hammer/root .
1596 Migration occurs only once and only if you have not specified
1597 a snapshots directory in the PFS configuration.
1598 If you have specified a snapshots directory in the PFS configuration no
1599 automatic migration will occur.
1601 For slaves, if you desire, you can migrate your snapshots
1602 config to the new location manually and then clear the
1603 snapshot directory configuration in the slave PFS.
1604 The new snapshots hierarchy is designed to work with
1605 both master and slave PFSs equally well.
1607 In addition, the old config file will be moved to file system meta-data,
1608 editable via the new
1612 The old config file will be deleted.
1613 Migration occurs only once.
1615 The V3 file system has new
1617 directives for creating snapshots.
1618 All snapshot directives, including the original, will create
1619 meta-data entries for the snapshots and the pruning code will
1620 automatically incorporate these entries into its list and
1621 expire them the same way it expires softlinks.
1622 If you accidentally remove your snapshot softlinks you can use the
1624 directive to get a definitive list from the file system meta-data and
1625 regenerate them from that list.
1630 to back up file systems, your scripts may be using the
1632 directive to generate transaction ids.
1633 This directive does not create a snapshot.
1634 You will have to modify your scripts to use the
1636 directive to generate the linkbuf for the softlink you create, or
1637 use one of the other
1642 directive will continue to work as expected and in V3 it will also
1643 record the snapshot transaction id in file system meta-data.
1644 You may also want to make use of the new
1646 tag for the meta-data.
1649 If you used to remove snapshot softlinks with
1651 you should probably start using the
1653 directive instead to also remove the related meta-data.
1654 The pruning code scans the meta-data so just removing the
1655 softlink is not sufficient.
1656 .Sh UPGRADE INSTRUCTIONS HAMMER V3 TO V4
1657 This upgrade changes undo/flush, giving faster sync.
1658 It is possible to upgrade a V3 file system to V4 in place.
1659 This upgrade reformats the UNDO/REDO FIFO (typically 1GB),
1660 so the upgrade might take a minute or two.
1662 Version 4 allows the UNDO/REDO FIFO to be flushed without also having
1663 to flush the volume header, removing 2 of the 4 disk syncs typically
1666 and removing 1 of the 2 disk syncs typically
1667 required for a flush sequence.
1668 Version 4 also implements the REDO log (see
1669 .Sx FSYNC FLUSH MODES
1670 below) which is capable
1671 of fsync()ing with either one disk flush or zero disk flushes.
1672 .Sh UPGRADE INSTRUCTIONS HAMMER V4 TO V5
1673 This upgrade brings in deduplication support.
1674 It is possible to upgrade a V4 file system to V5 in place.
1675 Technically it makes the layer2
1677 field a signed value instead of unsigned, allowing it to go negative.
1678 A version 5 filesystem is required for dedup operation.
1679 .Sh UPGRADE INSTRUCTIONS HAMMER V5 TO V6
1680 It is possible to upgrade a V5 file system to V6 in place.
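.Pp
An in-place upgrade can be sketched as follows, assuming a
.Nm HAMMER
file system mounted on the hypothetical path
.Pa /home .
The first command reports the current version and the range supported by the
kernel, and the last one confirms the result.
Keep in mind that an upgraded file system may not be downgraded.
.Bd -literal -offset indent
hammer version /home
hammer version-upgrade /home 6
hammer version /home
.Ed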
1681 .Sh FSYNC FLUSH MODES
1683 implements five different fsync flush modes via the
1684 .Va vfs.hammer.fsync_mode
1687 version 4+ file systems.
1691 fsync mode 3 is set by default.
1692 REDO operation and recovery is enabled by default.
1693 .Bl -tag -width indent
1695 Full synchronous fsync semantics without REDO.
1698 will not generate REDOs.
1701 will completely sync
1702 the data and meta-data and double-flush the FIFO, including
1703 issuing two disk synchronization commands.
1704 The data is guaranteed
1705 to be on the media as of when
1708 Needless to say, this is slow.
1710 Relaxed asynchronous fsync semantics without REDO.
1712 This mode works the same as mode 0 except the last disk synchronization
1713 command is not issued.
1714 It is faster than mode 0 but not even remotely
1715 close to the speed you get with mode 2 or mode 3.
1717 Note that there is no chance of meta-data corruption when using this
1718 mode; it simply means that the data you wrote and then
1720 might not have made it to the media if the storage system crashes at a bad
1723 Full synchronous fsync semantics using REDO.
1724 NOTE: If not running a
1726 version 4 filesystem or later, mode 0 is silently used.
1729 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1730 If this is sufficient to satisfy the
1732 operation the blocks will be written out and
1734 will wait for the I/Os to complete,
1735 and then follow up with a disk sync command to guarantee the data
1736 is on the media before returning.
1737 This is slower than mode 3 and can result in significant disk or
1738 SSD overhead, though not as bad as mode 0 or mode 1.
1740 Relaxed asynchronous fsync semantics using REDO.
1741 NOTE: If not running a
1743 version 4 filesystem or later, mode 1 is silently used.
1746 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1747 If this is sufficient to satisfy the
1749 operation the blocks
1750 will be written out and
1752 will wait for the I/Os to complete,
1755 issue a disk synchronization command.
1757 Note that there is no chance of meta-data corruption when using this
1758 mode; it simply means that the data you wrote and then
1761 not have made it to the media if the storage system crashes at a bad
1764 This mode is the fastest production fsyncing mode available.
1765 This mode is equivalent to how the UFS fsync in the
1774 This mode is primarily designed
1775 for testing and should not be used on a production system.
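.Pp
As a minimal sketch, assuming the commands are run with sufficient
privileges, the current mode can be inspected and changed at run time with
.Xr sysctl 8 :
.Bd -literal -offset indent
# show the mode currently in effect
sysctl vfs.hammer.fsync_mode

# select mode 2 (full synchronous semantics using REDO)
sysctl vfs.hammer.fsync_mode=2
.Ed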
1777 .Sh RESTORING FROM A SNAPSHOT BACKUP
1778 You restore a snapshot by copying it over to live, but there is a caveat.
1779 The mtime and atime fields for files accessed via a snapshot are locked
1780 to the ctime in order to keep the snapshot consistent, because neither
1781 mtime nor atime changes roll any history.
1783 In order to avoid unnecessary copying it is recommended that you use
1787 when doing the copyback.
1788 Also make sure you traverse the snapshot softlink by appending a ".",
1789 as in "<snapshotpath>/.", and you match up the directory properly.
1790 .Sh RESTORING A PFS FROM A MIRROR
1791 A PFS can be restored from a mirror with
1794 data must be copied separately.
1795 Finally the PFS can be upgraded to master using
1798 It is not possible to restore the root PFS (PFS# 0) by using mirroring,
1799 as the root PFS is always a master PFS.
1800 A normal copy (e.g.\& using
1802 must be done, ignoring history.
1803 If history is important, the old root PFS can be restored to a new PFS, and
1804 important directories/files can be
1806 mounted to the new PFS.
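.Pp
The restore of a non-root PFS can be sketched as follows, assuming the
mirroring directive meant above is
.Cm mirror-copy .
The host name
.Pa backup
and both PFS paths are hypothetical.
If the target PFS does not exist yet,
.Cm mirror-copy
will ask whether a compatible PFS slave should be created.
.Bd -literal -offset indent
# copy the mirror back into a (new) slave PFS
hammer mirror-copy backup:/backup/pfs/data /home/pfs/data

# turn the restored slave into a writable master
hammer pfs-upgrade /home/pfs/data
.Ed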
1808 The following environment variables affect the execution of
1810 .Bl -tag -width ".Ev EDITOR"
1812 The editor program specified in the variable
1814 will be invoked instead of the default editor, which is
1817 The command specified in the variable
1819 will be used to initiate remote operations for the mirror-copy and
1820 mirror-stream commands instead of the default command, which is
1822 The program will be invoked via
1827 .Cm -l user host <remote-command>
1835 .Bl -tag -width ".It Pa <fs>/var/slaves/<name>" -compact
1836 .It Pa <pfs>/snapshots
1837 default per PFS snapshots directory
1840 .It Pa /var/hammer/<pfs>
1841 default per PFS snapshots directory (not root)
1844 .It Pa /var/hammer/root
1845 default snapshots directory for root directory
1848 .It Pa <snapshots>/config
1855 .It Pa <fs>/var/slaves/<name>
1856 recommended slave PFS snapshots directory
1860 recommended PFS directory
1868 .Xr periodic.conf 5 ,
1870 .Xr mount_hammer 8 ,
1872 .Xr newfs_hammer 8 ,
1878 utility first appeared in
1881 .An Matthew Dillon Aq Mt dillon@backplane.com