1 .\" Copyright (c) 2007 The DragonFly Project. All rights reserved.
3 .\" This code is derived from software contributed to The DragonFly Project
4 .\" by Matthew Dillon <dillon@backplane.com>
6 .\" Redistribution and use in source and binary forms, with or without
7 .\" modification, are permitted provided that the following conditions
10 .\" 1. Redistributions of source code must retain the above copyright
11 .\" notice, this list of conditions and the following disclaimer.
12 .\" 2. Redistributions in binary form must reproduce the above copyright
13 .\" notice, this list of conditions and the following disclaimer in
14 .\" the documentation and/or other materials provided with the
16 .\" 3. Neither the name of The DragonFly Project nor the names of its
17 .\" contributors may be used to endorse or promote products derived
18 .\" from this software without specific, prior written permission.
20 .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
21 .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
22 .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
23 .\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
24 .\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
25 .\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
26 .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27 .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
28 .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
29 .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
30 .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
38 .Nd HAMMER file system utility
45 .Op Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
46 .Op Fl R Ar restrictcmd
47 .Op Fl T Ar restrictpath
49 .Op Fl e Ar scoreboardfile
51 .\" .Op Fl s Ar linkpath
60 This manual page documents the
62 utility which provides miscellaneous functions related to managing a
65 For a general introduction to the
67 file system, its features, and
68 examples on how to set up and maintain one, see
71 The options are as follows:
72 .Bl -tag -width indent
74 Tell the mirror commands to use a 2-way protocol, which allows
75 automatic negotiation of transaction id ranges.
76 This option is automatically enabled by the
82 will not attempt to break up large initial bulk transfers into smaller
84 This can save time but if the link is lost in the middle of the
85 initial bulk transfer you will have to start over from scratch.
86 For more information see the
90 Specify a bandwidth limit in bytes per second for mirroring streams.
91 This option is typically used to prevent batch mirroring operations from
92 loading down the machine.
93 The bandwidth may be suffixed with
97 to specify values in kilobytes, megabytes, and gigabytes per second.
98 If no suffix is specified, bytes per second is assumed.
100 Unfortunately this is only applicable to the pre-compression bandwidth
101 when compression is used, so a better solution would probably be to
107 .It Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
108 Set the memory cache size for any raw
115 for megabytes is allowed,
116 else the cache size is specified in bytes.
118 The read-behind/read-ahead defaults to 4
122 This option is typically only used with diagnostic commands
123 as kernel-supported commands will use the kernel's buffer cache.
124 .It Fl R Ar restrictcmd
125 This option is used by hammer ssh-remote to restrict the command later
126 on in the argument list. Multiple commands may be specified, separated
127 by commas (all in one argument).
128 .It Fl T Ar restrictpath
129 This option is used by hammer ssh-remote to restrict the filesystem path
130 specified later on in the argument list.
131 .It Fl c Ar cyclefile
132 When pruning, rebalancing or reblocking you can tell the utility
133 to start at the object id stored in the specified file.
134 If the file does not exist
136 will start at the beginning.
139 is told to run for a specific period of time
141 and is unable to complete the operation it will write out
142 the current object id so the next run can pick up where it left off.
145 runs to completion it will delete
147 .It Fl e Ar scoreboardfile
148 Update scoreboard file with progress, primarily used by mirror-stream.
153 will not check that the time period has elapsed if this option is given.
155 Specify the volumes making up a
159 is a colon-separated list of devices, each specifying a
165 Specify delay in seconds for
166 .Cm mirror-read-stream .
167 When maintaining a streaming mirror this option specifies the
168 minimum delay after a batch ends before the next batch is allowed
170 The default is five seconds.
172 Specify the maximum amount of memory
174 will allocate during a dedup pass.
175 Specify a suffix 'm', 'g', or 't' for megabytes, gigabytes, or terabytes.
178 will allocate up to 1G of RAM to hold CRC/SHA tables while running dedup.
179 When the limit is reached the dedup code restricts the range of CRCs to
180 keep memory use within bounds and runs multiple passes as necessary until
181 the entire filesystem has been deduped.
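.Pp
As a rough illustration of the suffix convention described above, the
following Python sketch converts a memory limit string into bytes.
The helper name parse_memlimit is hypothetical, and the 1024-based
multipliers and the mandatory suffix are assumptions of this sketch,
not a statement about how the utility parses its argument.
.Bd -literal -offset indent
```python
def parse_memlimit(text):
    """Convert a size such as 512m, 2g or 1t into a byte count.

    Assumption: units are powers of 1024 and a suffix is required.
    """
    units = {"m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}
    suffix = text[-1].lower()
    if suffix not in units:
        raise ValueError("expected a suffix of m, g or t: " + text)
    return int(text[:-1]) * units[suffix]


if __name__ == "__main__":
    for value in ("512m", "1g", "2t"):
        print(value, parse_memlimit(value))
```
.Ed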
188 specification for the source and/or destination.
190 Decrease verboseness.
191 May be specified multiple times.
193 Specify recursion for those commands which support it.
194 .It Fl S Ar splitsize
195 Specify the bulk splitup size in bytes for mirroring streams.
200 will do an initial run-through of the data to calculate good
201 transaction ids to cut up the bulk transfers, creating
202 restart points in case the stream is interrupted.
203 If we don't do this and the stream is interrupted it might
204 have to start all over again.
209 At the moment the run-through is disk-bandwidth-heavy but some
210 future version will limit the run-through to just the B-Tree
211 records and not the record data.
213 The splitsize may be suffixed with
217 to specify values in kilobytes, megabytes, or gigabytes.
218 If no suffix is specified, bytes is assumed.
220 When mirroring very large filesystems the minimum recommended
222 A small split size may wind up generating a great deal of overhead
223 but very little actual incremental data and is not recommended.
225 Specify timeout in seconds.
226 When pruning, rebalancing, reblocking or mirror-reading
227 you can tell the utility to stop after a certain period of time.
228 A value of 0 means unlimited.
229 This option is used along with the
231 option to prune, rebalance or reblock incrementally.
233 Increase verboseness.
234 May be specified multiple times.
236 Enable compression for any remote ssh specifications.
237 This option is typically used with the mirroring directives.
241 for interactive questions.
244 The commands are as follows:
245 .Bl -tag -width indent
246 .\" ==== synctid ====
247 .It Cm synctid Ar filesystem Op Cm quick
248 Generate a guaranteed, formal 64-bit transaction id representing the
249 current state of the specified
252 The file system will be synced to the media.
256 keyword is specified the file system will be soft-synced, meaning that a
257 crash might still undo the state of the file system as of the transaction
258 id returned but any new modifications will occur after the returned
259 transaction id as expected.
261 This operation does not create a snapshot.
262 It is meant to be used
263 to track temporary fine-grained changes to a subset of files and
264 will only remain valid for
266 access purposes for the
268 period configured for the PFS.
269 If you desire a real snapshot then the
271 directive may be what you are looking for.
273 .It Cm bstats Op Ar interval
276 B-Tree statistics until interrupted.
279 seconds between each display.
280 The default interval is one second.
281 .\" ==== iostats ====
282 .It Cm iostats Op Ar interval
286 statistics until interrupted.
289 seconds between each display.
290 The default interval is one second.
291 .\" ==== history ====
292 .It Cm history Ns Oo Cm @ Ns Ar offset Ns Oo Cm \&, Ns Ar length Oc Oc Ar path ...
293 Show the modification history for inode and data of
298 is given history is shown for data block at given offset,
299 otherwise history is shown for inode.
304 data bytes at given offset are dumped for each version,
309 this directive shows object id and sync status,
310 and for each object version it shows transaction id and time stamp.
311 Files have to exist for this directive to be applicable;
312 to track inodes which have been deleted or renamed see
314 .\" ==== blockmap ====
316 Dump the blockmap for the file system.
319 blockmap is a two-layer
320 blockmap representing the maximum possible file system size of 1 Exabyte.
321 Needless to say the second layer is only present for blocks which exist.
323 blockmap represents 8-Megabyte blocks, called big-blocks.
324 Each big-block has an append
325 point, a free byte count, and a typed zone id which allows content to be
326 reverse engineered to some degree.
330 allocations are essentially appended to a selected big-block using
331 the append offset and deducted from the free byte count.
332 When space is freed the free byte count is adjusted but
334 does not track holes in big-blocks for reallocation.
335 A big-block must be completely freed, either
336 through normal file system operations or through reblocking, before
339 Data blocks can be shared by deducting the space used from the free byte
340 count for each shared reference.
341 This means the free byte count can legally go negative.
343 This command needs the
346 .\" ==== checkmap ====
348 Check the blockmap allocation count.
350 will scan the B-Tree, collect allocation information, and
351 construct a blockmap in-memory.
352 It will then check that blockmap against the on-disk blockmap.
354 This command needs the
358 .It Cm show Op Ar localization Ns Op Cm \&: Ns Ar object_id
360 By default this command will validate all B-Tree
361 linkages and CRCs, including data CRCs, and will report the most verbose
362 information it can dig up.
363 Any errors will show up with a
365 in column 1 along with various
371 .Ar localization Ns Cm \&: Ns Ar object_id
373 search for the key printing nodes as it recurses down, and then
374 will iterate forwards.
375 These fields are specified in HEX.
376 Note that the pfsid is the top 16 bits of the 32-bit localization
377 field so PFS #1 would be 00010000.
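.Pp
As a rough illustration of the localization layout described above, the
following Python sketch places a PFS id in the top 16 bits of a 32-bit
value and formats it as eight hex digits.
The helper name pfs_localization is hypothetical and is not part of the
utility.
.Bd -literal -offset indent
```python
def pfs_localization(pfs_id):
    """Return the 32-bit localization for a PFS id as eight hex digits."""
    if not 0 <= pfs_id <= 0xFFFF:
        raise ValueError("PFS id must fit in 16 bits")
    # The PFS id occupies the top 16 bits; the low 16 bits are left zero.
    return format(pfs_id << 16, "08x")


if __name__ == "__main__":
    print(pfs_localization(1))
```
.Ed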
381 the command will report less information about the inode contents.
385 the command will not report the content of the inode or other typed
390 the command will not report volume header information, big-block fill
391 ratios, mirror transaction ids, or report or check data CRCs.
392 B-Tree CRCs and linkages are still checked.
394 This command needs the
397 .\" ==== show-undo ====
401 Dump the UNDO/REDO map.
403 This command needs the
407 .\" Dump the B-Tree, record, large-data, and small-data blockmaps, showing
408 .\" physical block assignments and free space percentages.
409 .\" ==== ssh-remote ====
410 .It Cm ssh-remote Ar command Ar targetdir
411 Used in an ssh authorized_keys line such as
412 command="/sbin/hammer ssh-remote mirror-read /fubarmount" ... to allow
413 mirror-read or mirror-write access to a particular subdirectory tree.
414 This way you do not have to give shell access to the remote box.
416 will obtain the original command line from the SSH_ORIGINAL_COMMAND
417 environment variable, validate it against the restriction, and then
418 re-exec hammer with the validated arguments.
420 The remote hammer command does not allow the
424 options to be passed in.
425 .\" ==== recover ====
426 .It Cm recover Ar targetdir
427 Recover data from a corrupted
430 This is a low level command which operates on the filesystem image and
431 attempts to locate and recover files from a corrupted filesystem.
432 The entire image is scanned linearly looking for B-Tree nodes.
434 found which passes its CRC test is scanned for file, inode, and directory
435 fragments and the target directory is populated with the resulting data.
436 Files and directories in the target directory are initially named after
437 the object id and are renamed as fragmentary information is processed.
439 This command keeps track of filename/object_id translations and may eat a
440 considerable amount of memory while operating.
442 This command is literally the last line of defense when it comes to
443 recovering data from a dead filesystem.
445 This command needs the
448 .\" ==== namekey1 ====
449 .It Cm namekey1 Ar filename
452 64-bit directory hash for the specified file name, using
453 the original directory hash algorithm in version 1 of the file system.
454 The low 32 bits are used as an iterator for hash collisions and will be
456 .\" ==== namekey2 ====
457 .It Cm namekey2 Ar filename
460 64-bit directory hash for the specified file name, using
461 the new directory hash algorithm in version 2 of the file system.
462 The low 32 bits are still used as an iterator but will start out containing
463 part of the hash key.
464 .\" ==== namekey32 ====
465 .It Cm namekey32 Ar filename
466 Generate the top 32 bits of a
468 64-bit directory hash for the specified file name.
471 Show extended information about
474 The information is divided into sections:
475 .Bl -tag -width indent
476 .It Volume identification
477 General information, like the label of the
479 filesystem, the number of volumes it contains, the FSID, and the
482 .It Big block information
483 Big block statistics, such as total, used, reserved and free big blocks.
484 .It Space information
485 Information about space used on the filesystem.
486 Currently total size, used, reserved and free space are displayed.
488 Basic information about the PFSs currently present on a
493 is the ID of the PFS, with 0 being the root PFS.
495 is the current snapshot count on the PFS.
497 displays the mount point the PFS is currently mounted on (if any).
499 .\" ==== cleanup ====
500 .It Cm cleanup Op Ar filesystem ...
501 This is a meta-command which executes snapshot, prune, rebalance, dedup
502 and reblock commands on the specified
507 is specified this command will clean up all
509 file systems in use, including PFSs.
510 To do this it will scan all
514 mounts, extract PFS ids, and clean up each PFS found.
516 This command will access a snapshots
517 directory and a configuration file for each
519 creating them if necessary.
520 .Bl -tag -width indent
521 .It Nm HAMMER No version 2-
522 The configuration file is
524 in the snapshots directory which defaults to
525 .Pa <pfs>/snapshots .
526 .It Nm HAMMER No version 3+
527 The configuration file is saved in file system meta-data, see
530 The snapshots directory defaults to
531 .Pa /var/hammer/<pfs>
532 .Pa ( /var/hammer/root
536 The format of the configuration file is:
537 .Bd -literal -offset indent
538 snapshots <period> <retention-time> [any]
539 prune <period> <max-runtime>
540 rebalance <period> <max-runtime>
541 dedup <period> <max-runtime>
542 reblock <period> <max-runtime>
543 recopy <period> <max-runtime>
547 .Bd -literal -offset indent
548 snapshots 1d 60d # 0d 0d for PFS /tmp, /var/tmp, /usr/obj
556 Time is given with a suffix of
562 meaning day, hour, minute and second.
566 directive has a period of 0 and a retention time of 0
567 then snapshot generation is disabled, removal of old snapshots is
568 disabled, and prunes will use
569 .Cm prune-everything .
573 directive has a period of 0 but a non-zero retention time
574 then this command will not create any new snapshots but will remove old
575 snapshots it finds based on the retention time.
577 used on PFS masters where you are generating your own snapshot softlinks
578 manually and on PFS slaves when all you wish to do is prune away existing
579 snapshots inherited via the mirroring stream.
581 By default only snapshots in the form
582 .Ql snap- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
586 directive is specified as a third argument on the
588 config line then any softlink of the form
589 .Ql *- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
591 .Ql *. Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
594 A period of 0 for prune, rebalance, dedup, reblock or recopy disables the directive.
595 A max-runtime of 0 means unlimited.
597 If period hasn't passed since the previous
600 For example a day has passed when midnight is passed (localtime).
603 flag is given the period is ignored.
611 The default configuration file will create a daily snapshot, do a daily
612 pruning, rebalancing, deduping and reblocking run and a monthly recopy run.
613 Reblocking is defragmentation with a level of 95%,
614 and recopy is full defragmentation.
616 By default prune, dedup and rebalance operations are time limited to 5 minutes,
617 and reblock operations to a bit over 5 minutes,
618 and recopy operations to a bit over 10 minutes.
619 Reblocking and recopy runs are each broken down into four separate functions:
620 btree, inodes, dirs and data.
621 Each function is time limited to the time given in the configuration file,
622 but the btree, inodes and dirs functions usually do not take very long;
623 full defragmentation is always used for these three functions.
624 Also note that this directive will by default disable snapshots on
631 The defaults may be adjusted by modifying the configuration file.
632 The pruning and reblocking commands automatically maintain a cyclefile
633 for incremental operation.
634 If you interrupt (^C) the program the cyclefile will be updated,
636 may continue to run in the background for a few seconds until the
638 ioctl detects the interrupt.
641 PFS option can be set to use another location for the snapshots directory.
643 Work on this command is still in progress.
645 An ability to remove snapshots dynamically as the
646 file system becomes full.
648 .It Cm config Op Ar filesystem Op Ar configfile
651 Show or change configuration for
653 If zero or one arguments are specified this function dumps the current
654 configuration file to stdout.
655 Zero arguments specifies the PFS containing the current directory.
656 This configuration file is stored in file system meta-data.
657 If two arguments are specified this function installs a new config file.
661 versions less than 3 the configuration file is by default stored in
662 .Pa <pfs>/snapshots/config ,
663 but in all later versions the configuration file is stored in file system
665 .\" ==== viconfig ====
666 .It Cm viconfig Op Ar filesystem
669 Edit the configuration file and reinstall into file system meta-data when done.
670 Zero arguments specifies the PFS containing the current directory.
671 .\" ==== volume-add ====
672 .It Cm volume-add Ar device Ar filesystem
679 and add all of its space to
683 file system can use up to 256 volumes.
686 All existing data contained on
688 will be destroyed by this operation!
693 file system, formatting will be denied.
694 You can overcome this sanity check by using
696 to erase the beginning sectors of the device.
698 Remember that you have to specify
700 together with any other devices that make up the file system,
707 is the root file system, also remember to add
710 .Va vfs.root.mountfrom
712 .Pa /boot/loader.conf ,
715 .\" ==== volume-del ====
716 .It Cm volume-del Ar device Ar filesystem
722 Remember that you have to remove
724 from the colon-separated list in
730 is the root file system, also remember to remove
733 .Va vfs.root.mountfrom
735 .Pa /boot/loader.conf ,
738 .\" ==== volume-list ====
739 .It Cm volume-list Ar filesystem
740 List the volumes that make up
742 .\" ==== snapshot ====
743 .It Cm snapshot Oo Ar filesystem Oc Ar snapshot-dir
744 .It Cm snapshot Ar filesystem Ar snapshot-dir Op Ar note
745 Take a snapshot of the file system either explicitly given by
747 or implicitly derived from the
749 argument and create a symlink in the directory provided by
751 pointing to the snapshot.
754 is not a directory, it is assumed to be a format string passed to
756 with the current time as parameter.
759 refers to an existing directory, a default format string of
761 is assumed and used as name for the newly created symlink.
763 Snapshot is a per PFS operation, so each PFS in a
765 file system has to be snapshot separately.
767 Example, assuming that
775 are file systems on their own, the following invocations:
776 .Bd -literal -offset indent
777 hammer snapshot /mysnapshots
779 hammer snapshot /mysnapshots/%Y-%m-%d
781 hammer snapshot /obj /mysnapshots/obj-%Y-%m-%d
783 hammer snapshot /usr /my/snaps/usr "note"
786 Would create symlinks similar to:
787 .Bd -literal -offset indent
788 /mysnapshots/snap-20080627-1210 -> /@@0x10d2cd05b7270d16
790 /mysnapshots/2008-06-27 -> /@@0x10d2cd05b7270d16
792 /mysnapshots/obj-2008-06-27 -> /obj@@0x10d2cd05b7270d16
794 /my/snaps/usr/snap-20080627-1210 -> /usr@@0x10d2cd05b7270d16
799 version 3+ file system the snapshot is also recorded in file system meta-data
800 along with the optional
806 .It Cm snap Ar path Op Ar note
809 Create a snapshot for the PFS containing
811 and create a snapshot softlink.
812 If the path specified is a
813 directory a standard snapshot softlink will be created in the directory.
814 The snapshot softlink points to the base of the mounted PFS.
815 .It Cm snaplo Ar path Op Ar note
818 Create a snapshot for the PFS containing
820 and create a snapshot softlink.
821 If the path specified is a
822 directory a standard snapshot softlink will be created in the directory.
823 The snapshot softlink points into the directory it is contained in.
824 .It Cm snapq Ar dir Op Ar note
827 Create a snapshot for the PFS containing the specified directory but do
828 not create a softlink.
829 Instead output a path which can be used to access
830 the directory via the snapshot.
832 An absolute or relative path may be specified.
833 The path will be used as-is as a prefix in the path output to stdout.
835 snap and snapshot directives the snapshot transaction id will be registered
836 in the file system meta-data.
837 .It Cm snaprm Ar path Ar ...
838 .It Cm snaprm Ar transaction_id Ar ...
839 .It Cm snaprm Ar filesystem Ar transaction_id Ar ...
842 Remove a snapshot given its softlink or transaction id.
843 If specifying a transaction id
844 the snapshot is removed from file system meta-data but you are responsible
845 for removing any related softlinks.
847 If a softlink path is specified the filesystem and transaction id
848 are derived from the contents of the softlink.
849 If just a transaction id is specified it is assumed to be a snapshot in the
851 filesystem you are currently chdir'd into.
852 You can also specify the filesystem and transaction id explicitly.
853 .It Cm snapls Op Ar path ...
856 Dump the snapshot meta-data for PFSs containing each
858 listing all available snapshots and their notes.
859 If no arguments are specified snapshots for the PFS containing the
860 current directory are listed.
861 This is the definitive list of snapshots for the file system.
863 .It Cm prune Ar softlink-dir
864 Prune the file system based on previously created snapshot softlinks.
865 Pruning is the act of deleting file system history.
868 command will delete file system history such that
869 the file system state is retained for the given snapshots,
870 and all history after the latest snapshot.
871 By setting the per PFS parameter
873 history is guaranteed to be saved at least this time interval.
874 All other history is deleted.
876 The target directory is expected to contain softlinks pointing to
877 snapshots of the file systems you wish to retain.
878 The directory is scanned non-recursively and the mount points and
879 transaction ids stored in the softlinks are extracted and sorted.
880 The file system is then explicitly pruned according to what is found.
881 Cleaning out portions of the file system is as simple as removing a
882 snapshot softlink and then running the
886 As a safety measure pruning only occurs if one or more softlinks are found
889 snapshot id extension.
890 Currently the scanned softlink directory must contain softlinks pointing
894 The softlinks may specify absolute or relative paths.
895 Softlinks must use 20-character
897 transaction ids, as might be returned from
898 .Nm Cm synctid Ar filesystem .
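.Pp
As a rough illustration of the extraction and sorting step described
above, the following Python sketch splits softlink targets into a mount
path and a transaction id and orders them by transaction id.
The names SNAPSHOT_RE, parse_snapshot_target and sorted_snapshots are
hypothetical, and the sketch is not the code the utility uses.
.Bd -literal -offset indent
```python
import re

# A target ends in a 20-character id: "@@0x" plus 16 hex digits.
SNAPSHOT_RE = re.compile(r"^(.*)@@(0x[0-9a-fA-F]{16})$")


def parse_snapshot_target(target):
    """Return (mount_path, transaction_id), or None if the id is malformed."""
    match = SNAPSHOT_RE.match(target)
    if match is None:
        return None
    return match.group(1), int(match.group(2), 16)


def sorted_snapshots(targets):
    """Drop targets without a valid id and sort the rest by transaction id."""
    parsed = [parse_snapshot_target(t) for t in targets]
    return sorted((p for p in parsed if p is not None), key=lambda p: p[1])


if __name__ == "__main__":
    links = [
        "/usr/obj/@@0x10d2cd222adee364",
        "/usr/obj/@@0x10d2cd05b7270d16",
        "/usr/obj/@@0x10d2cd13f3fde98f",
    ]
    for path, tid in sorted_snapshots(links):
        print(path, format(tid, "#018x"))
```
.Ed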
900 Pruning is a per PFS operation, so each PFS in a
902 file system has to be pruned separately.
904 Note that pruning a file system may not immediately free up space,
905 though typically some space will be freed if a large number of records are
907 The file system must be reblocked to completely recover all available space.
909 Example, let's say that you didn't set
911 and the snapshot directory contains the following links:
912 .Bd -literal -offset indent
913 lrwxr-xr-x 1 root wheel 29 May 31 17:57 snap1 ->
914 /usr/obj/@@0x10d2cd05b7270d16
916 lrwxr-xr-x 1 root wheel 29 May 31 17:58 snap2 ->
917 /usr/obj/@@0x10d2cd13f3fde98f
919 lrwxr-xr-x 1 root wheel 29 May 31 17:59 snap3 ->
920 /usr/obj/@@0x10d2cd222adee364
923 If you were to run the
925 command on this directory, then the
928 mount will be pruned to retain the above three snapshots.
929 In addition, history for modifications made to the file system older than
930 the oldest snapshot will be destroyed and history for potentially fine-grained
931 modifications made to the file system more recently than the most recent
932 snapshot will be retained.
934 If you then delete the
936 softlink and rerun the
939 history for modifications pertaining to that snapshot would be destroyed.
943 file system versions 3+ this command also scans the snapshots stored
944 in the file system meta-data and includes them in the prune.
945 .\" ==== prune-everything ====
946 .It Cm prune-everything Ar filesystem
947 Remove all historical records from
949 Use this directive with caution on PFSs where you intend to use history.
951 This command does not remove snapshot softlinks but will delete all
952 snapshots recorded in file system meta-data (for file system version 3+).
953 The user is responsible for deleting any softlinks.
955 Pruning is a per PFS operation, so each PFS in a
957 file system has to be pruned separately.
958 .\" ==== rebalance ====
959 .It Cm rebalance Ar filesystem Op Ar saturation_percentage
960 Rebalance the B-Tree; nodes with a small number of
961 elements will be combined and element counts will be smoothed out
964 The saturation percentage is between 50% and 100%.
965 The default is 85% (the
967 suffix is not needed).
969 Rebalancing is a per PFS operation, so each PFS in a
971 file system has to be rebalanced separately.
973 .It Cm dedup Ar filesystem
976 Perform offline (post-process) deduplication.
977 Deduplication occurs at
978 the block level; currently only data blocks of the same size can be
979 deduped, metadata blocks cannot.
980 The hash function used for comparing
981 data blocks is CRC-32 (CRCs are computed anyway as part of
983 data integrity features, so there's no additional overhead).
984 Since CRC is a weak hash function a byte-by-byte comparison is done
985 before actual deduping.
986 In case of a CRC collision (two data blocks have the same CRC
987 but different contents) the checksum is upgraded to SHA-256.
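.Pp
As a rough illustration of the matching rule described above, the
following Python sketch groups blocks by size and CRC-32 and only
reports a duplicate after a full byte comparison.
The function name find_duplicates and the in-memory list of blocks are
hypothetical; the sketch does not model the SHA-256 upgrade or any
on-disk structure.
.Bd -literal -offset indent
```python
import zlib


def find_duplicates(blocks):
    """Return (duplicate_index, original_index) pairs for identical blocks."""
    by_key = {}       # (length, crc32) -> indexes of distinct contents seen
    duplicates = []
    for index, data in enumerate(blocks):
        key = (len(data), zlib.crc32(data))
        candidates = by_key.setdefault(key, [])
        for original in candidates:
            # CRC-32 is weak, so confirm with a byte-by-byte comparison.
            if blocks[original] == data:
                duplicates.append((index, original))
                break
        else:
            # Same size and CRC but new contents, or a brand new key.
            candidates.append(index)
    return duplicates


if __name__ == "__main__":
    sample = [b"aaaa", b"bbbb", b"aaaa", b"aa"]
    print(find_duplicates(sample))
```
.Ed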
991 reblocker may partially blow up (re-expand) dedup (reblocker's normal
992 operation is to reallocate every record, so it's possible for deduped
993 blocks to be re-expanded back).
995 Deduplication is a per PFS operation, so each PFS in a
997 file system has to be deduped separately.
999 means that if you have duplicated data in two different PFSs that data
1000 won't be deduped; however, the addition of such a feature is planned.
1004 option should be used to limit memory use during the dedup run if the
1005 default 1G limit is too much for the machine.
1006 .\" ==== dedup-simulate ====
1007 .It Cm dedup-simulate Ar filesystem
1008 Shows potential space savings (simulated dedup ratio) one can get after
1012 If the estimated dedup ratio is greater than 1.00 you will see
1013 dedup space savings.
1014 Remember that this is an estimated number; in
1015 practice the real dedup ratio will be slightly smaller because of
1017 bigblock underflows, B-Tree locking issues and other factors.
1019 Note that deduplication currently works only on bulk data so if you
1024 commands on a PFS that contains metadata only (directory entries,
1025 softlinks) you will get a 0.00 dedup ratio.
1029 option should be used to limit memory use during the dedup run if the
1030 default 1G limit is too much for the machine.
1031 .\" ==== reblock* ====
1032 .It Cm reblock Ar filesystem Op Ar fill_percentage
1033 .It Cm reblock-btree Ar filesystem Op Ar fill_percentage
1034 .It Cm reblock-inodes Ar filesystem Op Ar fill_percentage
1035 .It Cm reblock-dirs Ar filesystem Op Ar fill_percentage
1036 .It Cm reblock-data Ar filesystem Op Ar fill_percentage
1037 Attempt to defragment and free space for reuse by reblocking a live
1040 Big-blocks cannot be reused by
1042 until they are completely free.
1043 This command also has the effect of reordering all elements, effectively
1044 defragmenting the file system.
1046 The default fill percentage is 100% and will cause the file system to be
1047 completely defragmented.
1048 All specified element types will be reallocated and rewritten.
1049 If you wish to quickly free up space instead try specifying
1050 a smaller fill percentage, such as 90% or 80% (the
1052 suffix is not needed).
1054 Since this command may rewrite the entire contents of the disk it is
1055 best to do it incrementally from a
1061 options to limit the run time.
1062 The file system would thus be defragmented over a long period of time.
1064 It is recommended that separate invocations be used for each data type.
1065 B-Tree nodes, inodes, and directories are typically the most important
1066 elements needing defragmentation.
1067 Data can be defragmented over a longer period of time.
1069 Reblocking is a per PFS operation, so each PFS in a
1071 file system has to be reblocked separately.
1072 .\" ==== pfs-status ====
1073 .It Cm pfs-status Ar dirpath ...
1074 Retrieve the mirroring configuration parameters for the specified
1076 file systems or pseudo-filesystems (PFS's).
1077 .\" ==== pfs-master ====
1078 .It Cm pfs-master Ar dirpath Op Ar options
1079 Create a pseudo-filesystem (PFS) inside a
1082 Up to 65536 PFSs can be created.
1083 Each PFS uses an independent inode numbering space making it suitable
1088 directive creates a PFS that you can read, write, and use as a mirroring
1091 A PFS can only be truly destroyed with the
1094 Removing the softlink will not destroy the underlying PFS.
1096 A PFS can only be created in the root PFS (PFS# 0),
1097 not in a PFS created by
1103 It is recommended that
1109 directory at root of
1113 It is recommended to use a
1115 mount to access a PFS, except for the root PFS; for more information see
1117 .\" ==== pfs-slave ====
1118 .It Cm pfs-slave Ar dirpath Op Ar options
1119 Create a pseudo-filesystem (PFS) inside a
1122 Up to 65536 PFSs can be created.
1123 Each PFS uses an independent inode numbering space making it suitable
1128 directive creates a PFS that you can use as a mirroring source or target.
1129 You will not be able to access a slave PFS until you have completed the
1130 first mirroring operation with it as the target (its root directory will
1131 not exist until then).
1133 Access to the pfs-slave via the special softlink, as described in the
1138 dynamically modify the snapshot transaction id by returning a dynamic result
1143 A PFS can only be truly destroyed with the
1146 Removing the softlink will not destroy the underlying PFS.
1148 A PFS can only be created in the root PFS (PFS# 0),
1149 not in a PFS created by
1155 It is recommended that
1161 directory at root of
1165 It is recommended to use a
1167 mount to access a PFS, except for the root PFS; for more information see
1169 .\" ==== pfs-update ====
1170 .It Cm pfs-update Ar dirpath Op Ar options
1171 Update the configuration parameters for an existing
1173 file system or pseudo-filesystem.
1174 Options that may be specified:
1175 .Bl -tag -width indent
1176 .It Cm sync-beg-tid= Ns Ar 0x16llx
1177 This is the automatic snapshot access starting transaction id for
1179 This parameter is normally updated automatically by the
1183 It is important to note that accessing a mirroring slave
1184 with a transaction id greater than the last fully synchronized transaction
1185 id can result in an unreliable snapshot since you will be accessing
1186 data that is still undergoing synchronization.
1188 Manually modifying this field is dangerous and can result in a broken mirror.
1189 .It Cm sync-end-tid= Ns Ar 0x16llx
1190 This is the current synchronization point for mirroring slaves.
1191 This parameter is normally updated automatically by the
1195 Manually modifying this field is dangerous and can result in a broken mirror.
1196 .It Cm shared-uuid= Ns Ar uuid
1197 Set the shared UUID for this file system.
1198 All mirrors must have the same shared UUID.
1199 For safety purposes the
1201 directives will refuse to operate on a target with a different shared UUID.
1203 Changing the shared UUID on an existing, non-empty mirroring target,
1204 including an empty but not completely pruned target,
1205 can lead to corruption of the mirroring target.
1206 .It Cm unique-uuid= Ns Ar uuid
1207 Set the unique UUID for this file system.
1208 This UUID should not be used anywhere else,
1209 even on exact copies of the file system.
1210 .It Cm label= Ns Ar string
1211 Set a descriptive label for this file system.
1212 .It Cm snapshots= Ns Ar string
1213 Specify the snapshots directory which
1216 will use to manage this PFS.
1217 .Bl -tag -width indent
1218 .It Nm HAMMER No version 2-
1219 The snapshots directory does not need to be configured for
1220 PFS masters and will default to
1221 .Pa <pfs>/snapshots .
1223 PFS slaves are mirroring slaves so you cannot configure a snapshots
1224 directory on the slave itself to be managed by the slave's machine.
1225 In fact, the slave will likely have a
1227 sub-directory mirrored
1228 from the master, but that directory contains the configuration the master
1229 is using for its copy of the file system, not the configuration that we
1230 want to use for our slave.
1232 It is recommended that
1233 .Pa <fs>/var/slaves/<name>
1234 be configured for a PFS slave, where
1240 is an appropriate label.
1241 .It Nm HAMMER No version 3+
1242 The snapshots directory does not need to be configured for PFS masters or
1244 The snapshots directory defaults to
1245 .Pa /var/hammer/<pfs>
1246 .Pa ( /var/hammer/root
1250 You can control snapshot retention on your slave independent of the master.
1251 .It Cm snapshots-clear
1254 directory path for this PFS.
1255 .It Cm prune-min= Ns Ar N Ns Cm d
1256 .It Cm prune-min= Ns Oo Ar N Ns Cm d/ Oc Ns \
1257 Ar hh Ns Op Cm \&: Ns Ar mm Ns Op Cm \&: Ns Ar ss
1258 Set the minimum fine-grained data retention period.
1260 always retains fine-grained history up to the most recent snapshot.
1261 You can extend the retention period further by specifying a non-zero
1263 Any snapshot softlinks within the retention period are ignored
1264 for the purposes of pruning (i.e.\& the fine grained history is retained).
1265 Number of days, hours, minutes and seconds are given as
1270 Because the transaction id in the snapshot softlink cannot be used
1271 to calculate a timestamp,
1273 uses the earlier of the
1277 field of the softlink to
1278 determine which snapshots fall within the retention period.
1279 Users must be sure to retain one of these two fields when manipulating
1282 .\" ==== pfs-upgrade ====
1283 .It Cm pfs-upgrade Ar dirpath
1284 Upgrade a PFS from slave to master operation.
1285 The PFS will be rolled back to the current end synchronization transaction id
1286 (removing any partial synchronizations), and will then become writable.
1290 currently supports only single masters and using
1291 this command can easily result in file system corruption
1292 if you don't know what you are doing.
1294 This directive will refuse to run if any programs have open descriptors
1295 in the PFS, including programs chdir'd into the PFS.
1296 .\" ==== pfs-downgrade ====
1297 .It Cm pfs-downgrade Ar dirpath
1298 Downgrade a master PFS from master to slave operation.
1299 The PFS becomes read-only and access will be locked to its
1302 This directive will refuse to run if any programs have open descriptors
1303 in the PFS, including programs chdir'd into the PFS.
1304 .\" ==== pfs-destroy ====
1305 .It Cm pfs-destroy Ar dirpath
1306 This permanently destroys a PFS.
1308 This directive will refuse to run if any programs have open descriptors
1309 in the PFS, including programs chdir'd into the PFS.
1310 As a safety measure the
1312 flag has no effect on this directive.
1313 .\" ==== mirror-read ====
1314 .It Cm mirror-read Ar filesystem Op Ar begin-tid
1315 Generate a mirroring stream to stdout.
1316 The stream ends when the transaction id space has been exhausted.
1318 may be a master or slave PFS.
1319 .\" ==== mirror-read-stream ====
1320 .It Cm mirror-read-stream Ar filesystem Op Ar begin-tid
1321 Generate a mirroring stream to stdout.
1322 Upon completion the stream is paused until new data is synced to the
1325 Operation continues until the pipe is broken.
1328 command for more details.
1329 .\" ==== mirror-write ====
1330 .It Cm mirror-write Ar filesystem
1331 Take a mirroring stream on stdin.
1333 must be a slave PFS.
1335 This command will fail if the
1337 configuration field for the two file systems does not match.
1340 command for more details.
1342 If the target PFS does not exist this command will ask you whether
1343 you want to create a compatible PFS slave for the target or not.
1344 .\" ==== mirror-dump ====
1350 to dump an ASCII representation of the mirroring stream.
1351 .\" ==== mirror-copy ====
1352 .\".It Cm mirror-copy Ar [[user@]host:]filesystem [[user@]host:]filesystem
1353 .It Cm mirror-copy \
1354 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1355 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1356 This is a shortcut which pipes a
1361 If a remote host specification is made the program forks a
1363 (or other program as specified by the
1365 environment variable) and execs the
1369 on the appropriate host.
1370 The source may be a master or slave PFS, and the target must be a slave PFS.
1372 This command also establishes full duplex communication and turns on
1373 the 2-way protocol feature
1375 which automatically negotiates transaction id
1376 ranges without having to use a cyclefile.
1377 If the operation completes successfully the target PFS's
1380 Note that you must re-chdir into the target PFS to see the updated information.
1381 If you do not you will still be in the previous snapshot.
1383 If the target PFS does not exist this command will ask you whether
1384 you want to create a compatible PFS slave for the target or not.
1385 .\" ==== mirror-stream ====
1386 .\".It Cm mirror-stream Ar [[user@]host:]filesystem [[user@]host:]filesystem
1387 .It Cm mirror-stream \
1388 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1389 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1390 This is a shortcut which pipes a
1391 .Cm mirror-read-stream
1395 This command works similarly to
1397 but does not exit after the initial mirroring completes.
1398 The mirroring operation will resume as changes continue to be made to the
1400 The command is commonly used with
1404 options to keep the mirroring target in sync with the source on a continuing
1407 If the pipe is broken the command will automatically retry after sleeping
1409 The time slept will be 15 seconds plus the time given in the
1413 This command also detects the initial-mirroring case and spends some
1414 time scanning the B-Tree to find good break points, allowing the initial
1415 bulk mirroring operation to be broken down into 4GB pieces.
1416 This means that the user can kill and restart the operation and it will
1417 not have to start from scratch once it has gotten past the first chunk.
1420 option may be used to change the size of pieces and the
1422 option may be used to disable this feature and perform an initial bulk
1424 .\" ==== version ====
1425 .It Cm version Ar filesystem
1426 This command returns the
1428 file system version for the specified
1430 as well as the range of versions supported in the kernel.
1433 option may be used to remove the summary at the end.
1434 .\" ==== version-upgrade ====
1435 .It Cm version-upgrade Ar filesystem Ar version Op Cm force
1441 Once upgraded a file system may not be downgraded.
1442 If you wish to upgrade a file system to a version greater than or equal to the
1443 work-in-progress (WIP) version number you must specify the
1446 Use of WIP versions should be relegated to testing and may require wiping
1447 the file system as development progresses, even though the WIP version might
1451 This command operates on the entire
1453 file system and is not a per PFS operation.
1454 All PFS's will be affected.
1455 .Bl -tag -width indent
1458 default version, first
1463 New directory entry layout.
1464 This version uses a new directory hash key.
1467 New snapshot management, using file system meta-data for saving
1468 configuration file and snapshots (transaction ids etc.).
1469 The default snapshots directory has also changed.
1473 New undo/redo/flush, giving
1475 a much faster sync and fsync.
1478 Deduplication support.
1481 Directory hash ALG1.
1482 Tends to maintain inode number / directory name entry ordering better
1483 for files after minor renaming.
1486 .Sh PSEUDO-FILESYSTEM (PFS) NOTES
1487 The root of a PFS is not hooked into the primary
1489 file system as a directory.
1492 creates a special softlink called
1494 (exactly 10 characters long) in the primary
1498 then modifies the contents of the softlink as read by
1500 and thus what you see with an
1502 command or if you were to
1505 If the PFS is a master the link reflects the current state of the PFS.
1506 If the PFS is a slave the link reflects the last completed snapshot, and the
1507 contents of the link will change when the next snapshot is completed, and
1512 utility employs numerous safeties to reduce user foot-shooting.
1515 directive requires that the target be configured as a slave and that the
1517 field of the mirroring source and target match.
1518 .Sh DOUBLE_BUFFER MODE
1519 There is a limit to the number of vnodes the kernel can cache, and because
1520 file buffers are associated with a vnode the related data cache can get
1521 blown away when operating on large numbers of files even if the system has
1522 sufficient memory to hold the file data.
1526 double buffer mode by setting the
1529 .Va vfs.hammer.double_buffer
1532 will cache file data via the block device and copy it into the per-file
1533 buffers as needed. The data will be double-cached at least until the
1534 buffer cache throws away the file buffer.
1535 This mode is typically used in conjunction with
1538 .Va vm.swapcache.data_enable
1539 is turned on in order to prevent unnecessary re-caching of file data
1540 due to vnode recycling.
1541 The swapcache will save the cached VM pages related to
1544 device (which doesn't recycle unless you umount the filesystem) instead
1545 of the cached VM pages backing the file vnodes.
1547 .\"Double buffering should also be turned on if live dedup is enabled via
1548 .\"Va vfs.hammer.live_dedup .
1549 .\"This is because the live dedup must validate the contents of a potential
1550 .\"duplicate file block and it must run through the block device to do that
1551 .\"and not the file vnode.
1552 .\"If double buffering is not enabled then live dedup will create extra disk
1553 .\"reads to validate potential data duplicates.
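.Pp
As a minimal sketch, the two settings named above could be made persistent
with an
.Pa /etc/sysctl.conf
fragment like the following.
The value 1 for each node is an assumption (both are treated here as simple
on/off switches); confirm it against the running system before use.
.Bd -literal -offset indent
```
# Assumed values: 1 turns each feature on.
vfs.hammer.double_buffer=1
vm.swapcache.data_enable=1
```
.Ed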
1554 .Sh UPGRADE INSTRUCTIONS HAMMER V1 TO V2
1555 This upgrade changes the way directory entries are stored.
1556 It is possible to upgrade a V1 file system to V2 in place, but
1557 directories created prior to the upgrade will continue to use
1560 Note that the slave mirroring code in the target kernel had bugs in
1561 V1 which can create an incompatible root directory on the slave.
1564 master created after the upgrade with a
1566 slave created prior to the upgrade.
1568 Any directories created after upgrading will use a new layout.
1569 .Sh UPGRADE INSTRUCTIONS HAMMER V2 TO V3
1570 This upgrade adds meta-data elements to the B-Tree.
1571 It is possible to upgrade a V2 file system to V3 in place.
1572 After issuing the upgrade be sure to run a
1575 to perform post-upgrade tasks.
1577 After making this upgrade running a
1582 directory for each PFS mount into
1583 .Pa /var/hammer/<pfs> .
1586 root mount will migrate
1589 .Pa /var/hammer/root .
1590 Migration occurs only once and only if you have not specified
1591 a snapshots directory in the PFS configuration.
1592 If you have specified a snapshots directory in the PFS configuration no
1593 automatic migration will occur.
1595 For slaves, if you desire, you can migrate your snapshots
1596 config to the new location manually and then clear the
1597 snapshot directory configuration in the slave PFS.
1598 The new snapshots hierarchy is designed to work with
1599 both master and slave PFSs equally well.
1601 In addition, the old config file will be moved to file system meta-data,
1602 editable via the new
1606 The old config file will be deleted.
1607 Migration occurs only once.
1609 The V3 file system has new
1611 directives for creating snapshots.
1612 All snapshot directives, including the original, will create
1613 meta-data entries for the snapshots and the pruning code will
1614 automatically incorporate these entries into its list and
1615 expire them the same way it expires softlinks.
1616 If you accidentally remove your snapshot softlinks you can use the
1618 directive to get a definitive list from the file system meta-data and
1619 regenerate them from that list.
1624 to back up file systems your scripts may be using the
1626 directive to generate transaction ids.
1627 This directive does not create a snapshot.
1628 You will have to modify your scripts to use the
1630 directive to generate the linkbuf for the softlink you create, or
1631 use one of the other
1636 directive will continue to work as expected and in V3 it will also
1637 record the snapshot transaction id in file system meta-data.
1638 You may also want to make use of the new
1640 tag for the meta-data.
1643 If you used to remove snapshot softlinks with
1645 you should probably start using the
1647 directive instead to also remove the related meta-data.
1648 The pruning code scans the meta-data so just removing the
1649 softlink is not sufficient.
1650 .Sh UPGRADE INSTRUCTIONS HAMMER V3 TO V4
1651 This upgrade changes undo/flush, giving faster sync.
1652 It is possible to upgrade a V3 file system to V4 in place.
1653 This upgrade reformats the UNDO/REDO FIFO (typically 1GB),
1654 so the upgrade might take a minute or two.
1656 Version 4 allows the UNDO/REDO FIFO to be flushed without also having
1657 to flush the volume header, removing 2 of the 4 disk syncs typically
1660 and removing 1 of the 2 disk syncs typically
1661 required for a flush sequence.
1662 Version 4 also implements the REDO log (see
1663 .Sx FSYNC FLUSH MODES
1664 below) which is capable
1665 of fsync()ing with either one disk flush or zero disk flushes.
1666 .Sh UPGRADE INSTRUCTIONS HAMMER V4 TO V5
1667 This upgrade brings in deduplication support.
1668 It is possible to upgrade a V4 file system to V5 in place.
1669 Technically it makes the layer2
1671 field a signed value instead of unsigned, allowing it to go negative.
1672 A version 5 filesystem is required for dedup operation.
1673 .Sh UPGRADE INSTRUCTIONS HAMMER V5 TO V6
1674 It is possible to upgrade a V5 file system to V6 in place.
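.Pp
The in-place upgrades described in the sections above are all issued through
the
.Cm version-upgrade
directive, and an upgraded file system cannot be downgraded.
The following is a minimal sketch of a pre-flight check that an administrator
might wrap around that directive.
The function names
.Li can_upgrade
and
.Li plan_upgrade
are hypothetical and are not part of
.Nm ;
the sketch only prints the command rather than running it.
.Bd -literal -offset indent
```shell
# Hypothetical pre-flight check, not part of the hammer utility.
# Succeeds only when the requested version is newer than the current one,
# because an upgraded file system cannot be downgraded.
can_upgrade() {
    current=$1
    target=$2
    [ "$target" -gt "$current" ]
}

# Print (rather than run) the command an administrator would issue.
plan_upgrade() {
    fs=$1
    current=$2
    target=$3
    if can_upgrade "$current" "$target"; then
        echo "hammer version-upgrade $fs $target"
    else
        echo "refusing: version $target is not newer than $current" >&2
        return 1
    fi
}
```
.Ed
.Pp
For example,
.Li plan_upgrade /home 5 6
prints the upgrade command for a hypothetical file system mounted on
.Pa /home ,
while
.Li plan_upgrade /home 6 5
fails without printing a command.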
1675 .Sh FSYNC FLUSH MODES
1677 implements five different fsync flush modes via the
1678 .Va vfs.hammer.fsync_mode
1681 version 4+ file systems.
1685 fsync mode 3 is set by default.
1686 REDO operation and recovery is enabled by default.
1687 .Bl -tag -width indent
1689 Full synchronous fsync semantics without REDO.
1692 will not generate REDOs.
1695 will completely sync
1696 the data and meta-data and double-flush the FIFO, including
1697 issuing two disk synchronization commands.
1698 The data is guaranteed
1699 to be on the media as of when
1702 Needless to say, this is slow.
1704 Relaxed asynchronous fsync semantics without REDO.
1706 This mode works the same as mode 0 except the last disk synchronization
1707 command is not issued.
1708 It is faster than mode 0 but not even remotely
1709 close to the speed you get with mode 2 or mode 3.
1711 Note that there is no chance of meta-data corruption when using this
1712 mode; it simply means that the data you wrote and then
1714 might not have made it to the media if the storage system crashes at a bad
1718 Full synchronous fsync semantics using REDO.
1719 NOTE: If not running a
1721 version 4 filesystem or later mode 0 is silently used.
1724 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1725 If this is sufficient to satisfy the
1727 operation the blocks will be written out and
1729 will wait for the I/Os to complete,
1730 and then follow up with a disk sync command to guarantee the data
1731 is on the media before returning.
1732 This is slower than mode 3 and can result in significant disk or
1733 SSD overhead, though not as bad as mode 0 or mode 1.
1736 Relaxed asynchronous fsync semantics using REDO.
1737 NOTE: If not running a
1739 version 4 filesystem or later mode 1 is silently used.
1742 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1743 If this is sufficient to satisfy the
1745 operation the blocks
1746 will be written out and
1748 will wait for the I/Os to complete,
1751 issue a disk synchronization command.
1753 Note that there is no chance of meta-data corruption when using this
1754 mode; it simply means that the data you wrote and then
1757 not have made it to the media if the storage system crashes at a bad
1760 This mode is the fastest production fsyncing mode available.
1761 This mode is equivalent to how the UFS fsync in the
1771 This mode is primarily designed
1772 for testing and should not be used on a production system.
1774 .Sh RESTORING FROM A SNAPSHOT BACKUP
1775 You restore a snapshot by copying it over to live, but there is a caveat.
1776 The mtime and atime fields for files accessed via a snapshot are locked
1777 to the ctime in order to keep the snapshot consistent, because neither
1778 mtime nor atime changes roll any history.
1780 In order to avoid unnecessary copying it is recommended that you use
1784 when doing the copyback.
1785 Also make sure you traverse the snapshot softlink by appending a ".",
1786 as in "<snapshotpath>/.", and you match up the directory properly.
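.Pp
The trailing
.Dq /.
convention above can be captured in a small shell helper.
This is a minimal sketch under stated assumptions:
.Li snapshot_source
is a hypothetical name, and the choice of copy tool is left to the reader.
.Bd -literal -offset indent
```shell
# Hypothetical helper: turn a snapshot softlink path into a source argument
# that makes the copy tool traverse the link instead of copying the link.
snapshot_source() {
    path=$1
    # Drop any trailing slashes before appending "/.".
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    echo "$path/."
}
```
.Ed
.Pp
The result can then be passed as the source argument of the copyback, with the
matching live directory as the destination.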
1787 .Sh RESTORING A PFS FROM A MIRROR
1788 A PFS can be restored from a mirror with
1791 data must be copied separately.
1792 Finally, the PFS can be upgraded to master using
1795 It is not possible to restore the root PFS (PFS# 0) by using mirroring,
1796 as the root PFS is always a master PFS.
1797 A normal copy (e.g.\& using
1799 must be done, ignoring history.
1800 If history is important, the old root PFS can be restored to a new PFS, and
1801 important directories/files can be
1803 mounted to the new PFS.
1805 The following environment variables affect the execution of
1807 .Bl -tag -width ".Ev EDITOR"
1809 The editor program specified in the variable
1811 will be invoked instead of the default editor, which is
1814 The command specified in the variable
1816 will be used to initiate remote operations for the mirror-copy and
1817 mirror-stream commands instead of the default command, which is
1819 The program will be invoked via
1824 .Cm -l user host <remote-command>
1832 .Bl -tag -width ".It Pa <fs>/var/slaves/<name>" -compact
1833 .It Pa <pfs>/snapshots
1834 default per PFS snapshots directory
1837 .It Pa /var/hammer/<pfs>
1838 default per PFS snapshots directory (not root)
1841 .It Pa /var/hammer/root
1842 default snapshots directory for root directory
1845 .It Pa <snapshots>/config
1852 .It Pa <fs>/var/slaves/<name>
1853 recommended slave PFS snapshots directory
1857 recommended PFS directory
1866 .Xr periodic.conf 5 ,
1868 .Xr mount_hammer 8 ,
1870 .Xr newfs_hammer 8 ,
1876 utility first appeared in
1879 .An Matthew Dillon Aq dillon@backplane.com