1 .\" Copyright (c) 2007 The DragonFly Project. All rights reserved.
3 .\" This code is derived from software contributed to The DragonFly Project
4 .\" by Matthew Dillon <dillon@backplane.com>
6 .\" Redistribution and use in source and binary forms, with or without
7 .\" modification, are permitted provided that the following conditions
10 .\" 1. Redistributions of source code must retain the above copyright
11 .\" notice, this list of conditions and the following disclaimer.
12 .\" 2. Redistributions in binary form must reproduce the above copyright
13 .\" notice, this list of conditions and the following disclaimer in
14 .\" the documentation and/or other materials provided with the
16 .\" 3. Neither the name of The DragonFly Project nor the names of its
17 .\" contributors may be used to endorse or promote products derived
18 .\" from this software without specific, prior written permission.
20 .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
21 .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
22 .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
23 .\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
24 .\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
25 .\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING,
26 .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
27 .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
28 .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
29 .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
30 .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
38 .Nd HAMMER file system utility
45 .Op Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
48 .\" .Op Fl s Ar linkpath
57 This manual page documents the
59 utility which provides miscellaneous functions related to managing a
62 For a general introduction to the
64 file system, its features, and
65 examples on how to set up and maintain one, see
68 The options are as follows:
69 .Bl -tag -width indent
71 Tell the mirror commands to use a 2-way protocol, which allows
72 automatic negotiation of transaction id ranges.
73 This option is automatically enabled by the
79 will not attempt to break-up large initial bulk transfers into smaller
81 This can save time but if the link is lost in the middle of the
82 initial bulk transfer you will have to start over from scratch.
83 For more information see the
87 Specify a bandwidth limit in bytes per second for mirroring streams.
88 This option is typically used to prevent batch mirroring operations from
89 loading down the machine.
90 The bandwidth may be suffixed with
94 to specify values in kilobytes, megabytes, and gigabytes per second.
95 If no suffix is specified, bytes per second is assumed.
97 Unfortunately this is only applicable to the pre-compression bandwidth
98 when compression is used, so a better solution would probably be to
104 .It Fl C Ar cachesize Ns Op Ns Cm \&: Ns Ar readahead
105 Set the memory cache size for any raw
112 for megabytes is allowed,
113 else the cache size is specified in bytes.
115 The read-behind/read-ahead defaults to 4
119 This option is typically only used with diagnostic commands
120 as kernel-supported commands will use the kernel's buffer cache.
121 .It Fl c Ar cyclefile
122 When pruning, rebalancing or reblocking you can tell the utility
123 to start at the object id stored in the specified file.
124 If the file does not exist
126 will start at the beginning.
129 is told to run for a specific period of time
131 and is unable to complete the operation it will write out
132 the current object id so the next run can pick up where it left off.
135 runs to completion it will delete
141 will not check that the time period has elapsed if this option is given.
143 Specify the volumes making up a
147 is a colon-separated list of devices, each specifying a
153 Specify delay in seconds for
154 .Cm mirror-read-stream .
155 When maintaining a streaming mirror this option specifies the
156 minimum delay after a batch ends before the next batch is allowed
158 The default is five seconds.
160 Specify the maximum amount of memory
162 will allocate during a dedup pass.
163 Specify a suffix 'm', 'g', or 't' for megabytes, gigabytes, or terabytes.
166 will allocate up to 1G of ram to hold CRC/SHA tables while running dedup.
167 When the limit is reached the dedup code restricts the range of CRCs to
168 keep memory use within bounds and runs multiple passes as necessary until
169 the entire filesystem has been deduped.
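.Pp
The multi-pass behaviour described above can be pictured with a small Python sketch.
It is only an illustration under stated assumptions, not the algorithm the
utility implements: the function name crc_passes, the halving strategy, and
counting table entries instead of bytes are all hypothetical.
.Bd -literal -offset indent
```python
def crc_passes(crcs, max_entries, lo=0, hi=1 << 32):
    """Return half-open CRC ranges [lo, hi) to scan, one range per pass.

    Each range holds at most max_entries of the given CRCs, unless a
    single CRC value alone exceeds the limit (a range of width 1 cannot
    be split any further).
    """
    in_range = sum(1 for c in crcs if lo <= c < hi)
    if in_range <= max_entries or hi - lo == 1:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    # Too many entries for one pass: restrict the range and recurse.
    return (crc_passes(crcs, max_entries, lo, mid) +
            crc_passes(crcs, max_entries, mid, hi))


# Hypothetical CRC values; with room for two entries per pass the
# 32-bit CRC space is covered in two passes.
crcs = [0x00000010, 0x7FFFFFFF, 0x80000000, 0xFFFFFFF0]
passes = crc_passes(crcs, max_entries=2)
for lo, hi in passes:
    print(f"pass covers CRCs {lo:#010x} .. {hi - 1:#010x}")
```
.Ed
A smaller limit simply produces more, narrower passes, trading run time for memory.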
176 specification for the source and/or destination.
178 Decrease verboseness.
179 May be specified multiple times.
181 Specify recursion for those commands which support it.
182 .It Fl S Ar splitsize
183 Specify the bulk splitup size in bytes for mirroring streams.
188 will do an initial run-through of the data to calculate good
189 transaction ids to cut up the bulk transfers, creating
190 restart points in case the stream is interrupted.
191 If we don't do this and the stream is interrupted it might
192 have to start all over again.
197 At the moment the run-through is disk-bandwidth-heavy but some
198 future version will limit the run-through to just the B-Tree
199 records and not the record data.
201 The splitsize may be suffixed with
205 to specify values in kilobytes, megabytes, or gigabytes.
206 If no suffix is specified, bytes is assumed.
208 When mirroring very large filesystems the minimum recommended
210 A small split size may wind up generating a great deal of overhead
211 but very little actual incremental data and is not recommended.
213 Specify timeout in seconds.
214 When pruning, rebalancing, reblocking or mirror-reading
215 you can tell the utility to stop after a certain period of time.
216 A value of 0 means unlimited.
217 This option is used along with the
219 option to prune, rebalance or reblock incrementally.
221 Increase verboseness.
222 May be specified multiple times.
224 Enable compression for any remote ssh specifications.
225 This option is typically used with the mirroring directives.
229 for interactive questions.
232 The commands are as follows:
233 .Bl -tag -width indent
234 .\" ==== synctid ====
235 .It Cm synctid Ar filesystem Op Cm quick
236 Generate a guaranteed, formal 64-bit transaction id representing the
237 current state of the specified
240 The file system will be synced to the media.
244 keyword is specified the file system will be soft-synced, meaning that a
245 crash might still undo the state of the file system as of the transaction
246 id returned but any new modifications will occur after the returned
247 transaction id as expected.
249 This operation does not create a snapshot.
250 It is meant to be used
251 to track temporary fine-grained changes to a subset of files and
252 will only remain valid for
254 access purposes for the
256 period configured for the PFS.
257 If you desire a real snapshot then the
259 directive may be what you are looking for.
261 .It Cm bstats Op Ar interval
264 B-Tree statistics until interrupted.
267 seconds between each display.
268 The default interval is one second.
269 .\" ==== iostats ====
270 .It Cm iostats Op Ar interval
274 statistics until interrupted.
277 seconds between each display.
278 The default interval is one second.
279 .\" ==== history ====
280 .It Cm history Ns Oo Cm @ Ns Ar offset Ns Oo Cm \&, Ns Ar length Oc Oc Ar path ...
281 Show the modification history for inode and data of
286 is given, history is shown for the data block at the given offset,
287 otherwise history is shown for inode.
292 data bytes at given offset are dumped for each version,
297 this directive shows object id and sync status,
298 and for each object version it shows transaction id and time stamp.
299 Files have to exist for this directive to be applicable;
300 to track inodes which have been deleted or renamed see
302 .\" ==== blockmap ====
304 Dump the blockmap for the file system.
307 blockmap is a two-layer
308 blockmap representing the maximum possible file system size of 1 Exabyte.
309 Needless to say the second layer is only present for blocks which exist.
311 blockmap represents 8-Megabyte blocks, called big-blocks.
312 Each big-block has an append
313 point, a free byte count, and a typed zone id which allows content to be
314 reverse engineered to some degree.
318 allocations are essentially appended to a selected big-block using
319 the append offset and deducted from the free byte count.
320 When space is freed the free byte count is adjusted but
322 does not track holes in big-blocks for reallocation.
323 A big-block must be completely freed, either
324 through normal file system operations or through reblocking, before
327 Data blocks can be shared by deducting the space used from the free byte
328 count for each shared reference.
329 This means the free byte count can legally go negative.
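.Pp
The accounting rules above can be modelled in a few lines of Python.
This is a toy model for illustration only; the class and method names are
hypothetical and do not correspond to on-disk structures or kernel code.
.Bd -literal -offset indent
```python
BIG_BLOCK_SIZE = 8 * 1024 * 1024  # 8-Megabyte big-block, as described above


class BigBlock:
    """Toy model of one big-block: an append point and a free byte count."""

    def __init__(self):
        self.append_off = 0
        self.bytes_free = BIG_BLOCK_SIZE

    def allocate(self, nbytes):
        """Append an allocation; holes are never tracked for reuse."""
        if self.append_off + nbytes > BIG_BLOCK_SIZE:
            raise ValueError("big-block is full")
        offset = self.append_off
        self.append_off += nbytes
        self.bytes_free -= nbytes
        return offset

    def add_reference(self, nbytes):
        """Share an existing data block: deduct its size once more."""
        self.bytes_free -= nbytes

    def free(self, nbytes):
        """Release one reference; only the free byte count changes."""
        self.bytes_free += nbytes

    def reusable(self):
        """True only once the big-block is completely free."""
        return self.bytes_free == BIG_BLOCK_SIZE


bb = BigBlock()
for _ in range(128):          # 128 x 64 KiB fills the big-block exactly
    bb.allocate(64 * 1024)
bb.add_reference(64 * 1024)   # one block gains a second, shared reference
print(bb.bytes_free)          # -65536: the count has gone negative
```
.Ed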
331 This command needs the
334 .\" ==== checkmap ====
336 Check the blockmap allocation count.
338 will scan the B-Tree, collect allocation information, and
339 construct a blockmap in-memory.
340 It will then check that blockmap against the on-disk blockmap.
342 This command needs the
346 .It Cm show Op Ar localization Ns Op Cm \&: Ns Ar object_id
348 By default this command will validate all B-Tree
349 linkages and CRCs, including data CRCs, and will report the most verbose
350 information it can dig up.
351 Any errors will show up with a
353 in column 1 along with various
359 .Ar localization Ns Cm \&: Ns Ar object_id
361 search for the key printing nodes as it recurses down, and then
362 will iterate forwards.
363 These fields are specified in HEX.
364 Note that the pfsid is the top 16 bits of the 32-bit localization
365 field so PFS #1 would be 00010000.
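.Pp
As a worked example of the rule above, the following Python sketch computes
the hexadecimal localization base for a PFS id.
The function name is hypothetical and is used only for illustration.
.Bd -literal -offset indent
```python
def localization_for_pfs(pfs_id):
    """Return the 32-bit localization base for a PFS as 8 hex digits.

    The PFS id occupies the top 16 bits of the localization field.
    """
    if not 0 <= pfs_id <= 0xFFFF:
        raise ValueError("PFS id must fit in 16 bits")
    return format(pfs_id << 16, "08x")


print(localization_for_pfs(1))  # 00010000
```
.Ed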
369 the command will report less information about the inode contents.
373 the command will not report the content of the inode or other typed
378 the command will not report volume header information, big-block fill
379 ratios, mirror transaction ids, or report or check data CRCs.
380 B-Tree CRCs and linkages are still checked.
382 This command needs the
385 .\" ==== show-undo ====
389 Dump the UNDO/REDO map.
391 This command needs the
395 .\" Dump the B-Tree, record, large-data, and small-data blockmaps, showing
396 .\" physical block assignments and free space percentages.
397 .\" ==== recover ====
398 .It Cm recover Ar targetdir
399 Recover data from a corrupted
402 This is a low level command which operates on the filesystem image and
403 attempts to locate and recover files from a corrupted filesystem.
404 The entire image is scanned linearly looking for B-Tree nodes.
406 found which passes its CRC test is scanned for file, inode, and directory
407 fragments and the target directory is populated with the resulting data.
408 Files and directories in the target directory are initially named after
409 the object id and are renamed as fragmentary information is processed.
411 This command keeps track of filename/object_id translations and may eat a
412 considerable amount of memory while operating.
414 This command is literally the last line of defense when it comes to
415 recovering data from a dead filesystem.
417 This command needs the
420 .\" ==== namekey1 ====
421 .It Cm namekey1 Ar filename
424 64-bit directory hash for the specified file name, using
425 the original directory hash algorithm in version 1 of the file system.
426 The low 32 bits are used as an iterator for hash collisions and will be
428 .\" ==== namekey2 ====
429 .It Cm namekey2 Ar filename
432 64-bit directory hash for the specified file name, using
433 the new directory hash algorithm in version 2 of the file system.
434 The low 32 bits are still used as an iterator but will start out containing
435 part of the hash key.
436 .\" ==== namekey32 ====
437 .It Cm namekey32 Ar filename
438 Generate the top 32 bits of a
440 64 bit directory hash for the specified file name.
443 Show extended information about
446 The information is divided into sections:
447 .Bl -tag -width indent
448 .It Volume identification
449 General information, like the label of the
451 filesystem, the number of volumes it contains, the FSID, and the
454 .It Big block information
455 Big block statistics, such as total, used, reserved and free big blocks.
456 .It Space information
457 Information about space used on the filesystem.
458 Currently total size, used, reserved and free space are displayed.
460 Basic information about the PFSs currently present on a
465 is the ID of the PFS, with 0 being the root PFS.
467 is the current snapshot count on the PFS.
469 displays the mount point the PFS is currently mounted on (if any).
471 .\" ==== cleanup ====
472 .It Cm cleanup Op Ar filesystem ...
473 This is a meta-command which executes snapshot, prune, rebalance, dedup
474 and reblock commands on the specified
479 is specified this command will clean-up all
481 file systems in use, including PFS's.
482 To do this it will scan all
486 mounts, extract PFS id's, and clean-up each PFS found.
488 This command will access a snapshots
489 directory and a configuration file for each
491 creating them if necessary.
492 .Bl -tag -width indent
493 .It Nm HAMMER No version 2-
494 The configuration file is
496 in the snapshots directory which defaults to
497 .Pa <pfs>/snapshots .
498 .It Nm HAMMER No version 3+
499 The configuration file is saved in file system meta-data, see
502 The snapshots directory defaults to
503 .Pa /var/hammer/<pfs>
504 .Pa ( /var/hammer/root
508 The format of the configuration file is:
509 .Bd -literal -offset indent
510 snapshots <period> <retention-time> [any]
511 prune <period> <max-runtime>
512 rebalance <period> <max-runtime>
513 dedup <period> <max-runtime>
514 reblock <period> <max-runtime>
515 recopy <period> <max-runtime>
519 .Bd -literal -offset indent
520 snapshots 1d 60d # 0d 0d for PFS /tmp, /var/tmp, /usr/obj
528 Time is given with a suffix of
534 meaning day, hour, minute and second.
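.Pp
A minimal Python sketch of how such time values map to seconds follows.
The suffix letters d, h, m and s are an assumption inferred from the
example line above, and the function name is hypothetical; values without
a suffix are deliberately rejected because their handling is not described here.
.Bd -literal -offset indent
```python
import re

# Assumed suffix letters: day, hour, minute and second.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_period(text):
    """Convert a value such as '1d' or '5m' into seconds."""
    match = re.fullmatch(r"([0-9]+)([dhms])", text)
    if match is None:
        raise ValueError(f"unsupported time value: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


print(parse_period("1d"))   # 86400
print(parse_period("60d"))  # 5184000
```
.Ed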
538 directive has a period of 0 and a retention time of 0
539 then snapshot generation is disabled, removal of old snapshots is
540 disabled, and prunes will use
541 .Cm prune-everything .
545 directive has a period of 0 but a non-zero retention time
546 then this command will not create any new snapshots but will remove old
547 snapshots it finds based on the retention time.
549 used on PFS masters where you are generating your own snapshot softlinks
550 manually and on PFS slaves when all you wish to do is prune away existing
551 snapshots inherited via the mirroring stream.
553 By default only snapshots in the form
554 .Ql snap- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
558 directive is specified as a third argument on the
560 config line then any softlink of the form
561 .Ql *- Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
563 .Ql *. Ns Ar yyyymmdd Ns Op - Ns Ar HHMM
566 A period of 0 for prune, rebalance, dedup, reblock or recopy disables the directive.
567 A max-runtime of 0 means unlimited.
569 If period hasn't passed since the previous
572 For example a day has passed when midnight is passed (localtime).
575 flag is given the period is ignored.
583 The default configuration file will create a daily snapshot, do a daily
584 pruning, rebalancing, deduping and reblocking run and a monthly recopy run.
585 Reblocking is defragmentation with a level of 95%,
586 and recopy is full defragmentation.
588 By default prune, dedup and rebalance operations are time limited to 5 minutes,
589 and reblock operations to a bit over 5 minutes,
590 and recopy operations to a bit over 10 minutes.
591 Reblocking and recopy runs are each broken down into four separate functions:
592 btree, inodes, dirs and data.
593 Each function is time limited to the time given in the configuration file,
594 but the btree, inodes and dirs functions usually do not take very long;
595 full defragmentation is always used for these three functions.
596 Also note that this directive will by default disable snapshots on
603 The defaults may be adjusted by modifying the configuration file.
604 The pruning and reblocking commands automatically maintain a cyclefile
605 for incremental operation.
606 If you interrupt (^C) the program the cyclefile will be updated,
608 may continue to run in the background for a few seconds until the
610 ioctl detects the interrupt.
613 PFS option can be set to use another location for the snapshots directory.
615 Work on this command is still in progress.
617 An ability to remove snapshots dynamically as the
618 file system becomes full.
620 .It Cm config Op Ar filesystem Op Ar configfile
623 Show or change configuration for
625 If zero or one arguments are specified this function dumps the current
626 configuration file to stdout.
627 Zero arguments specifies the PFS containing the current directory.
628 This configuration file is stored in file system meta-data.
629 If two arguments are specified this function installs a new config file.
633 versions less than 3 the configuration file is by default stored in
634 .Pa <pfs>/snapshots/config ,
635 but in all later versions the configuration file is stored in file system
637 .\" ==== viconfig ====
638 .It Cm viconfig Op Ar filesystem
641 Edit the configuration file and reinstall it into file system meta-data when done.
642 Zero arguments specifies the PFS containing the current directory.
643 .\" ==== volume-add ====
644 .It Cm volume-add Ar device Ar filesystem
651 and add all of its space to
655 file system can use up to 256 volumes.
658 All existing data contained on
660 will be destroyed by this operation!
665 file system, formatting will be denied.
666 You can overcome this sanity check by using
668 to erase the beginning sectors of the device.
670 Remember that you have to specify
672 together with any other devices that make up the file system,
679 is the root file system, also remember to add
682 .Va vfs.root.mountfrom
684 .Pa /boot/loader.conf ,
687 .\" ==== volume-del ====
688 .It Cm volume-del Ar device Ar filesystem
694 Remember that you have to remove
696 from the colon-separated list in
702 is the root file system, also remember to remove
705 .Va vfs.root.mountfrom
707 .Pa /boot/loader.conf ,
710 .\" ==== volume-list ====
711 .It Cm volume-list Ar filesystem
712 List the volumes that make up
714 .\" ==== snapshot ====
715 .It Cm snapshot Oo Ar filesystem Oc Ar snapshot-dir
716 .It Cm snapshot Ar filesystem Ar snapshot-dir Op Ar note
717 Take a snapshot of the file system either explicitly given by
719 or implicitly derived from the
721 argument and create a symlink in the directory provided by
723 pointing to the snapshot.
726 is not a directory, it is assumed to be a format string passed to
728 with the current time as parameter.
731 refers to an existing directory, a default format string of
733 is assumed and used as name for the newly created symlink.
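.Pp
The name expansion can be reproduced with Python's time.strftime, which
follows the same conversion specifications.
This is only a sketch: the helper name is hypothetical, and the default
format string shown is an assumption inferred from the snap-yyyymmdd-HHMM
link names that the utility produces.
.Bd -literal -offset indent
```python
import time

# Assumed default, inferred from link names such as snap-20080627-1210.
DEFAULT_FORMAT = "snap-%Y%m%d-%H%M"


def snapshot_link_name(fmt, when):
    """Expand a snapshot-dir format string for the given time."""
    return time.strftime(fmt, when)


# A fixed time is used so that the output is reproducible.
when = time.strptime("2008-06-27 12:10", "%Y-%m-%d %H:%M")
print(snapshot_link_name("/mysnapshots/obj-%Y-%m-%d", when))
# /mysnapshots/obj-2008-06-27
print(snapshot_link_name(DEFAULT_FORMAT, when))
# snap-20080627-1210
```
.Ed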
735 Snapshot is a per PFS operation, so each PFS in a
737 file system has to be snapshotted separately.
739 Example, assuming that
747 are file systems on their own, the following invocations:
748 .Bd -literal -offset indent
749 hammer snapshot /mysnapshots
751 hammer snapshot /mysnapshots/%Y-%m-%d
753 hammer snapshot /obj /mysnapshots/obj-%Y-%m-%d
755 hammer snapshot /usr /my/snaps/usr "note"
758 Would create symlinks similar to:
759 .Bd -literal -offset indent
760 /mysnapshots/snap-20080627-1210 -> /@@0x10d2cd05b7270d16
762 /mysnapshots/2008-06-27 -> /@@0x10d2cd05b7270d16
764 /mysnapshots/obj-2008-06-27 -> /obj@@0x10d2cd05b7270d16
766 /my/snaps/usr/snap-20080627-1210 -> /usr@@0x10d2cd05b7270d16
771 version 3+ file system the snapshot is also recorded in file system meta-data
772 along with the optional
778 .It Cm snap Ar path Op Ar note
781 Create a snapshot for the PFS containing
783 and create a snapshot softlink.
784 If the path specified is a
785 directory a standard snapshot softlink will be created in the directory.
786 The snapshot softlink points to the base of the mounted PFS.
787 .It Cm snaplo Ar path Op Ar note
790 Create a snapshot for the PFS containing
792 and create a snapshot softlink.
793 If the path specified is a
794 directory a standard snapshot softlink will be created in the directory.
795 The snapshot softlink points into the directory it is contained in.
796 .It Cm snapq Ar dir Op Ar note
799 Create a snapshot for the PFS containing the specified directory but do
800 not create a softlink.
801 Instead output a path which can be used to access
802 the directory via the snapshot.
804 An absolute or relative path may be specified.
805 The path will be used as-is as a prefix in the path output to stdout.
807 snap and snapshot directives the snapshot transaction id will be registered
808 in the file system meta-data.
809 .It Cm snaprm Ar path Ar ...
810 .It Cm snaprm Ar transaction_id Ar ...
811 .It Cm snaprm Ar filesystem Ar transaction_id Ar ...
814 Remove a snapshot given its softlink or transaction id.
815 If specifying a transaction id
816 the snapshot is removed from file system meta-data but you are responsible
817 for removing any related softlinks.
819 If a softlink path is specified the filesystem and transaction id
820 are derived from the contents of the softlink.
821 If just a transaction id is specified it is assumed to be a snapshot in the
823 filesystem you are currently chdir'd into.
824 You can also specify the filesystem and transaction id explicitly.
825 .It Cm snapls Op Ar path ...
828 Dump the snapshot meta-data for PFSs containing each
830 listing all available snapshots and their notes.
831 If no arguments are specified snapshots for the PFS containing the
832 current directory are listed.
833 This is the definitive list of snapshots for the file system.
835 .It Cm prune Ar softlink-dir
836 Prune the file system based on previously created snapshot softlinks.
837 Pruning is the act of deleting file system history.
840 command will delete file system history such that
841 the file system state is retained for the given snapshots,
842 and all history after the latest snapshot.
843 By setting the per PFS parameter
845 history is guaranteed to be saved at least this time interval.
846 All other history is deleted.
848 The target directory is expected to contain softlinks pointing to
849 snapshots of the file systems you wish to retain.
850 The directory is scanned non-recursively and the mount points and
851 transaction ids stored in the softlinks are extracted and sorted.
852 The file system is then explicitly pruned according to what is found.
853 Cleaning out portions of the file system is as simple as removing a
854 snapshot softlink and then running the
858 As a safety measure pruning only occurs if one or more softlinks are found
861 snapshot id extension.
862 Currently the scanned softlink directory must contain softlinks pointing
866 The softlinks may specify absolute or relative paths.
867 Softlinks must use 20-character
869 transaction ids, as might be returned from
870 .Nm Cm synctid Ar filesystem .
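.Pp
To make the softlink format concrete, the following Python sketch splits a
softlink target into its path and transaction id and sorts the results, in
the spirit of the scan described above.
The regular expression and function name are hypothetical; the id layout
(@@0x followed by 16 hexadecimal digits, 20 characters in total) is taken
from the softlink examples in this manual page.
.Bd -literal -offset indent
```python
import re

# "@@0x" plus 16 hex digits gives the 20-character transaction id.
SNAP_RE = re.compile(r"^(?P<path>.*)@@(?P<tid>0x[0-9a-f]{16})$")


def parse_snapshot_link(target):
    """Return (path, transaction id) or None if the target does not match."""
    match = SNAP_RE.match(target)
    if match is None:
        return None
    return match.group("path"), int(match.group("tid"), 16)


targets = [
    "/usr/obj/@@0x10d2cd13f3fde98f",
    "/usr/obj/@@0x10d2cd05b7270d16",
    "/usr/obj/not-a-snapshot",
]
found = sorted(p for p in map(parse_snapshot_link, targets) if p is not None)
for path, tid in found:
    print(path, f"{tid:#018x}")
```
.Ed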
872 Pruning is a per PFS operation, so each PFS in a
874 file system has to be pruned separately.
876 Note that pruning a file system may not immediately free-up space,
877 though typically some space will be freed if a large number of records are
879 The file system must be reblocked to completely recover all available space.
881 For example, let's say that you didn't set
883 and the snapshot directory contains the following links:
884 .Bd -literal -offset indent
885 lrwxr-xr-x 1 root wheel 29 May 31 17:57 snap1 ->
886 /usr/obj/@@0x10d2cd05b7270d16
888 lrwxr-xr-x 1 root wheel 29 May 31 17:58 snap2 ->
889 /usr/obj/@@0x10d2cd13f3fde98f
891 lrwxr-xr-x 1 root wheel 29 May 31 17:59 snap3 ->
892 /usr/obj/@@0x10d2cd222adee364
895 If you were to run the
897 command on this directory, then the
900 mount will be pruned to retain the above three snapshots.
901 In addition, history for modifications made to the file system older than
902 the oldest snapshot will be destroyed and history for potentially fine-grained
903 modifications made to the file system more recently than the most recent
904 snapshot will be retained.
906 If you then delete the
908 softlink and rerun the
911 history for modifications pertaining to that snapshot would be destroyed.
915 file system versions 3+ this command also scans the snapshots stored
916 in the file system meta-data and includes them in the prune.
917 .\" ==== prune-everything ====
918 .It Cm prune-everything Ar filesystem
919 Remove all historical records from
921 Use this directive with caution on PFSs where you intend to use history.
923 This command does not remove snapshot softlinks but will delete all
924 snapshots recorded in file system meta-data (for file system version 3+).
925 The user is responsible for deleting any softlinks.
927 Pruning is a per PFS operation, so each PFS in a
929 file system has to be pruned separately.
930 .\" ==== rebalance ====
931 .It Cm rebalance Ar filesystem Op Ar saturation_percentage
932 Rebalance the B-Tree; nodes with a small number of
933 elements will be combined and element counts will be smoothed out
936 The saturation percentage is between 50% and 100%.
937 The default is 85% (the
939 suffix is not needed).
941 Rebalancing is a per PFS operation, so each PFS in a
943 file system has to be rebalanced separately.
945 .It Cm dedup Ar filesystem
948 Perform offline (post-process) deduplication.
949 Deduplication occurs at
950 the block level; currently only data blocks of the same size can be
951 deduped, metadata blocks cannot.
952 The hash function used for comparing
953 data blocks is CRC-32 (CRCs are computed anyway as part of
955 data integrity features, so there's no additional overhead).
956 Since CRC is a weak hash function a byte-by-byte comparison is done
957 before actual deduping.
958 In case of a CRC collision (two data blocks have the same CRC
959 but different contents) the checksum is upgraded to SHA-256.
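.Pp
The check-then-compare idea can be sketched in Python using zlib.crc32.
This is a simplified illustration, not the utility's implementation: the
function name is hypothetical, blocks are plain byte strings, and the
SHA-256 upgrade on a collision is replaced by simply keeping several
candidates per CRC.
.Bd -literal -offset indent
```python
import zlib


def find_duplicates(blocks):
    """Return (duplicate_index, original_index) pairs for identical blocks."""
    seen = {}        # (size, crc) -> indexes of distinct blocks with that key
    duplicates = []
    for i, data in enumerate(blocks):
        key = (len(data), zlib.crc32(data))
        for j in seen.get(key, []):
            # CRC-32 is weak, so confirm with a byte-by-byte comparison.
            if blocks[j] == data:
                duplicates.append((i, j))
                break
        else:
            # New content (or a CRC collision): remember it as a candidate.
            seen.setdefault(key, []).append(i)
    return duplicates


blocks = [b"a" * 4096, b"b" * 4096, b"a" * 4096, b"a" * 512]
print(find_duplicates(blocks))  # [(2, 0)]
```
.Ed
Only the third block is reported: the fourth has the same content pattern
but a different size, so it is never compared.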
963 reblocker may partially blow up (re-expand) dedup (reblocker's normal
964 operation is to reallocate every record, so it's possible for deduped
965 blocks to be re-expanded back).
967 Deduplication is a per PFS operation, so each PFS in a
969 file system has to be deduped separately.
971 means that if you have duplicated data in two different PFSs that data
972 won't be deduped; however, the addition of such a feature is planned.
976 option should be used to limit memory use during the dedup run if the
977 default 1G limit is too much for the machine.
978 .\" ==== dedup-simulate ====
979 .It Cm dedup-simulate Ar filesystem
980 Shows potential space savings (simulated dedup ratio) one can get after
984 If the estimated dedup ratio is greater than 1.00 you will see
986 Remember that this is an estimated number; in
987 practice the real dedup ratio will be slightly smaller because of
989 bigblock underflows, B-Tree locking issues and other factors.
991 Note that deduplication currently works only on bulk data so if you
996 commands on a PFS that contains metadata only (directory entries,
997 softlinks) you will get a 0.00 dedup ratio.
1001 option should be used to limit memory use during the dedup run if the
1002 default 1G limit is too much for the machine.
1003 .\" ==== reblock* ====
1004 .It Cm reblock Ar filesystem Op Ar fill_percentage
1005 .It Cm reblock-btree Ar filesystem Op Ar fill_percentage
1006 .It Cm reblock-inodes Ar filesystem Op Ar fill_percentage
1007 .It Cm reblock-dirs Ar filesystem Op Ar fill_percentage
1008 .It Cm reblock-data Ar filesystem Op Ar fill_percentage
1009 Attempt to defragment and free space for reuse by reblocking a live
1012 Big-blocks cannot be reused by
1014 until they are completely free.
1015 This command also has the effect of reordering all elements, effectively
1016 defragmenting the file system.
1018 The default fill percentage is 100% and will cause the file system to be
1019 completely defragmented.
1020 All specified element types will be reallocated and rewritten.
1021 If you wish to quickly free up space instead try specifying
1022 a smaller fill percentage, such as 90% or 80% (the
1024 suffix is not needed).
1026 Since this command may rewrite the entire contents of the disk it is
1027 best to do it incrementally from a
1033 options to limit the run time.
1034 The file system would thus be defragmented over a long period of time.
1036 It is recommended that separate invocations be used for each data type.
1037 B-Tree nodes, inodes, and directories are typically the most important
1038 elements needing defragmentation.
1039 Data can be defragmented over a longer period of time.
1041 Reblocking is a per PFS operation, so each PFS in a
1043 file system has to be reblocked separately.
1044 .\" ==== pfs-status ====
1045 .It Cm pfs-status Ar dirpath ...
1046 Retrieve the mirroring configuration parameters for the specified
1048 file systems or pseudo-filesystems (PFS's).
1049 .\" ==== pfs-master ====
1050 .It Cm pfs-master Ar dirpath Op Ar options
1051 Create a pseudo-filesystem (PFS) inside a
1054 Up to 65536 PFSs can be created.
1055 Each PFS uses an independent inode numbering space making it suitable
1060 directive creates a PFS that you can read, write, and use as a mirroring
1063 A PFS can only be truly destroyed with the
1066 Removing the softlink will not destroy the underlying PFS.
1068 A PFS can only be created in the root PFS (PFS# 0),
1069 not in a PFS created by
1075 It is recommended that
1081 directory at root of
1085 It is recommended to use a
1087 mount to access a PFS, except for the root PFS; for more information see
1089 .\" ==== pfs-slave ====
1090 .It Cm pfs-slave Ar dirpath Op Ar options
1091 Create a pseudo-filesystem (PFS) inside a
1094 Up to 65536 PFSs can be created.
1095 Each PFS uses an independent inode numbering space making it suitable
1100 directive creates a PFS that you can use as a mirroring source or target.
1101 You will not be able to access a slave PFS until you have completed the
1102 first mirroring operation with it as the target (its root directory will
1103 not exist until then).
1105 Access to the pfs-slave via the special softlink, as described in the
1110 dynamically modify the snapshot transaction id by returning a dynamic result
1115 A PFS can only be truly destroyed with the
1118 Removing the softlink will not destroy the underlying PFS.
1120 A PFS can only be created in the root PFS (PFS# 0),
1121 not in a PFS created by
1127 It is recommended that
1133 directory at root of
1137 It is recommended to use a
1139 mount to access a PFS, except for the root PFS; for more information see
1141 .\" ==== pfs-update ====
1142 .It Cm pfs-update Ar dirpath Op Ar options
1143 Update the configuration parameters for an existing
1145 file system or pseudo-filesystem.
1146 Options that may be specified:
1147 .Bl -tag -width indent
1148 .It Cm sync-beg-tid= Ns Ar 0x16llx
1149 This is the automatic snapshot access starting transaction id for
1151 This parameter is normally updated automatically by the
1155 It is important to note that accessing a mirroring slave
1156 with a transaction id greater than the last fully synchronized transaction
1157 id can result in an unreliable snapshot since you will be accessing
1158 data that is still undergoing synchronization.
1160 Manually modifying this field is dangerous and can result in a broken mirror.
1161 .It Cm sync-end-tid= Ns Ar 0x16llx
1162 This is the current synchronization point for mirroring slaves.
1163 This parameter is normally updated automatically by the
1167 Manually modifying this field is dangerous and can result in a broken mirror.
1168 .It Cm shared-uuid= Ns Ar uuid
1169 Set the shared UUID for this file system.
1170 All mirrors must have the same shared UUID.
1171 For safety purposes the
1173 directives will refuse to operate on a target with a different shared UUID.
1175 Changing the shared UUID on an existing, non-empty mirroring target,
1176 including an empty but not completely pruned target,
1177 can lead to corruption of the mirroring target.
1178 .It Cm unique-uuid= Ns Ar uuid
1179 Set the unique UUID for this file system.
1180 This UUID should not be used anywhere else,
1181 even on exact copies of the file system.
1182 .It Cm label= Ns Ar string
1183 Set a descriptive label for this file system.
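.Pp
For example, a label could be set as follows; the PFS path
.Pa /pfs/home
and the label text are hypothetical:
.Bd -literal -offset indent
hammer pfs-update /pfs/home label="home master"
.Ed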
1184 .It Cm snapshots= Ns Ar string
1185 Specify the snapshots directory which
1188 will use to manage this PFS.
1189 .Bl -tag -width indent
1190 .It Nm HAMMER No version 2-
1191 The snapshots directory does not need to be configured for
1192 PFS masters and will default to
1193 .Pa <pfs>/snapshots .
1195 PFS slaves are mirroring slaves so you cannot configure a snapshots
1196 directory on the slave itself to be managed by the slave's machine.
1197 In fact, the slave will likely have a
1199 sub-directory mirrored
1200 from the master, but that directory contains the configuration the master
1201 is using for its copy of the file system, not the configuration that we
1202 want to use for our slave.
1204 It is recommended that
1205 .Pa <fs>/var/slaves/<name>
1206 be configured for a PFS slave, where
1212 is an appropriate label.
1213 .It Nm HAMMER No version 3+
1214 The snapshots directory does not need to be configured for PFS masters or
1216 The snapshots directory defaults to
1217 .Pa /var/hammer/<pfs>
1218 .Pa ( /var/hammer/root
1222 You can control snapshot retention on your slave independent of the master.
1223 .It Cm snapshots-clear
1226 directory path for this PFS.
1227 .It Cm prune-min= Ns Ar N Ns Cm d
1228 .It Cm prune-min= Ns Oo Ar N Ns Cm d/ Oc Ns \
1229 Ar hh Ns Op Cm \&: Ns Ar mm Ns Op Cm \&: Ns Ar ss
1230 Set the minimum fine-grained data retention period.
1232 always retains fine-grained history up to the most recent snapshot.
1233 You can extend the retention period further by specifying a non-zero
1235 Any snapshot softlinks within the retention period are ignored
1236 for the purposes of pruning (i.e.\& the fine grained history is retained).
1237 Number of days, hours, minutes and seconds are given as
1242 Because the transaction id in the snapshot softlink cannot be used
1243 to calculate a timestamp,
1245 uses the earlier of the
1249 field of the softlink to
1250 determine which snapshots fall within the retention period.
1251 Users must be sure to retain one of these two fields when manipulating
1254 .\" ==== pfs-upgrade ====
1255 .It Cm pfs-upgrade Ar dirpath
1256 Upgrade a PFS from slave to master operation.
1257 The PFS will be rolled back to the current end synchronization transaction id
1258 (removing any partial synchronizations), and will then become writable.
1262 currently supports only single masters and using
1263 this command can easily result in file system corruption
1264 if you don't know what you are doing.
1266 This directive will refuse to run if any programs have open descriptors
1267 in the PFS, including programs chdir'd into the PFS.
1268 .\" ==== pfs-downgrade ====
1269 .It Cm pfs-downgrade Ar dirpath
1270 Downgrade a master PFS from master to slave operation.
1271 The PFS becomes read-only and access will be locked to its
1274 This directive will refuse to run if any programs have open descriptors
1275 in the PFS, including programs chdir'd into the PFS.
1276 .\" ==== pfs-destroy ====
1277 .It Cm pfs-destroy Ar dirpath
1278 This permanently destroys a PFS.
1280 This directive will refuse to run if any programs have open descriptors
1281 in the PFS, including programs chdir'd into the PFS.
1282 As a safety measure the
1284 flag has no effect on this directive.
1285 .\" ==== mirror-read ====
1286 .It Cm mirror-read Ar filesystem Op Ar begin-tid
1287 Generate a mirroring stream to stdout.
1288 The stream ends when the transaction id space has been exhausted.
1290 may be a master or slave PFS.
1291 .\" ==== mirror-read-stream ====
1292 .It Cm mirror-read-stream Ar filesystem Op Ar begin-tid
1293 Generate a mirroring stream to stdout.
1294 Upon completion the stream is paused until new data is synced to the
1297 Operation continues until the pipe is broken.
1300 command for more details.
1301 .\" ==== mirror-write ====
1302 .It Cm mirror-write Ar filesystem
1303 Take a mirroring stream on stdin.
1305 must be a slave PFS.
1307 This command will fail if the
1309 configuration fields for the two file systems do not match.
1312 command for more details.
1314 If the target PFS does not exist this command will ask you whether
1315 you want to create a compatible PFS slave for the target or not.
1316 .\" ==== mirror-dump ====
1322 to dump an ASCII representation of the mirroring stream.
1323 .\" ==== mirror-copy ====
1324 .\".It Cm mirror-copy Ar [[user@]host:]filesystem [[user@]host:]filesystem
1325 .It Cm mirror-copy \
1326 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1327 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1328 This is a shortcut which pipes a
1333 If a remote host specification is made the program forks a
1339 on the appropriate host.
1340 The source may be a master or slave PFS, and the target must be a slave PFS.
1342 This command also establishes full duplex communication and turns on
1343 the 2-way protocol feature
1345 which automatically negotiates transaction id
1346 ranges without having to use a cyclefile.
1347 If the operation completes successfully the target PFS's
1350 Note that you must re-chdir into the target PFS to see the updated information.
1351 If you do not you will still be in the previous snapshot.
1353 If the target PFS does not exist this command will ask you whether
1354 you want to create a compatible PFS slave for the target or not.
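.Pp
As an illustration only, a local PFS could be copied to a slave PFS on a
remote machine as shown below.
The host name
.Li backuphost
and both paths are hypothetical.
.Bd -literal -offset indent
hammer mirror-copy /pfs/home backuphost:/backup/pfs/home
.Ed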
1355 .\" ==== mirror-stream ====
1356 .\".It Cm mirror-stream Ar [[user@]host:]filesystem [[user@]host:]filesystem
1357 .It Cm mirror-stream \
1358 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem \
1359 Oo Oo Ar user Ns Cm @ Oc Ns Ar host Ns Cm \&: Oc Ns Ar filesystem
1360 This is a shortcut which pipes a
1361 .Cm mirror-read-stream
1365 This command works similarly to
1367 but does not exit after the initial mirroring completes.
1368 The mirroring operation will resume as changes continue to be made to the
1370 The command is commonly used with
1374 options to keep the mirroring target in sync with the source on a continuing
1377 If the pipe is broken the command will automatically retry after sleeping
1379 The time slept will be 15 seconds plus the time given in the
1383 This command also detects the initial-mirroring case and spends some
1384 time scanning the B-Tree to find good break points, allowing the initial
1385 bulk mirroring operation to be broken down into 4GB pieces.
1386 This means that the user can kill and restart the operation and it will
1387 not have to start from scratch once it has gotten past the first chunk.
1390 option may be used to change the size of pieces and the
1392 option may be used to disable this feature and perform an initial bulk
1394 .\" ==== version ====
1395 .It Cm version Ar filesystem
1396 This command returns the
1398 file system version for the specified
1400 as well as the range of versions supported in the kernel.
1403 option may be used to remove the summary at the end.
1404 .\" ==== version-upgrade ====
1405 .It Cm version-upgrade Ar filesystem Ar version Op Cm force
1411 Once upgraded a file system may not be downgraded.
1412 If you wish to upgrade a file system to a version greater than or equal to the
1413 work-in-progress (WIP) version number you must specify the
1416 Use of WIP versions should be relegated to testing and may require wiping
1417 the file system as development progresses, even though the WIP version might
1421 This command operates on the entire
1423 file system and is not a per PFS operation.
1424 All PFSs will be affected.
1425 .Bl -tag -width indent
1428 default version, first
1433 New directory entry layout.
1434 This version uses a new directory hash key.
1437 New snapshot management, using file system meta-data for saving
1438 configuration file and snapshots (transaction ids etc.).
1439 Also, the default snapshots directory has changed.
1443 New undo/redo/flush, giving
1445 a much faster sync and fsync.
1448 Deduplication support.
1451 Directory hash ALG1.
1452 Tends to maintain inode number / directory name entry ordering better
1453 for files after minor renaming.
1456 .Sh PSEUDO-FILESYSTEM (PFS) NOTES
1457 The root of a PFS is not hooked into the primary
1459 file system as a directory.
1462 creates a special softlink called
1464 (exactly 10 characters long) in the primary
1468 then modifies the contents of the softlink as read by
1470 and thus what you see with an
1472 command or if you were to
1475 If the PFS is a master the link reflects the current state of the PFS.
1476 If the PFS is a slave the link reflects the last completed snapshot, and the
1477 contents of the link will change when the next snapshot is completed, and
1482 utility employs numerous safeties to reduce user foot-shooting.
1485 directive requires that the target be configured as a slave and that the
1487 field of the mirroring source and target match.
1488 .Sh DOUBLE_BUFFER MODE
1489 There is a limit to the number of vnodes the kernel can cache, and because
1490 file buffers are associated with a vnode the related data cache can get
1491 blown away when operating on large numbers of files even if the system has
1492 sufficient memory to hold the file data.
1496 double buffer mode by setting the
1499 .Va vfs.hammer.double_buffer
1502 will cache file data via the block device and copy it into the per-file
1503 buffers as needed. The data will be double-cached at least until the
1504 buffer cache throws away the file buffer.
1505 This mode is typically used in conjunction with
1508 .Va vm.swapcache.data_enable
1509 is turned on in order to prevent unnecessary re-caching of file data
1510 due to vnode recycling.
1511 The swapcache will save the cached VM pages related to
1514 device (which doesn't recycle unless you umount the filesystem) instead
1515 of the cached VM pages backing the file vnodes.
1517 Double buffering should also be turned on if live dedup is enabled via
1518 .Va vfs.hammer.live_dedup .
1519 This is because the live dedup must validate the contents of a potential
1520 duplicate file block and it must run through the block device to do that
1521 and not the file vnode.
1522 If double buffering is not enabled then live dedup will create extra disk
1523 reads to validate potential data duplicates.
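.Pp
As a minimal sketch, the two nodes named above could be turned on at run time
as follows.
The value 1 is an assumption here and should be checked against the node
descriptions on the target system.
.Bd -literal -offset indent
# assumed values; inspect the nodes with "sysctl -d" before changing them
sysctl vfs.hammer.double_buffer=1
sysctl vm.swapcache.data_enable=1
.Ed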
1524 .Sh UPGRADE INSTRUCTIONS HAMMER V1 TO V2
1525 This upgrade changes the way directory entries are stored.
1526 It is possible to upgrade a V1 file system to V2 in place, but
1527 directories created prior to the upgrade will continue to use
1530 Note that the slave mirroring code in the target kernel had bugs in
1531 V1 which can create an incompatible root directory on the slave.
1534 master created after the upgrade with a
1536 slave created prior to the upgrade.
1538 Any directories created after upgrading will use a new layout.
1539 .Sh UPGRADE INSTRUCTIONS HAMMER V2 TO V3
1540 This upgrade adds meta-data elements to the B-Tree.
1541 It is possible to upgrade a V2 file system to V3 in place.
1542 After issuing the upgrade be sure to run a
1545 to perform post-upgrade tasks.
1547 After making this upgrade running a
1552 directory for each PFS mount into
1553 .Pa /var/hammer/<pfs> .
1556 root mount will migrate
1559 .Pa /var/hammer/root .
1560 Migration occurs only once and only if you have not specified
1561 a snapshots directory in the PFS configuration.
1562 If you have specified a snapshots directory in the PFS configuration no
1563 automatic migration will occur.
1565 For slaves, if you desire, you can migrate your snapshots
1566 config to the new location manually and then clear the
1567 snapshot directory configuration in the slave PFS.
1568 The new snapshots hierarchy is designed to work with
1569 both master and slave PFSs equally well.
1571 In addition, the old config file will be moved to file system meta-data,
1572 editable via the new
1576 The old config file will be deleted.
1577 Migration occurs only once.
1579 The V3 file system has new
1581 directives for creating snapshots.
1582 All snapshot directives, including the original, will create
1583 meta-data entries for the snapshots and the pruning code will
1584 automatically incorporate these entries into its list and
1585 expire them the same way it expires softlinks.
1586 If you accidentally blow away your snapshot softlinks you can use the
1588 directive to get a definitive list from the file system meta-data and
1589 regenerate them from that list.
1594 to back up file systems your scripts may be using the
1596 directive to generate transaction ids.
1597 This directive does not create a snapshot.
1598 You will have to modify your scripts to use the
1600 directive to generate the linkbuf for the softlink you create, or
1601 use one of the other
1606 directive will continue to work as expected and in V3 it will also
1607 record the snapshot transaction id in file system meta-data.
1608 You may also want to make use of the new
1610 tag for the meta-data.
1613 If you used to remove snapshot softlinks with
1615 you should probably start using the
1617 directive instead to also remove the related meta-data.
1618 The pruning code scans the meta-data so just removing the
1619 softlink is not sufficient.
1620 .Sh UPGRADE INSTRUCTIONS HAMMER V3 TO V4
1621 This upgrade changes undo/flush, giving faster sync.
1622 It is possible to upgrade a V3 file system to V4 in place.
1623 This upgrade reformats the UNDO/REDO FIFO (typically 1GB),
1624 so the upgrade might take a minute or two.
1626 Version 4 allows the UNDO/REDO FIFO to be flushed without also having
1627 to flush the volume header, removing 2 of the 4 disk syncs typically
1630 and removing 1 of the 2 disk syncs typically
1631 required for a flush sequence.
1632 Version 4 also implements the REDO log (see
1633 .Sx FSYNC FLUSH MODES
1634 below) which is capable
1635 of fsync()ing with either one disk flush or zero disk flushes.
1636 .Sh UPGRADE INSTRUCTIONS HAMMER V4 TO V5
1637 This upgrade brings in deduplication support.
1638 It is possible to upgrade a V4 file system to V5 in place.
1639 Technically it makes the layer2
1641 field a signed value instead of unsigned, allowing it to go negative.
1642 A version 5 filesystem is required for dedup operation.
1643 .Sh UPGRADE INSTRUCTIONS HAMMER V5 TO V6
1644 It is possible to upgrade a V5 file system to V6 in place.
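.Pp
A sketch of an in-place upgrade, assuming a
.Nm HAMMER
file system mounted at the hypothetical path
.Pa /home :
.Bd -literal -offset indent
hammer version /home             # report the current and supported versions
hammer version-upgrade /home 6   # cannot be undone; affects every PFS
.Ed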
1645 .Sh FSYNC FLUSH MODES
1647 implements five different fsync flush modes via the
1648 .Va vfs.hammer.fsync_mode
1651 version 4+ file systems.
1655 fsync mode 3 is set by default.
1656 REDO operation and recovery is enabled by default.
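.Pp
For example, assuming the node can be changed at run time with
.Xr sysctl 8 ,
the mode could be inspected and then set like this:
.Bd -literal -offset indent
sysctl vfs.hammer.fsync_mode     # show the mode currently in effect
sysctl vfs.hammer.fsync_mode=2   # select mode 2
.Ed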
1657 .Bl -tag -width indent
1659 Full synchronous fsync semantics without REDO.
1662 will not generate REDOs.
1665 will completely sync
1666 the data and meta-data and double-flush the FIFO, including
1667 issuing two disk synchronization commands.
1668 The data is guaranteed
1669 to be on the media as of when
1672 Needless to say, this is slow.
1674 Relaxed asynchronous fsync semantics without REDO.
1676 This mode works the same as mode 0 except the last disk synchronization
1677 command is not issued.
1678 It is faster than mode 0 but not even remotely
1679 close to the speed you get with mode 2 or mode 3.
1681 Note that there is no chance of meta-data corruption when using this
1682 mode; it simply means that the data you wrote and then
1684 might not have made it to the media if the storage system crashes at a bad
1688 Full synchronous fsync semantics using REDO.
1689 NOTE: If not running a
1691 version 4 filesystem or later mode 0 is silently used.
1694 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1695 If this is sufficient to satisfy the
1697 operation the blocks will be written out and
1699 will wait for the I/Os to complete,
1700 and then follow up with a disk sync command to guarantee the data
1701 is on the media before returning.
1702 This is slower than mode 3 and can result in significant disk or
1703 SSD overhead, though not as bad as mode 0 or mode 1.
1706 Relaxed asynchronous fsync semantics using REDO.
1707 NOTE: If not running a
1709 version 4 filesystem or later mode 1 is silently used.
1712 will generate REDOs in the UNDO/REDO FIFO based on a heuristic.
1713 If this is sufficient to satisfy the
1715 operation the blocks
1716 will be written out and
1718 will wait for the I/Os to complete,
1721 issue a disk synchronization command.
1723 Note that there is no chance of meta-data corruption when using this
1724 mode; it simply means that the data you wrote and then
1727 not have made it to the media if the storage system crashes at a bad
1730 This mode is the fastest production fsyncing mode available.
1731 This mode is equivalent to how the UFS fsync in the
1741 This mode is primarily designed
1742 for testing and should not be used on a production system.
1744 .Sh RESTORING FROM A SNAPSHOT BACKUP
1745 You restore a snapshot by copying it over to live, but there is a caveat.
1746 The mtime and atime fields for files accessed via a snapshot are locked
1747 to the ctime in order to keep the snapshot consistent, because neither
1748 mtime nor atime changes roll any history.
1750 In order to avoid unnecessary copying it is recommended that you use
1754 when doing the copyback.
1755 Also make sure you traverse the snapshot softlink by appending a ".",
1756 as in "<snapshotpath>/.", and you match up the directory properly.
1757 .Sh RESTORING A PFS FROM A MIRROR
1758 A PFS can be restored from a mirror with
1761 data must be copied separately.
1762 Finally, the PFS can be upgraded to master using
1765 It is not possible to restore the root PFS (PFS# 0) by using mirroring,
1766 as the root PFS is always a master PFS.
1767 A normal copy (e.g.\& using
1769 must be done, ignoring history.
1770 If history is important, the old root PFS can be restored to a new PFS, and
1771 important directories/files can be
1773 mounted to the new PFS.
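.Pp
For a non-root PFS, the restore sequence might look like the sketch below.
It assumes that the mirror is copied back with
.Cm mirror-copy
and then promoted with
.Cm pfs-upgrade ;
the host name and paths are hypothetical.
.Bd -literal -offset indent
hammer mirror-copy backuphost:/backup/pfs/home /pfs/home
hammer pfs-upgrade /pfs/home
.Ed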
1777 If the following environment variables exist, they will be used by:
1778 .Bl -tag -width ".Ev EDITOR"
1780 The editor program specified in the variable
1782 will be invoked instead of the default editor, which is
1790 .Bl -tag -width ".It Pa <fs>/var/slaves/<name>" -compact
1791 .It Pa <pfs>/snapshots
1792 default per PFS snapshots directory
1795 .It Pa /var/hammer/<pfs>
1796 default per PFS snapshots directory (not root)
1799 .It Pa /var/hammer/root
1800 default snapshots directory for root directory
1803 .It Pa <snapshots>/config
1810 .It Pa <fs>/var/slaves/<name>
1811 recommended slave PFS snapshots directory
1815 recommended PFS directory
1822 .Xr periodic.conf 5 ,
1824 .Xr mount_hammer 8 ,
1826 .Xr newfs_hammer 8 ,
1832 utility first appeared in
1835 .An Matthew Dillon Aq dillon@backplane.com