$DragonFly: src/bin/cpdup/BACKUPS,v 1.1 2006/09/16 21:57:08 dillon Exp $

INCREMENTAL BACKUP HOWTO

This document describes one of several ways to set up a LAN backup and
an off-site WAN backup system using cpdup's hardlinking capabilities.

The features described in this document are also encapsulated in scripts
which can be found in the scripts/ directory. These scripts can be used
to automate all backup steps except for the initial preparation of the
directory topology on the backup and off-site machines. Operation of
these scripts is described in the last section of this document.

PART 1 - PREPARE THE LAN BACKUP BOX

The easiest way to create a LAN backup box is to NFS mount all your
backup clients onto the backup box. It is also possible to use cpdup's
remote host feature to access your client boxes but that requires root
access to the client boxes and is not described here.

Create a directory on the backup machine called /nfs, a subdirectory
for each remote client, and subdirectories for each partition on each
client. Remember that cpdup does not cross mount points so you will
need a mount for each partition you wish to back up. For example, a
client box1 with /home and /var partitions needs /nfs/box1/home and
/nfs/box1/var.

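A minimal sketch of creating that layout for one client. It is written
in POSIX sh rather than csh, the function name make_client_dirs is
hypothetical, and the box1/home/var names are only illustrations.

```shell
# Hypothetical helper: create <root>/<client>/<partition> for each
# partition named on the command line.
make_client_dirs() {
    root=$1; client=$2; shift 2
    for part in "$@"; do
        mkdir -p "$root/$client/$part" || return 1
    done
}
```

Usage: make_client_dirs /nfs box1 home var
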
Before you actually do the NFS mount, create a dummy file for each
mount point that can be used by scripts to detect when an NFS mount
has not been done. Scripts can thus avoid a common failure scenario
and not accidentally cpdup an empty mount point to the backup partition
(destroying that day's backup in the process).

    touch /nfs/box1/home/NOT_MOUNTED
    touch /nfs/box1/var/NOT_MOUNTED

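A script can test for the dummy file before copying anything. The
following is a sketch in POSIX sh, not taken from the supplied scripts;
the function name nfs_is_mounted is an assumption. The dummy file lives
in the underlying directory, so it is only visible while nothing is
mounted on top of it.

```shell
# Succeeds only if the directory exists and the NOT_MOUNTED placeholder
# is hidden, i.e. a filesystem is mounted over the mount point.
nfs_is_mounted() {
    [ -d "$1" ] && [ ! -e "$1/NOT_MOUNTED" ]
}
```

Usage: nfs_is_mounted /nfs/box1/home || echo "box1:/home is not mounted"
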
Once the directory structure has been set up, do your NFS mounts and
also add them to your fstab. Since you will probably wind up with a
lot of mounts it is a good idea to use 'ro,bg' (read-only, background
mount) in the fstab entries.

    mount box1:/home /nfs/box1/home
    mount box1:/var /nfs/box1/var

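The matching fstab entries can be generated rather than typed by hand.
This sketch (POSIX sh, hypothetical function name fstab_entry) assumes
the usual six-field fstab layout of device, mount point, type, options,
dump frequency and fsck pass, and the /nfs/<client>/<partition> layout
used above.

```shell
# Print one fstab line for an NFS partition, mounted read-only and in
# the background ('ro,bg').
fstab_entry() {
    host=$1; part=$2
    printf '%s:/%s\t/nfs/%s/%s\tnfs\tro,bg\t0\t0\n' \
        "$host" "$part" "$host" "$part"
}
```

Usage: fstab_entry box1 home >> /etc/fstab
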
You should create a huge /backup partition on your backup machine which
is capable of holding all your mirrors. Create a subdirectory called
/backup/mirrors in your huge backup partition.

    mount <huge_disk> /backup
    mkdir /backup/mirrors

PART 2 - DOING A LEVEL 0 BACKUP

(If you use the supplied scripts, a level 0 backup can be accomplished
simply by running the 'do_mirror' script with an argument of 0.)

Create a level 0 backup using a standard cpdup with no special arguments
other than -i0 -s0 (tell it not to ask questions and turn off the
file-overwrite-with-directory safety feature). Name the mirror with
the date in a string-sortable format.

    set date = `date "+%Y%m%d"`
    mkdir /backup/mirrors/box1.${date}
    cpdup -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home
    cpdup -i0 -s0 /nfs/box1/var /backup/mirrors/box1.${date}/var

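The point of the YYYYMMDD naming is that a plain string sort puts the
mirrors in date order. A small sketch in POSIX sh (the function name
newest_mirror is hypothetical) that relies on this to find the newest
dated mirror of a client:

```shell
# Print the name of the newest dated mirror directory for a client,
# e.g. box1.20060916.  Prints nothing if the client has no mirrors.
newest_mirror() {
    dir=$1; client=$2
    for d in "$dir/$client".[0-9]*; do
        [ -d "$d" ] && basename "$d"
    done | sort | tail -n 1
}
```

Usage: newest_mirror /backup/mirrors box1
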
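Create a softlink to the most recently completed backup, which at this
point is the level 0 backup you just made. Use a relative link target so
that readlink returns just the 'box1.date' name, and remove any old link
first: 'ln -fs' over an existing link to a directory follows the link
and creates the new link inside that directory.

    rm -f /backup/mirrors/box1
    ln -s box1.${date} /backup/mirrors/box1
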
PART 3 - DO AN INCREMENTAL BACKUP

An incremental backup is exactly the same as a level 0 backup EXCEPT
you use the -H option to specify the location of the most recent
completed backup. We simply maintain the handy softlink pointing at
the most recent completed backup and hand it to cpdup's -H option.

Each day's incremental backup will reproduce the ENTIRE directory topology
for the client, but cpdup will hardlink files from the most recent backup
instead of copying them and this is what saves you all the disk space.

    set date = `date "+%Y%m%d"`
    if ( "`readlink /backup/mirrors/box1`" == "box1.${date}" ) then
        echo "silly boy, an incremental already exists for today"
        exit 1
    endif
    mkdir /backup/mirrors/box1.${date}
    cpdup -H /backup/mirrors/box1 \
        -i0 -s0 /nfs/box1/home /backup/mirrors/box1.${date}/home

Be sure to update your 'most recent backup' softlink, but only do it
if the cpdup runs for all of that client's partitions have succeeded.
That way the next incremental backup is always based on a complete one.

    rm -f /backup/mirrors/box1
    ln -s box1.${date} /backup/mirrors/box1

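The whole incremental step, including the "only update the link if
everything succeeded" rule, might be sketched as follows. This is POSIX
sh rather than csh and is not the supplied do_mirror script. The
function name incremental_backup, its argument order and the CPDUP
override variable are all assumptions made for illustration, and the
'most recent backup' link is assumed to exist already.

```shell
# CPDUP can be overridden (e.g. CPDUP=true) for a dry run; hypothetical.
CPDUP=${CPDUP:-cpdup}

# incremental_backup <nfsroot> <mirrors> <client> <date> <partition>...
incremental_backup() {
    nfsroot=$1; mirrors=$2; client=$3; stamp=$4; shift 4
    new="$mirrors/$client.$stamp"

    # Refuse to run if any partition is not actually mounted.
    for part in "$@"; do
        if [ -e "$nfsroot/$client/$part/NOT_MOUNTED" ]; then
            echo "$client/$part is not mounted" >&2
            return 1
        fi
    done

    mkdir -p "$new" || return 1
    for part in "$@"; do
        # Any failure returns early and leaves the old link in place.
        $CPDUP -H "$mirrors/$client" -i0 -s0 \
            "$nfsroot/$client/$part" "$new/$part" || return 1
    done

    # Every partition succeeded: repoint the 'most recent backup' link.
    rm -f "$mirrors/$client" && ln -s "$client.$stamp" "$mirrors/$client"
}
```

Usage: incremental_backup /nfs /backup/mirrors box1 `date "+%Y%m%d"` home var
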
Since these backups are mirrors, locating a backup is as simple
as cd'ing into the appropriate directory. If your filesystem has a
hardlink limit and cpdup hits it, cpdup will 'break' the hardlink
and copy the file instead. Generally speaking only a few special cases
will hit the hardlink limit for a filesystem. For example, the
CVS/Root file in a checked out CVS repository is often hardlinked, and
the sheer number of hardlinked 'Root' files multiplied by the number
of backups can often hit the filesystem hardlink limit.

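To see how close a particular file is to that limit, its link count can
be read from the second column of 'ls -l'. A sketch in POSIX sh; the
function name link_count is hypothetical.

```shell
# Print the number of hardlinks (directory entries) that refer to a file.
link_count() {
    ls -ld "$1" | awk '{ print $2 }'
}
```

Usage: link_count /backup/mirrors/box1/home/CVS/Root (the path is only
an illustration). A file that is shared by N daily mirrors and has no
other links reports N.
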
PART 4 - DO AN INCREMENTAL VERIFIED BACKUP

Since your incremental backups use hardlinks heavily, the actual file
might exist on the physical /backup disk in only one place even though
it may be present in dozens of daily mirrors. To ensure that the
file being hardlinked does not get corrupted, cpdup's -f option can be
used in conjunction with -H to force cpdup to validate the contents
of the file, even if all the stat info looks identical.

    cpdup -f -H /backup/mirrors/box1 ...

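cpdup does this comparison itself when -f is given. Purely to
illustrate what "validate the contents" means, here is a sketch in
POSIX sh that compares a source tree with a mirror byte for byte using
cmp. The function name verify_tree is hypothetical, both paths are
assumed to be absolute, and file names containing newlines are not
handled.

```shell
# Print every regular file under <src> whose counterpart under <mirror>
# is missing or differs byte for byte.  Succeeds only if none is found.
verify_tree() {
    src=$1; mirror=$2
    mismatches=$(cd "$src" && find . -type f | while IFS= read -r f; do
        cmp -s "$f" "$mirror/$f" || printf '%s\n' "$f"
    done)
    [ -z "$mismatches" ] && return 0
    printf '%s\n' "$mismatches"
    return 1
}
```

Usage: verify_tree /nfs/box1/home /backup/mirrors/box1/home
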
You can create completely redundant (non-hardlink-dependent) backups
by doing the equivalent of your level 0, i.e. not using -H. However, I
do NOT recommend that you do this routinely (maybe once every 6 months
at the most), because each mirror created this way will have a distinct
copy of all the file data and you will quickly run out of space in your
/backup partition.

MAINTENANCE OF THE "/backup" DIRECTORY

Now, clearly you are going to run out of space in /backup if you keep
doing this, but you may be surprised at just how many daily incrementals
you can create before you fill up your /backup partition.

If /backup becomes full, simply start rm -rf'ing older mirror directories
until enough space is freed up. You do not have to remove the oldest
directory first. In fact, you might want to keep it around and remove
a day's backup here, a day's backup there, etc, until you free up enough
space.

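Removing any one mirror is safe because a hardlinked file's data stays
on disk for as long as some other mirror still links to it. The sketch
below (POSIX sh, hypothetical function name prune_candidates) only
lists candidates and deletes nothing. As one possible policy, it keeps
the oldest mirror and the newest <keep> mirrors and offers everything in
between for removal.

```shell
# prune_candidates <dir> <client> <keep>
# List dated mirrors that are neither the oldest one nor among the
# newest <keep>, in date order.
prune_candidates() {
    dir=$1; client=$2; keep=$3
    for d in "$dir/$client".[0-9]*; do
        [ -d "$d" ] && echo "$d"
    done | sort | awk -v keep="$keep" '
        { line[NR] = $0 }
        END { for (i = 2; i <= NR - keep; i++) print line[i] }'
}
```

Usage: prune_candidates /backup/mirrors box1 7
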
Making an off-site backup involves similar methodology, but you use
cpdup's remote host capability to generate the backup. To avoid
complications it is usually best to take a mirror already generated on
your LAN backup box and copy that to the remote box.

The remote backup box does not use NFS, so setup is trivial. Just
create your super-large /backup partition and mkdir /backup/mirrors.
Your LAN backup box will need root access via ssh to your remote backup
box.

You can use the handy softlink to get the latest 'box1.date' mirror
directory and since the mirror is all in one partition you can just
cpdup the entire machine in one command. Use the same dated directory
name on the remote box, so:

    # latest will wind up something like 'box1.20060915'
    set latest = `readlink /backup/mirrors/box1`
    cpdup -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest

As with your LAN backup, create a softlink on the remote backup box
denoting the latest mirror for any given site. Again use a relative
link target so that readlink on the remote box returns just the
'box1.date' name.

    if ( $status == 0 ) then
        ssh remote.box -n \
            "rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
    endif

Incremental backups can be accomplished using the same cpdup command,
but adding the -H option pointing at the latest backup on the remote
box. Note that the -H path is relative to the remote box, not the LAN
backup box you are running the command from.

    set latest = `readlink /backup/mirrors/box1`
    set remotelatest = `ssh remote.box -n "readlink /backup/mirrors/box1"`
    if ( "$latest" == "$remotelatest" ) then
        echo "silly boy, you already made a remote incremental backup today"
        exit 1
    endif
    cpdup -H /backup/mirrors/$remotelatest \
        -i0 -s0 /backup/mirrors/$latest remote.box:/backup/mirrors/$latest
    if ( $status == 0 ) then
        ssh remote.box -n \
            "rm -f /backup/mirrors/box1; ln -s $latest /backup/mirrors/box1"
    endif

Cleaning out the remote directory works the same as cleaning out the LAN
backup directory.

RESTORING FROM BACKUPS

Each backup is a full filesystem mirror, and depending on how much space
you have you should be able to restore it simply by cd'ing into the
appropriate backup directory and using 'cpdup blah box1:blah' (assuming
root access), or you can export the backup directory via NFS to your
client boxes and use cpdup locally on the client to extract the backup.
Using NFS is probably the most efficient solution.

PUTTING IT ALL TOGETHER - SOME SCRIPTS

Please refer to the scripts in the scripts/ subdirectory. These scripts
are EXAMPLES ONLY. If you want to use them, put them in your ~root/adm
directory on your backup box and set up a root crontab.

First follow the preparation rules in PART 1 above. The scripts do not
do this automatically. Edit the 'params' file that the scripts use
to set default paths and such.

** FOLLOW DIRECTIONS IN PART 1 ABOVE TO SET UP THE LAN BACKUP BOX **

Copy the scripts to ~/adm. Do NOT install a crontab yet (but an example
can be found in scripts/crontab).

Do a manual level 0 LAN backup using the do_mirror script
('./do_mirror 0').

Once done you can do incremental backups using './do_mirror 1' to do a
verified incremental, or './do_mirror 2' to do a stat-optimized
incremental. You can enable the cron jobs that run do_mirror and

Setting up an off-site backup box is trivial. The off-site backup box
needs to allow root ssh logins from the LAN backup box (at least for
now, sorry!). Set up the off-site backup directory, typically
/backup/mirrors. Then do a level 0 backup from your LAN backup box
to the off-site box using the do_remote script.

Once done you can do incremental backups using './do_remote 1' to do a
verified incremental, or './do_remote 2' to do a stat-optimized
incremental. You can enable the cron jobs that run do_remote now.

NOTE! It is NOT recommended that you use verified-incremental backups
over a WAN, as all related data must be copied over the wire every single
day. Instead, I recommend sticking with stat-optimized backups
('./do_remote 2').

You will also need to set up a daily cleaning script on the off-site
backup box.

SCRIPT TODOS - the ./do_cleanup script is not very smart. We really
should do a tower-of-hanoi removal.

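One reading of "tower-of-hanoi removal" is the classic Tower of Hanoi
rotation, in which backup number n belongs to the level given by the
number of times n divides evenly by 2, and a backup is retired when the
next backup of the same level is made. That is an assumption about what
the TODO intends, not something do_cleanup does. A sketch of the level
calculation in POSIX sh, with the hypothetical function name
hanoi_level:

```shell
# Print the Tower of Hanoi level of backup number n (n >= 1): the count
# of trailing zero bits of n.  Level 0 recurs every 2nd backup, level 1
# every 4th, level 2 every 8th, and so on.
hanoi_level() {
    n=$1; level=0
    while [ "$n" -gt 0 ] && [ $((n % 2)) -eq 0 ]; do
        n=$((n / 2))
        level=$((level + 1))
    done
    echo "$level"
}
```

With daily backups numbered from 1, the higher the level, the longer a
mirror would survive, so the retained mirrors thin out with age.
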