Mention our handbook instead of FreeBSDs.
.\" Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998
.\" Nan Yang Computer Services Limited. All rights reserved.
.\"
.\" This software is distributed under the so-called ``Berkeley
.\" License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed by Nan Yang Computer
.\" Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $FreeBSD: src/share/man/man4/vinum.4,v 1.22.2.9 2002/04/22 08:19:35 kuriyama Exp $
.\" $DragonFly: src/share/man/man4/vinum.4,v 1.14 2008/02/11 15:59:37 matthias Exp $
.\"
.Dd February 11, 2008
.Dt VINUM 4
.Os
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "pseudo-device vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called
.Em volumes .
Volumes are
not restricted to the size of any disk on the system.
.It
The volumes consist of one or more
.Em plexes ,
each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1
(mirroring).
Multiple plexes can also be used for:
.\" XXX What about sparse plexes? Do we want them?
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on multiple
disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain
available even if one of the plexes becomes unavailable.
In comparison with a
RAID-5 plex (see below), using multiple plexes requires more storage space, but
gives better performance, particularly in the case of a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an
additional plex and subsequently detaching one of the older plexes, data can be
moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By
attaching an additional plex and detaching at a specific time, the detached plex
becomes an accurate snapshot of the file system at the time of detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
.Em subdisks .
A subdisk is defined as a contiguous block of physical disk storage.
A plex may
consist of any reasonable number of subdisks (in other words, the real limit is
not the number, but other factors, such as memory and performance, associated
with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
.Em "Concatenated plexes"
consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
.Em "Striped plexes"
consist of two or more subdisks of equal size.
The file
address space is mapped in
.Em stripes ,
integral fractions of the subdisk
size.
Consecutive plex address space is mapped to stripes in each subdisk in
turn.
.if t \{\
.ig
.\" FIXME
.br
.ne 1.5i
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box

"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
..
.\}
The subdisks of a striped plex must all be the same size.
.It
.Em "RAID-5 plexes"
require at least three equal-sized subdisks.
They
resemble striped plexes, except that in each stripe, one subdisk stores parity
information.
This subdisk changes in each stripe: in the first stripe, it is the
first subdisk, in the second it is the second subdisk, etc.
In the event of a
single disk failure,
.Nm
will recover the data based on the information stored on the remaining subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a
RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Em Drives
are the lowest level of the storage hierarchy.
They represent disk special
devices.
.It
.Nm
offers automatic startup.
Unlike
.Ux
file systems,
.Nm
volumes contain all the configuration information needed to ensure that they are
started correctly when the subsystem is enabled.
This is also a significant
advantage over the Veritas\(tm File System.
This feature concerns only the presence
of the volumes.
It does not mean that the volumes will be mounted
automatically, since the standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
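.Pp
As an illustration only, the following Python sketch models the striped
mapping and the RAID-5 parity rotation described above.
It is not taken from the
.Nm
sources: the function names are invented for this example, and wrapping the
parity position around after the last subdisk is an assumption.
.Bd -literal -offset indent
```python
def striped_map(plex_offset, stripe_size, subdisks):
    """Map a plex offset to (subdisk index, offset within that subdisk).

    Consecutive stripes go to each subdisk in turn.  All values use the
    same unit (for example, sectors).
    """
    stripe, within = divmod(plex_offset, stripe_size)
    row, subdisk = divmod(stripe, subdisks)
    return subdisk, row * stripe_size + within


def raid5_parity_subdisk(stripe_row, subdisks):
    """Return the index of the subdisk holding parity in a stripe row.

    Row 0 uses the first subdisk, row 1 the second, and so on.  Wrapping
    around after the last subdisk is an assumption of this sketch.
    """
    return stripe_row % subdisks
```
.Ed
.Pp
With a stripe size of 128 and four subdisks, offset 512 of the plex falls at
offset 128 of the first subdisk, because the first four stripes have used one
stripe on each subdisk.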
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a KLD module, and does not require
configuration.
As with other klds, it is absolutely necessary to match the kld
to the version of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
.Pp
It is possible to configure
.Nm
in the kernel, but this is not recommended.
To do so, add this line to the
kernel configuration file:
.Pp
.D1 Cd "pseudo-device vinum"
.Ss Debug Options
The current version of
.Nm ,
both the kernel module and the user program
.Xr vinum 8 ,
includes significant debugging support.
It is not recommended to remove
this support at the moment, but if you do, you must remove it from both the
kernel and the user components.
To do this, edit the files
.Pa /usr/src/sbin/vinum/Makefile
and
.Pa /sys/dev/raid/vinum/Makefile
and change the
.Va CFLAGS
variable to remove the
.Li -DVINUMDEBUG
option.
If you have
configured
.Nm
into the kernel, either specify the line
.Pp
.D1 Cd "options VINUMDEBUG"
.Pp
in the kernel configuration file or remove the
.Li -DVINUMDEBUG
option from
.Pa /usr/src/sbin/vinum/Makefile
as described above.
.Pp
If the
.Va VINUMDEBUG
variables do not match,
.Xr vinum 8
will fail with a message
explaining the problem and what to do to correct it.
.Pp
.Nm
was previously available in two versions: a freely available version which did
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which was available only from Cybernet Systems Inc.
The present
version of
.Nm
includes the RAID-5 functionality.
.Sh RUNNING VINUM
.Nm
is part of the base
.Dx
system.
It does not require installation.
To start it, run the
.Xr vinum 8
program, which will load the kld if it is not already present.
.Nm
must be configured before it is used.
See
.Xr vinum 8
for information on how to create a
.Nm
configuration.
.Pp
Normally, you start a configured version of
.Nm
at boot time.
Set the variable
.Va start_vinum
in
.Pa /etc/rc.conf
to
.Dq Li YES
to start
.Nm
at boot time.
(See
.Xr rc.conf 5
for more details.)
.Pp
If
.Nm
is loaded as a kld (the recommended way), the
.Nm Cm stop
command will unload it
(see
.Xr vinum 8 ) .
You can also do this with the
.Xr kldunload 8
command.
.Pp
The kld can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Xr vinum 8
program are active.
Unloading the kld does not harm the data in the volumes.
.Ss Configuring and Starting Objects
Use the
.Xr vinum 8
utility to configure and start
.Nm
objects.
.Sh IOCTL CALLS
.Xr ioctl 2
calls are intended for the use of the
.Xr vinum 8
configuration program only.
They are described in the header file
.Pa /sys/dev/raid/vinum/vinumio.h .
.Ss Disk Labels
Conventional disk special devices have a
.Em "disk label"
in the second sector of the device.
See
.Xr disklabel 5
for more details.
This disk label describes the layout of the partitions within
the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls
.Dv DIOCGDINFO
(get disk label),
.Dv DIOCGPART
(get partition information),
.Dv DIOCWDINFO
(write partition information) and
.Dv DIOCSDINFO
(set partition information).
.Dv DIOCGDINFO
and
.Dv DIOCGPART
refer to an internal
representation of the disk label which is not present on the volume.
As a
result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the
.Dq "raw disk" ,
will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a
.Nm
volume.
.Pp
.Nm
ignores the
.Dv DIOCWDINFO
and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the
last letter of the device name specifies the partition identifier (a to p).
.Nm
volumes need not conform to this convention, but if they do not,
.Xr newfs 8
will complain that it cannot determine the partition.
To solve this problem,
use the
.Fl v
flag to
.Xr newfs 8 .
For example, if you have a volume
.Pa concat ,
use the following command to create a UFS file system on it:
.Pp
.Dl "newfs -v /dev/vinum/concat"
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
Veritas\(tm
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause confusion.
.Pp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
When choosing volume and plex names, bear in mind
that automatically generated plex and subdisk names are longer than the name
from which they are derived.
.Bl -bullet
.It
When
.Nm
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume.
It also creates the
subdirectories
.Pa /dev/vinum/plex
and
.Pa /dev/vinum/sd ,
in which it stores device entries for the plexes and subdisks.
In addition, it
creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates three super-devices,
.Pa /dev/vinum/control ,
.Pa /dev/vinum/Control
and
.Pa /dev/vinum/controld .
.Pa /dev/vinum/control
is used by
.Xr vinum 8
when it has been compiled without the
.Dv VINUMDEBUG
option,
.Pa /dev/vinum/Control
is used by
.Xr vinum 8
when it has been compiled with the
.Dv VINUMDEBUG
option, and
.Pa /dev/vinum/controld
is used by the
.Nm
daemon.
The two control devices for
.Xr vinum 8
are used to synchronize the debug status of kernel and user modules.
.It
Unlike
.Ux
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Xr newfs 8 ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not
end in the letters
.Ql a
to
.Ql c ,
you must use the
.Fl v
flag to
.Xr newfs 8
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is
the name of the volume followed by the letters
.Pa .p
and the number of the
plex.
For example, the plexes of volume
.Pa vol3
are called
.Pa vol3.p0 , vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters
.Pa .s
and a number identifying the subdisk.
For example, the subdisks of
plex
.Pa vol3.p0
are called
.Pa vol3.p0.s0 , vol3.p0.s1
and so on.
.It
By contrast,
.Em drives
must be named.
This makes it possible to move a drive to a different location
and still recognize it automatically.
Drive names may be up to 32 characters
long.
.El
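.Pp
The default naming rules above can be sketched in a few lines of Python.
This is an illustration, not code from
.Nm ;
the function names are hypothetical, and the length check simply applies the
64-character limit stated above.
.Bd -literal -offset indent
```python
MAX_OBJECT_NAME = 64  # stated limit for volume, plex and subdisk names


def default_plex_name(volume, index):
    """Default plex name: the volume name, ".p" and the plex number."""
    return "%s.p%d" % (volume, index)


def default_subdisk_name(plex, index):
    """Default subdisk name: the plex name, ".s" and the subdisk number."""
    return "%s.s%d" % (plex, index)


def fits_name_limit(name):
    """Check a generated name against the 64-character limit."""
    return len(name) <= MAX_OBJECT_NAME
```
.Ed
.Pp
Because each level appends a suffix, a volume name close to the limit can
produce plex or subdisk names that exceed it.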
.Ss Example
Assume the
.Nm
objects described in the section
.Sx "CONFIGURATION FILE"
in
.Xr vinum 8 .
The directory
.Pa /dev/vinum
looks like:
.Bd -literal -offset indent
# ls -lR /dev/vinum
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
crwx------ 1 root wheel 91, 0x40000000 Mar 30 16:08 control
crwx------ 1 root wheel 91, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 drive
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 plex
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 rvol
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 sd
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxrwxrwx 7 root wheel 512 Mar 30 16:08 vol
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
crw-r----- 1 root operator 4, 15 Oct 21 16:51 drive2
crw-r----- 1 root operator 4, 31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 concat.plex
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 strcon.plex
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 stripe.plex
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 tinyvol.plex
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1
.Ed
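.Pp
The minor numbers in this listing appear to encode the object type together
with the subdisk, plex and volume numbers.
The Python sketch below decodes them under that reading.
The bit positions and field widths are inferred from the sample listing alone
and are assumptions, not the documented layout; the function name is
hypothetical.
.Bd -literal -offset indent
```python
# Type codes as they appear in the top hex digit of the sample listing.
TYPE_NAMES = {0: "volume", 1: "plex", 2: "subdisk", 4: "control"}


def decode_minor(minor):
    """Split a minor number from the listing into its apparent fields.

    The shifts and masks are guesses based on the sample output only.
    """
    return {
        "type": TYPE_NAMES.get(minor >> 28, "unknown"),
        "subdisk": (minor >> 20) & 0xFF,
        "plex": (minor >> 16) & 0xF,
        "volume": minor & 0xFF,
    }
```
.Ed
.Pp
Under these assumptions, 0x20110003 decodes to subdisk 1 of plex 1 of volume
3, which matches the entry
.Pa strcon.p1.s1
in the listing.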
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks
are named after the disk on which they are located, and plexes are named after
the subdisk.
.\" XXX
.Bf -symbolic
This mapping is still to be determined.
.Ef
.Ss Object States
Each
.Nm
object has a
.Em state
associated with it.
.Nm
uses this state to determine the handling of the object.
.Ss Volume States
Volumes may have the following states:
.Bl -hang -width 14n
.It Em down
The volume is completely inaccessible.
.It Em up
The volume is up and at least partially functional.
Some plexes may be
unavailable.
.El
.Ss "Plex States"
Plexes may have the following states:
.Bl -hang -width 14n
.It Em referenced
A plex entry which has been referenced as part of a volume, but which is
currently not known.
.It Em faulty
A plex which has gone completely down because of I/O errors.
.It Em down
A plex which has been taken down by the administrator.
.It Em initializing
A plex which is being initialized.
.El
.Pp
The remaining states represent plexes which are at least partially up.
.Bl -hang -width 14n
.It Em corrupt
A plex entry which is at least partially up.
Not all subdisks are available,
and an inconsistency has occurred.
If no other plex is uncorrupted, the volume
is no longer consistent.
.It Em degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It Em flaky
A plex which is really up, but which has a reborn subdisk which we do not
completely trust, and which we do not want to read if we can avoid it.
.It Em up
A plex entry which is completely up.
All subdisks are up.
.El
.Ss "Subdisk States"
Subdisks can have the following states:
.Bl -hang -width 14n
.It Em empty
A subdisk entry which has been created completely.
All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
.It Em referenced
A subdisk entry which has been referenced as part of a plex, but which is
currently not known.
.It Em initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.El
.Pp
The following states represent invalid data.
.Bl -hang -width 14n
.It Em obsolete
A subdisk entry which has been created completely.
All fields are correct, the
config on disk has been updated, and the data was valid, but since then the
drive has been taken down, and as a result updates have been missed.
.It Em stale
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
.El
.Pp
The following states represent valid, inaccessible data.
.Bl -hang -width 14n
.It Em crashed
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down.
No attempt has been made to write to the subdisk since the crash, so the
data is valid.
.It Em down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It Em reviving
The subdisk is currently in the process of being revived.
We can write but not
read.
.El
.Pp
The following states represent accessible subdisks with valid data.
.Bl -hang -width 14n
.It Em reborn
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down and up again.
No updates were lost, but it is possible that the subdisk
has been damaged.
We won't read from this subdisk if we have a choice.
If this
is the only subdisk which covers this address space in the plex, we set its
state to up under these circumstances, so this status implies that there is
another subdisk to fulfil the request.
.It Em up
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data is valid.
.El
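.Pp
The grouping of subdisk states above can be summarized as a lookup table.
The Python sketch below is illustrative only: the label
.Dq "not yet usable"
for the first group and the helper function are this example's own, and the
read preference is a simplified reading of the description of the
.Em reborn
state.
.Bd -literal -offset indent
```python
SUBDISK_STATE_GROUPS = {
    # The first group has no heading in the text; this label is made up.
    "empty": "not yet usable",
    "referenced": "not yet usable",
    "initializing": "not yet usable",
    "obsolete": "invalid data",
    "stale": "invalid data",
    "crashed": "valid, inaccessible data",
    "down": "valid, inaccessible data",
    "reviving": "valid, inaccessible data",
    "reborn": "accessible, valid data",
    "up": "accessible, valid data",
}


def readable_subdisks(states):
    """Return the indices to read from, preferring "up" over "reborn"."""
    up = [i for i, s in enumerate(states) if s == "up"]
    if up:
        return up
    return [i for i, s in enumerate(states) if s == "reborn"]
```
.Ed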
.Ss "Drive States"
Drives can have the following states:
.Bl -hang -width 14n
.It Em referenced
At least one subdisk refers to the drive, but it is not currently accessible to
the system.
No device name is known.
.It Em down
The drive is not accessible.
.It Em up
The drive is up and running.
.El
.Sh DEBUGGING PROBLEMS WITH VINUM
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration Problems
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration
updates:
.Pp
.Dl "vinum setdaemon 4"
.Pp
This will stop updates and any further corruption of the on-disk configuration.
.Pp
Next, look at the on-disk configuration with the
.Nm Cm dumpconfig
command, for example:
.if t .ps -3
.if t .vs -3
.Bd -literal
# \fBvinum dumpconfig\fP
Drive 4: Device /dev/da3s0h
	Created on crash.lemis.com at Sat May 20 16:32:44 2000
	Config last updated Sat May 20 16:32:56 2000
	Size: 601052160 bytes (573 MB)
volume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.Ed
.if t .vs +3
.if t .ps +3
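.Pp
When the output is long, it can help to pick out the subdisks that are not
up.
The Python sketch below does this for saved output.
It assumes that every
.Ql sd
line consists of keyword and value pairs exactly as in the sample above; the
function names are hypothetical and the script is not part of
.Xr vinum 8 .
.Bd -literal -offset indent
```python
def parse_sd_line(line):
    """Turn one "sd" line into a dict of its keyword/value pairs."""
    words = line.split()
    # An sd line is the word "sd" followed by an even number of words.
    if not words or words[0] != "sd" or len(words) % 2 == 0:
        raise ValueError("not an sd line: %r" % line)
    return dict(zip(words[1::2], words[2::2]))


def subdisks_not_up(lines):
    """Return the names of all subdisks whose state is not "up"."""
    names = []
    for line in lines:
        if line.startswith("sd "):
            fields = parse_sd_line(line)
            if fields.get("state") != "up":
                names.append(fields["name"])
    return names
```
.Ed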
.Pp
The configuration on all disks should be the same.
If this is not the case,
please save the output to a file and report the problem.
There is probably
little that can be done to recover the on-disk configuration, but if you keep a
copy of the files used to create the objects, you should be able to re-create
them.
The
.Cm create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Cm resetconfig
command if you have this kind of trouble.
.Ss Kernel Panics
In order to analyse a panic which you suspect comes from
.Nm ,
you will need to build a debug kernel.
See the online handbook at
.Pa http://wiki.dragonflybsd.org/index.cgi/DebugKernelCrashDumps
for more details of how to do this.
.Pp
Perform the following steps to analyse a
.Nm
problem:
.Bl -enum
.It
Copy the following files to the directory in which you will be
performing the analysis, typically
.Pa /var/crash :
.Pp
.Bl -bullet -compact
.It
.Pa /sys/dev/raid/vinum/.gdbinit.crash ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.kernel ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.serial ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum
and
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum.paths
.El
.It
Make sure that you build the
.Nm
module with debugging information.
The standard
.Pa Makefile
builds a module with debugging symbols by default.
If the version of
.Nm
in
.Pa /modules
does not contain symbols, you will not get an error message, but the stack trace
will not show the symbols.
Check the module before starting
.Xr kgdb 1 :
.Bd -literal
$ file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (FreeBSD), not stripped
.Ed
.Pp
If the output shows that
.Pa /modules/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be
either in
.Pa /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
with a
.Dq Li "make world" )
or
.Pa /sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
in this directory).
Modify the file
.Pa .gdbinit.vinum.paths
accordingly.
.It
Either take a dump or use remote serial
.Xr gdb 1
to analyse the problem.
To analyse a dump, say
.Pa /var/crash/vmcore.5 ,
link
.Pa /var/crash/.gdbinit.crash
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug vmcore.5
.Ed
.Pp
This example assumes that you have installed the correct debug kernel at
.Pa /var/crash/kernel.debug .
If not, substitute the correct name of the debug kernel.
.Pp
To perform remote serial debugging,
link
.Pa /var/crash/.gdbinit.serial
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug
.Ed
.Pp
In this case, the
.Pa .gdbinit
file performs the functions necessary to establish the connection.
The remote
machine must already be in debug mode: enter the kernel debugger and select
.Ic gdb
(see
.Xr ddb 4
for more details).
The serial
.Pa .gdbinit
file expects the serial connection to run at 38400 bits per second; if you run
at a different speed, edit the file accordingly (look for the
.Va remotebaud
specification).
.Pp
The following example shows a remote debugging session using the
.Ic debug
command of
.Xr vinum 8 :
.Bd -literal
.if t .ps -3
.if t .vs -3
GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
318     in_Debugger = 0;
#1  0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
    flag=0x3, p=0xf68b7940) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
102     Debugger ("vinum debug");
(kgdb) bt
#0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
    flag=0x3, p=0xf688e6c0) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
#2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
    p=0xf688e6c0) at vnode_if.h:395
#6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
    tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
    tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
    tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
    tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8  0xf020a1fc in Xint0x80_syscall ()
#9  0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
.if t .vs
.if t .ps
.Ed
.Pp
When entering from the debugger, it is important that the source of frame 1
(listed by the
.Pa .gdbinit
file at the top of the example) contains the text
.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
.Pp
This is an indication that the address specifications are correct.
If you get
some other output, your symbols and the kernel module are out of sync, and the
trace will be meaningless.
.El
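.Pp
As a rough sketch of the file handling in steps 1 and 3, assuming the sources
are installed under
.Pa /sys/dev/raid/vinum
and the analysis is performed in
.Pa /var/crash
(adjust both paths to your system), the files could be prepared as follows.
The use of a symbolic link is an assumption; the steps above only say
.Dq link :
.Bd -literal -offset indent
# Sketch only: copy the five gdb init files listed in step 1.
for f in crash kernel serial vinum vinum.paths; do
	cp /sys/dev/raid/vinum/.gdbinit.$f /var/crash/
done
# Select the init file for dump analysis; link .gdbinit.serial
# instead for remote serial debugging.
# A symbolic link is an assumption here, not a requirement.
cd /var/crash
ln -sf .gdbinit.crash .gdbinit
.Ed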
.Pp
For an initial investigation, the most important information is the output of
the
.Ic bt
(backtrace) command above.
.Ss Reporting Problems with Vinum
If you find any bugs in
.Nm ,
please report them to
.An Greg Lehey Aq grog@lemis.com .
Supply the following
information:
.Bl -bullet
.It
The output of the
.Nm Cm list
command
(see
.Xr vinum 8 ) .
.It
Any messages printed in
.Pa /var/log/messages .
All such messages will be identified by the text
.Dq Li vinum
at the beginning.
.It
If you have a panic, a stack trace as described above.
.El
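.Pp
The first two items could be collected roughly as follows.
The output file names are arbitrary placeholders, and the commands may need
to be run as root:
.Bd -literal -offset indent
# Sketch only: the file names under /tmp are placeholders.
vinum list > /tmp/vinum-list.txt
grep vinum /var/log/messages > /tmp/vinum-messages.txt
.Ed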
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8 ,
.Xr newfs 8 ,
.Xr vinum 8
.Sh HISTORY
.Nm
first appeared in
.Fx 3.0 .
The RAID-5 component of
.Nm
was developed by Cybernet Inc.\&
.Pq Pa http://www.cybernet.com/ ,
for its NetMAX product.
.Sh AUTHORS
.An Greg Lehey Aq grog@lemis.com .
.Sh BUGS
.Nm
is a new product.
Bugs can be expected.
The configuration mechanism is not yet
fully functional.
If you have difficulties, please look at the section
.Sx "DEBUGGING PROBLEMS WITH VINUM"
before reporting problems.
.Pp
Kernels with the
.Nm
pseudo-device appear to work, but are not supported.
If you have trouble with
this configuration, please first replace the kernel with a
.No non- Ns Nm
kernel and test with the kld module.
.Pp
Detection of differences between the version of the kernel and the kld is not
yet implemented.
.Pp
The RAID-5 functionality is new in
.Fx 3.3 .
Some problems have been
reported with
.Nm
in combination with soft updates, but these are not reproducible on all
systems.
If you are planning to use
.Nm
in a production environment, please test carefully.