.\" Hey, Emacs, edit this file in -*- nroff-fill -*- mode
.\"-
.\" Copyright (c) 1997, 1998
.\" Nan Yang Computer Services Limited. All rights reserved.
.\"
.\" This software is distributed under the so-called ``Berkeley
.\" License'':
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed by Nan Yang Computer
.\" Services Limited.
.\" 4. Neither the name of the Company nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" This software is provided ``as is'', and any express or implied
.\" warranties, including, but not limited to, the implied warranties of
.\" merchantability and fitness for a particular purpose are disclaimed.
.\" In no event shall the company or contributors be liable for any
.\" direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute
.\" goods or services; loss of use, data, or profits; or business
.\" interruption) however caused and on any theory of liability, whether
.\" in contract, strict liability, or tort (including negligence or
.\" otherwise) arising in any way out of the use of this software, even if
.\" advised of the possibility of such damage.
.\"
.\" $FreeBSD: src/share/man/man4/vinum.4,v 1.22.2.9 2002/04/22 08:19:35 kuriyama Exp $
.\"
.Dd December 12, 2014
.Dt VINUM 4
.Os
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "pseudo-device vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called
.Em volumes .
Volumes are
not restricted to the size of any disk on the system.
.It
The volumes consist of one or more
.Em plexes ,
each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1
(mirroring).
Multiple plexes can also be used for:
.\" XXX What about sparse plexes? Do we want them?
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on multiple
disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain
available even if one of the plexes becomes unavailable.
In comparison with a
RAID-5 plex (see below), using multiple plexes requires more storage space, but
gives better performance, particularly in the case of a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an
additional plex and subsequently detaching one of the older plexes, data can be
moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By
attaching an additional plex and detaching at a specific time, the detached plex
becomes an accurate snapshot of the file system at the time of detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
.Em subdisks .
A subdisk is a contiguous block of physical disk storage.
A plex may
consist of any reasonable number of subdisks (in other words, the real limit is
not the number, but other factors, such as memory and performance, associated
with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
.Em "Concatenated plexes"
consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
.Em "Striped plexes"
consist of two or more subdisks of equal size.
The file
address space is mapped in
.Em stripes ,
integral fractions of the subdisk
size.
Consecutive plex address space is mapped to stripes in each subdisk in
turn.
.if t \{\
.ig
.\" FIXME
.br
.ne 1.5i
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box

"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
..
.\}
The subdisks of a striped plex must all be the same size.
.It
.Em "RAID-5 plexes"
require at least three equal-sized subdisks.
They
resemble striped plexes, except that in each stripe, one subdisk stores parity
information.
This subdisk changes in each stripe: in the first stripe, it is the
first subdisk, in the second it is the second subdisk, etc.
In the event of a
single disk failure,
.Nm
will recover the data based on the information stored on the remaining subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a
RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Em Drives
are the lowest level of the storage hierarchy.
They represent disk special
devices.
.It
.Nm
offers automatic startup.
Unlike
.Ux
file systems,
.Nm
volumes contain all the configuration information needed to ensure that they are
started correctly when the subsystem is enabled.
This is also a significant
advantage over the Veritas\(tm File System.
This feature concerns only the presence
of the volumes.
It does not mean that the volumes will be mounted
automatically, since the standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
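.Pp
The hierarchy described above can be sketched as a
.Xr vinum 8
configuration fragment.
This is an illustration only: the drive names, device paths, volume names,
sizes and stripe size are assumptions chosen for the example, and
.Xr vinum 8
remains the authoritative reference for the configuration syntax.
.Bd -literal -offset indent
# Hypothetical drives; substitute your own device names.
drive d1 device /dev/da1s0e
drive d2 device /dev/da2s0e
drive d3 device /dev/da3s0e

# A mirrored volume: two concatenated plexes on different drives,
# each holding the whole address space of the volume.
volume mirror
  plex org concat
    sd length 512m drive d1
  plex org concat
    sd length 512m drive d2

# A RAID-5 volume: one plex with three equal-sized subdisks and an
# example stripe size of 512 kB.
volume raid
  plex org raid5 512k
    sd length 512m drive d1
    sd length 512m drive d2
    sd length 512m drive d3
.Ed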
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a KLD module, and does not require
configuration.
As with other klds, it is absolutely necessary to match the kld
to the version of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
.Pp
It is possible to configure
.Nm
in the kernel, but this is not recommended.
To do so, add this line to the
kernel configuration file:
.Pp
.D1 Cd "pseudo-device vinum"
.Ss Debug Options
The current version of
.Nm ,
both the kernel module and the user program
.Xr vinum 8 ,
includes significant debugging support.
It is not recommended to remove
this support at the moment, but if you do, you must remove it from both the
kernel and the user components.
To do this, edit the files
.Pa /usr/src/sbin/vinum/Makefile
and
.Pa /sys/dev/raid/vinum/Makefile
and change the
.Va CFLAGS
variable to remove the
.Li -DVINUMDEBUG
option.
If you have
configured
.Nm
into the kernel, either specify the line
.Pp
.D1 Cd "options VINUMDEBUG"
.Pp
in the kernel configuration file or remove the
.Li -DVINUMDEBUG
option from
.Pa /usr/src/sbin/vinum/Makefile
as described above.
.Pp
If the
.Va VINUMDEBUG
variables do not match,
.Xr vinum 8
will fail with a message
explaining the problem and what to do to correct it.
.Pp
.Nm
was previously available in two versions: a freely available version which did
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which was available only from Cybernet Systems Inc.
The present
version of
.Nm
includes the RAID-5 functionality.
.Sh RUNNING VINUM
.Nm
is part of the base
.Dx
system.
It does not require installation.
To start it, run the
.Xr vinum 8
program, which will load the kld if it is not already present.
Before using
.Nm ,
you must configure it.
See
.Xr vinum 8
for information on how to create a
.Nm
configuration.
.Pp
Normally, you start a configured version of
.Nm
at boot time.
Set the variable
.Va start_vinum
in
.Pa /etc/rc.conf
to
.Dq Li YES
to do so.
(See
.Xr rc.conf 5
for more details.)
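.Pp
As a minimal sketch, the corresponding entry in
.Pa /etc/rc.conf
would look like this:
.Bd -literal -offset indent
# Start vinum at boot time.
start_vinum="YES"
.Ed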
.Pp
If
.Nm
is loaded as a kld (the recommended way), the
.Nm Cm stop
command will unload it
(see
.Xr vinum 8 ) .
You can also do this with the
.Xr kldunload 8
command.
.Pp
The kld can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Xr vinum 8
program are active.
Unloading the kld does not harm the data in the volumes.
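.Pp
For example, assuming that no
.Nm
volumes are mounted, either of the following commands should unload the kld.
The module name
.Dq vinum
passed to
.Xr kldunload 8
is an assumption; check the output of
.Xr kldstat 8
for the name actually loaded on your system.
.Bd -literal -offset indent
# Either command unloads the kld once it is idle.
vinum stop
# The module name "vinum" is an assumption; see kldstat(8).
kldunload vinum
.Ed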
.Ss Configuring and Starting Objects
Use the
.Xr vinum 8
utility to configure and start
.Nm
objects.
.Sh IOCTL CALLS
.Xr ioctl 2
calls are intended for the use of the
.Xr vinum 8
configuration program only.
They are described in the header file
.Pa /sys/dev/raid/vinum/vinumio.h .
.Ss Disk Labels
Conventional disk special devices have a
.Em "disk label"
in the second sector of the device.
See
.Xr disklabel 5
for more details.
This disk label describes the layout of the partitions within
the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls
.Dv DIOCGDINFO
(get disk label),
.Dv DIOCGPART
(get partition information),
.Dv DIOCWDINFO
(write disk label) and
.Dv DIOCSDINFO
(set disk label).
.Dv DIOCGDINFO
and
.Dv DIOCGPART
refer to an internal
representation of the disk label which is not present on the volume.
As a
result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the
.Dq "raw disk" ,
will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a
.Nm
volume.
.Pp
.Nm
ignores the
.Dv DIOCWDINFO
and
.Dv DIOCSDINFO
ioctls, since there is nothing to change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the
last letter of the device name specifies the partition identifier (a to p).
.Nm
volumes need not conform to this convention, but if they do not,
.Xr newfs 8
will complain that it cannot determine the partition.
To solve this problem,
use the
.Fl v
flag to
.Xr newfs 8 .
For example, if you have a volume
.Pa concat ,
use the following command to create a
.Xr UFS 5
file system on it:
.Pp
.Dl "newfs -v /dev/vinum/concat"
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
Veritas\(tm
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause confusion.
.Pp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes,
plexes and subdisks may be up to 64 characters long, and the names of drives may
be up to 32 characters long.
When choosing volume and plex names, bear in mind
that automatically generated plex and subdisk names are longer than the name
from which they are derived.
.Bl -bullet
.It
When
.Nm
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume.
It also creates the
subdirectories
.Pa /dev/vinum/plex
and
.Pa /dev/vinum/sd ,
in which it stores device entries for the plexes and subdisks.
In addition, it
creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates three super-devices,
.Pa /dev/vinum/control ,
.Pa /dev/vinum/Control
and
.Pa /dev/vinum/controld .
.Pa /dev/vinum/control
is used by
.Xr vinum 8
when it has been compiled without the
.Dv VINUMDEBUG
option,
.Pa /dev/vinum/Control
is used by
.Xr vinum 8
when it has been compiled with the
.Dv VINUMDEBUG
option, and
.Pa /dev/vinum/controld
is used by the
.Nm
daemon.
The two control devices for
.Xr vinum 8
are used to synchronize the debug status of kernel and user modules.
.It
Unlike
.Ux
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Xr newfs 8 ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not
end in the letters
.Ql a
to
.Ql c ,
you must use the
.Fl v
flag to
.Xr newfs 8
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is
the name of the volume followed by the letters
.Pa .p
and the number of the
plex.
For example, the plexes of volume
.Pa vol3
are called
.Pa vol3.p0 , vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters
.Pa .s
and a number identifying the subdisk.
For example, the subdisks of
plex
.Pa vol3.p0
are called
.Pa vol3.p0.s0 , vol3.p0.s1
and so on.
.It
By contrast,
.Em drives
must be named.
This makes it possible to move a drive to a different location
and still recognize it automatically.
Drive names may be up to 32 characters
long.
.El
.Ss Example
Assume the
.Nm
objects described in the section
.Sx "CONFIGURATION FILE"
in
.Xr vinum 8 .
The directory
.Pa /dev/vinum
looks like:
.Bd -literal -offset indent
# ls -lR /dev/vinum
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
crwx------ 1 root wheel 91, 0x40000000 Mar 30 16:08 control
crwx------ 1 root wheel 91, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 drive
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 plex
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 rvol
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 sd
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxrwxrwx 7 root wheel 512 Mar 30 16:08 vol
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
crw-r----- 1 root operator 4, 15 Oct 21 16:51 drive2
crw-r----- 1 root operator 4, 31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 concat.plex
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 strcon.plex
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 stripe.plex
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 tinyvol.plex
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000002 Mar 30 16:08 concat.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p1.sd

/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000003 Mar 30 16:08 strcon.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010003 Mar 30 16:08 strcon.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000001 Mar 30 16:08 stripe.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
crwxr-xr-- 1 root wheel 91, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
crwxr-xr-- 1 root wheel 91, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p0.sd
crwxr-xr-- 1 root wheel 91, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20000004 Mar 30 16:08 vol5.p0.s0
crwxr-xr-- 1 root wheel 91, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
crwxr-xr-- 1 root wheel 91, 0x20010004 Mar 30 16:08 vol5.p1.s0
crwxr-xr-- 1 root wheel 91, 0x20110004 Mar 30 16:08 vol5.p1.s1
.Ed
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks
are named after the disk on which they are located, and plexes are named after
the subdisk.
.\" XXX
.Bf -symbolic
This mapping is still to be determined.
.Ef
.Ss Object States
Each
.Nm
object has a
.Em state
associated with it.
.Nm
uses this state to determine the handling of the object.
.Ss Volume States
Volumes may have the following states:
.Bl -hang -width 14n
.It Em down
The volume is completely inaccessible.
.It Em up
The volume is up and at least partially functional.
Some plexes may be unavailable.
.El
.Ss "Plex States"
Plexes may have the following states:
.Bl -hang -width 14n
.It Em referenced
A plex entry which has been referenced as part of a volume, but which is
currently not known.
.It Em faulty
A plex which has gone completely down because of I/O errors.
.It Em down
A plex which has been taken down by the administrator.
.It Em initializing
A plex which is being initialized.
.El
.Pp
The remaining states represent plexes which are at least partially up.
.Bl -hang -width 14n
.It Em corrupt
A plex entry which is at least partially up.
Not all subdisks are available,
and an inconsistency has occurred.
If no other plex is uncorrupted, the volume
is no longer consistent.
.It Em degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It Em flaky
A plex which is really up, but which has a reborn subdisk which we do not
completely trust, and which we do not want to read if we can avoid it.
.It Em up
A plex entry which is completely up.
All subdisks are up.
.El
.Ss "Subdisk States"
Subdisks can have the following states:
.Bl -hang -width 14n
.It Em empty
A subdisk entry which has been created completely.
All fields are correct, and
the disk has been updated, but the data on the disk is not valid.
.It Em referenced
A subdisk entry which has been referenced as part of a plex, but which is
currently not known.
.It Em initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.El
.Pp
The following states represent invalid data.
.Bl -hang -width 14n
.It Em obsolete
A subdisk entry which has been created completely.
All fields are correct, the
config on disk has been updated, and the data was valid, but since then the
drive has been taken down, and as a result updates have been missed.
.It Em stale
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has
crashed and updates have been lost.
.El
.Pp
The following states represent valid, inaccessible data.
.Bl -hang -width 14n
.It Em crashed
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down.
No attempt has been made to write to the subdisk since the crash, so the
data is valid.
.It Em down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It Em reviving
The subdisk is currently in the process of being revived.
We can write but not
read.
.El
.Pp
The following states represent accessible subdisks with valid data.
.Bl -hang -width 14n
.It Em reborn
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data was valid, but since then the drive has gone
down and up again.
No updates were lost, but it is possible that the subdisk
has been damaged.
We won't read from this subdisk if we have a choice.
If this
is the only subdisk which covers this address space in the plex, we set its
state to up under these circumstances, so this status implies that there is
another subdisk to fulfil the request.
.It Em up
A subdisk entry which has been created completely.
All fields are correct, the
disk has been updated, and the data is valid.
.El
.Ss "Drive States"
Drives can have the following states:
.Bl -hang -width 14n
.It Em referenced
At least one subdisk refers to the drive, but it is not currently accessible to
the system.
No device name is known.
.It Em down
The drive is not accessible.
.It Em up
The drive is up and running.
.El
.Sh DEBUGGING PROBLEMS WITH VINUM
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration problems
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration
updates:
.Pp
.Dl "vinum setdaemon 4"
.Pp
This will stop updates and any further corruption of the on-disk configuration.
.Pp
Next, look at the on-disk configuration with the
.Nm Cm dumpconfig
command, for example:
.if t .ps -3
.if t .vs -3
.Bd -literal
# \fBvinum dumpconfig\fP
Drive 4: Device /dev/da3s0h
        Created on crash.lemis.com at Sat May 20 16:32:44 2000
        Config last updated Sat May 20 16:32:56 2000
        Size: 601052160 bytes (573 MB)
volume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.Ed
.if t .vs +3
.if t .ps +3
.Pp
The configuration on all disks should be the same.
If this is not the case,
please save the output to a file and report the problem.
There is probably
little that can be done to recover the on-disk configuration, but if you keep a
copy of the files used to create the objects, you should be able to re-create
them.
The
.Cm create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Cm resetconfig
command if you have this kind of trouble.
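.Pp
The recovery procedure described above might look like the following sketch.
The file names are hypothetical, and the sketch assumes that you kept the
configuration file originally used to create the objects; see
.Xr vinum 8
for the exact behaviour of each command.
.Bd -literal -offset indent
# Stop configuration updates before doing anything else.
vinum setdaemon 4
# Save the on-disk configuration for a problem report
# (the output file name is only an example).
vinum dumpconfig > /var/tmp/vinum.dumpconfig
# If necessary, discard the current configuration.
vinum resetconfig
# Re-create the objects from your saved configuration file
# (hypothetical path); this does not change the subdisk data.
vinum create /root/vinum.conf
.Ed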
.Ss Kernel Panics
In order to analyse a panic which you suspect comes from
.Nm ,
you will need to build a debug kernel.
See the online handbook at
.Pa http://www.dragonflybsd.org/docs/user/list/DebugKernelCrashDumps/
for more details of how to do this.
.Pp
Perform the following steps to analyse a
.Nm
problem:
.Bl -enum
.It
Copy the following files to the directory in which you will be
performing the analysis, typically
.Pa /var/crash :
.Pp
.Bl -bullet -compact
.It
.Pa /sys/dev/raid/vinum/.gdbinit.crash ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.kernel ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.serial ,
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum
and
.It
.Pa /sys/dev/raid/vinum/.gdbinit.vinum.paths
.El
.It
Make sure that you build the
.Nm
module with debugging information.
The standard
.Pa Makefile
builds a module with debugging symbols by default.
If the version of
.Nm
in
.Pa /boot/kernel
does not contain symbols, you will not get an error message, but the stack trace
will not show the symbols.
Check the module before starting
.Xr kgdb 1 :
.Bd -literal
$ file /boot/kernel/vinum.ko
/boot/kernel/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (SYSV), dynamically linked, not stripped
.Ed
.Pp
If the output shows that
.Pa /boot/kernel/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be
either in
.Pa /usr/obj/usr/src/sys/SYSTEM_NAME/usr/src/sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
with a
.Dq Li "make world" )
or
.Pa /sys/dev/raid/vinum/vinum.ko
(if you have built
.Nm
in this directory).
Modify the file
.Pa .gdbinit.vinum.paths
accordingly.
.It
Either take a dump or use remote serial
.Xr gdb 1
to analyse the problem.
To analyse a dump, say
.Pa /var/crash/vmcore.5 ,
link
.Pa /var/crash/.gdbinit.crash
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug vmcore.5
.Ed
.Pp
This example assumes that you have installed the correct debug kernel at
.Pa /var/crash/kernel.debug .
If not, substitute the correct name of the debug kernel.
.Pp
To perform remote serial debugging,
link
.Pa /var/crash/.gdbinit.serial
to
.Pa /var/crash/.gdbinit
and enter:
.Bd -literal -offset indent
cd /var/crash
kgdb kernel.debug
.Ed
.Pp
In this case, the
.Pa .gdbinit
file performs the functions necessary to establish the connection.
The remote
machine must already be in debug mode: enter the kernel debugger and select
.Ic gdb
(see
.Xr ddb 4
for more details).
The serial
.Pa .gdbinit
file expects the serial connection to run at 38400 bits per second; if you run
at a different speed, edit the file accordingly (look for the
.Va remotebaud
specification).
.Pp
The following example shows a remote debugging session using the
.Ic debug
command of
.Xr vinum 8 :
.Bd -literal
.if t .ps -3
.if t .vs -3
GDB 4.16 (i386-unknown-dragonfly), Copyright 1996 Free Software Foundation, Inc.
Debugger (msg=0xf1093174 "vinum debug") at ../../i386/i386/db_interface.c:318
318 in_Debugger = 0;
#1 0xf108d9bc in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6dedee0 "",
    flag=0x3, p=0xf68b7940) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:102
102 Debugger ("vinum debug");
(kgdb) bt
#0 Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1 0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
    flag=0x3, p=0xf688e6c0) at
    /usr/src/sys/dev/raid/vinum/vinumioctl.c:109
#2 0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3 0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4 0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5 0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
    p=0xf688e6c0) at vnode_if.h:395
#6 0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7 0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
    tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
    tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
    tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
    tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8 0xf020a1fc in Xint0x80_syscall ()
#9 0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
.if t .vs
.if t .ps
.Ed
.Pp
When entering from the debugger, it is important that the source of frame 1
(listed by the
.Pa .gdbinit
file at the top of the example) contains the text
.Dq Li "Debugger (\*[q]vinum debug\*[q]);" .
.Pp
This is an indication that the address specifications are correct.
If you get
some other output, your symbols and the kernel module are out of sync, and the
trace will be meaningless.
.El
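.Pp
As a rough sketch of the file preparation in steps 1 and 3, assuming that the
.Nm
sources are installed under
.Pa /sys/dev/raid/vinum ,
that the analysis directory is
.Pa /var/crash ,
and that a symbolic link is an acceptable way to
.Dq link
the files (the steps do not say which kind of link is meant),
the commands might look like this:
.Bd -literal -offset indent
# Copy the gdb initialization files named in step 1.
cd /sys/dev/raid/vinum
cp .gdbinit.crash .gdbinit.kernel .gdbinit.serial \e
    .gdbinit.vinum .gdbinit.vinum.paths /var/crash
# Select the crash dump setup.  For remote serial debugging,
# link .gdbinit.serial instead.
cd /var/crash
ln -s .gdbinit.crash .gdbinit
.Ed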
.Pp
For an initial investigation, the most important information is the output of
the
.Ic bt
(backtrace) command above.
.Ss Reporting Problems with Vinum
If you find any bugs in
.Nm ,
please report them to
.An Greg Lehey Aq Mt grog@lemis.com .
Supply the following
information:
.Bl -bullet
.It
The output of the
.Nm Cm list
command
(see
.Xr vinum 8 ) .
.It
Any messages printed in
.Pa /var/log/messages .
All such messages will be identified by the text
.Dq Li vinum
at the beginning.
.It
If you have a panic, a stack trace as described above.
.El
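.Pp
One possible way to collect the first two items is sketched below.
It assumes that the system log is in its default location, and the output
file name
.Pa vinum-report.txt
is only a placeholder:
.Bd -literal -offset indent
# Save the current configuration listing.
vinum list > vinum-report.txt
# Append the log messages that mention vinum.
grep vinum /var/log/messages >> vinum-report.txt
.Ed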
.Sh SEE ALSO
.Xr disklabel 5 ,
.Xr disklabel 8 ,
.Xr newfs 8 ,
.Xr vinum 8
.Sh HISTORY
.Nm
first appeared in
.Fx 3.0 .
The RAID-5 component of
.Nm
was developed by Cybernet Inc.\&
.Pq Pa http://www.cybernet.com/ ,
for its NetMAX product.
.Sh AUTHORS
.An Greg Lehey Aq Mt grog@lemis.com .
.Sh BUGS
.Nm
is a new product.
Bugs can be expected.
The configuration mechanism is not yet
fully functional.
If you have difficulties, please look at the section
.Sx "DEBUGGING PROBLEMS WITH VINUM"
before reporting problems.
.Pp
Kernels with the
.Nm
pseudo-device appear to work, but are not supported.
If you have trouble with
this configuration, please first replace the kernel with a
.No non- Ns Nm
kernel and test with the kld module.
.Pp
Detection of differences between the version of the kernel and the kld is not
yet implemented.
.Pp
The RAID-5 functionality is new in
.Fx 3.3 .
Some problems have been
reported with
.Nm
in combination with soft updates, but these are not reproducible on all
systems.
If you are planning to use
.Nm
in a production environment, please test carefully.