| 1 | .\" |
| 2 | .\" Copyright (c) 2006, 2007 |
| 3 | .\" The DragonFly Project. All rights reserved. |
| 4 | .\" |
| 5 | .\" Redistribution and use in source and binary forms, with or without |
| 6 | .\" modification, are permitted provided that the following conditions |
| 7 | .\" are met: |
| 8 | .\" |
| 9 | .\" 1. Redistributions of source code must retain the above copyright |
| 10 | .\" notice, this list of conditions and the following disclaimer. |
| 11 | .\" 2. Redistributions in binary form must reproduce the above copyright |
| 12 | .\" notice, this list of conditions and the following disclaimer in |
| 13 | .\" the documentation and/or other materials provided with the |
| 14 | .\" distribution. |
| 15 | .\" 3. Neither the name of The DragonFly Project nor the names of its |
| 16 | .\" contributors may be used to endorse or promote products derived |
| 17 | .\" from this software without specific, prior written permission. |
| 18 | .\" |
| 19 | .\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS |
| 20 | .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT |
| 21 | .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS |
| 22 | .\" FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE |
| 23 | .\" COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, |
| 24 | .\" INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING, |
| 25 | .\" BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; |
| 26 | .\" LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED |
| 27 | .\" AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, |
| 28 | .\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT |
| 29 | .\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF |
| 30 | .\" SUCH DAMAGE. |
| 31 | .\" |
| 32 | .Dd May 17, 2012 |
| 33 | .Dt VKERNEL 7 |
| 34 | .Os |
| 35 | .Sh NAME |
| 36 | .Nm vkernel , |
| 37 | .Nm vcd , |
| 38 | .Nm vkd , |
| 39 | .Nm vke |
| 40 | .Nd virtual kernel architecture |
| 41 | .Sh SYNOPSIS |
| 42 | .Cd "platform vkernel # for 32 bit vkernels" |
| 43 | .Cd "platform vkernel64 # for 64 bit vkernels" |
| 44 | .Cd "device vcd" |
| 45 | .Cd "device vkd" |
| 46 | .Cd "device vke" |
| 47 | .Pp |
| 48 | .Pa /var/vkernel/boot/kernel/kernel |
| 49 | .Op Fl hsUv |
| 50 | .Op Fl c Ar file |
| 51 | .Op Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ... |
| 52 | .Op Fl i Ar file |
| 53 | .Op Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc |
| 54 | .Op Fl l Ar cpulock |
| 55 | .Op Fl m Ar size |
| 56 | .Op Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc |
| 57 | .Op Fl p Ar pidfile |
| 58 | .Op Fl r Ar file |
| 59 | .Sh DESCRIPTION |
| 60 | The |
| 61 | .Nm |
| 62 | architecture allows for running |
| 63 | .Dx |
| 64 | kernels in userland. |
| 65 | .Pp |
| 66 | The following options are available: |
| 67 | .Bl -tag -width ".Fl m Ar size" |
| 68 | .It Fl c Ar file |
| 69 | Specify a read-only CD-ROM image |
| 70 | .Ar file |
| 71 | to be used by the kernel, with the first |
| 72 | .Fl c |
| 73 | option defining |
| 74 | .Li vcd0 , |
| 75 | the second one |
| 76 | .Li vcd1 , |
| 77 | and so on. |
| 78 | The first |
| 79 | .Fl r |
| 80 | or |
| 81 | .Fl c |
| 82 | option specified on the command line will be the boot disk. |
| 83 | The CD9660 filesystem is assumed when booting from this media. |
| 84 | .It Fl e Ar name Ns = Ns Li value : Ns Ar name Ns = Ns Li value : Ns ... |
| 85 | Specify an environment to be used by the kernel. |
| 86 | This option can be specified more than once. |
| 87 | .It Fl h |
| 88 | Show a list of available options, each with a short description. |
| 89 | .It Fl i Ar file |
| 90 | Specify a memory image |
| 91 | .Ar file |
| 92 | to be used by the virtual kernel. |
| 93 | If no |
| 94 | .Fl i |
| 95 | option is given, the kernel will generate a name of the form |
| 96 | .Pa /var/vkernel/memimg.XXXXXX , |
| 97 | with the trailing |
| 98 | .Ql X Ns s |
| 99 | being replaced by a sequential number, e.g.\& |
| 100 | .Pa memimg.000001 . |
| 101 | .It Fl I Ar interface Ns Op Ar :address1 Ns Oo Ar :address2 Oc Ns Oo Ar /netmask Oc |
| 102 | Create a virtual network device, with the first |
| 103 | .Fl I |
| 104 | option defining |
| 105 | .Li vke0 , |
| 106 | the second one |
| 107 | .Li vke1 , |
| 108 | and so on. |
| 109 | .Pp |
| 110 | The |
| 111 | .Ar interface |
| 112 | argument is the name of a |
| 113 | .Xr tap 4 |
| 114 | device node or the path to a |
| 115 | .Xr vknetd 8 |
| 116 | socket. |
| 117 | The |
| 118 | .Pa /dev/ |
| 119 | path prefix does not have to be specified and will be automatically prepended |
| 120 | for a device node. |
| 121 | Specifying |
| 122 | .Cm auto |
| 123 | will pick the first unused |
| 124 | .Xr tap 4 |
| 125 | device. |
| 126 | .Pp |
| 127 | The |
| 128 | .Ar address1 |
| 129 | and |
| 130 | .Ar address2 |
| 131 | arguments are the IP addresses of the |
| 132 | .Xr tap 4 |
| 133 | and |
| 134 | .Nm vke |
| 135 | interfaces. |
| 136 | Optionally, |
| 137 | .Ar address1 |
| 138 | may be of the form |
| 139 | .Li bridge Ns Em X |
| 140 | in which case the |
| 141 | .Xr tap 4 |
| 142 | interface is added to the specified |
| 143 | .Xr bridge 4 |
| 144 | interface. |
| 145 | The |
| 146 | .Nm vke |
| 147 | address is not assigned until the interface is brought up in the guest. |
| 148 | .Pp |
| 149 | The |
| 150 | .Ar netmask |
| 151 | argument applies to all interfaces for which an address is specified. |
| 152 | .Pp |
| 153 | When running multiple vkernels it is often more convenient to simply |
| 154 | connect to a |
| 155 | .Xr vknetd 8 |
| 156 | socket and let vknetd deal with the tap and/or bridge. |
| 157 | An example of this would be '/var/run/vknet:0.0.0.0:10.2.0.2/16'. |
| 158 | .It Fl l Ar cpulock |
| 159 | Specify which, if any, real CPUs to lock virtual CPUs to. |
| 160 | .Ar cpulock |
| 161 | is one of |
| 162 | .Cm any , |
| 163 | .Cm map Ns Op Ns , Ns Ar startCPU , |
| 164 | or |
| 165 | .Ar CPU . |
| 166 | .Pp |
| 167 | .Cm any |
| 168 | does not map virtual CPUs to real CPUs. |
| 169 | This is the default. |
| 170 | .Pp |
| 171 | .Cm map Ns Op Ns , Ns Ar startCPU |
| 172 | maps each virtual CPU to a real CPU starting with real CPU 0 or |
| 173 | .Ar startCPU |
| 174 | if specified. |
| 175 | .Pp |
| 176 | .Ar CPU |
| 177 | locks all virtual CPUs to the real CPU specified by |
| 178 | .Ar CPU . |
| 179 | .It Fl m Ar size |
| 180 | Specify the amount of memory to be used by the kernel in bytes, |
| 181 | .Cm K |
| 182 | .Pq kilobytes , |
| 183 | .Cm M |
| 184 | .Pq megabytes |
| 185 | or |
| 186 | .Cm G |
| 187 | .Pq gigabytes . |
| 188 | Lowercase versions of |
| 189 | .Cm K , M , |
| 190 | and |
| 191 | .Cm G |
| 192 | are allowed. |
| 193 | .It Fl n Ar numcpus Ns Op Ar :lbits Ns Oo Ar :cbits Oc |
| 194 | .Ar numcpus |
| 195 | specifies the number of CPUs you wish to emulate. |
| 196 | Up to 16 CPUs are supported. |
| 197 | The virtual kernel must be built with |
| 198 | .Cd options SMP |
| 199 | to use this option and will default to 2 CPUs unless otherwise specified. |
| 200 | .Ar lbits |
| 201 | specifies the number of bits within APICID(=CPUID) needed for representing |
| 202 | the logical ID. |
| 203 | It controls the number of threads per core (0 bits: 1 thread, 1 bit: 2 threads). |
| 204 | This parameter is optional (mandatory only if |
| 205 | .Ar cbits |
| 206 | is specified). |
| 207 | .Ar cbits |
| 208 | specifies the number of bits within APICID(=CPUID) needed for representing |
| 209 | the core ID. |
| 210 | It controls the number of cores per package (0 bits: 1 core, 1 bit: 2 cores). |
| 211 | This parameter is optional. |
| 212 | .It Fl p Ar pidfile |
| 213 | Specify a pidfile in which to store the process ID. |
| 214 | Scripts can use this file to locate the vkernel pid for the purpose of |
| 215 | shutting down or killing it. |
| 216 | .Pp |
| 217 | The vkernel will hold a lock on the pidfile while running. |
| 218 | Scripts may test for the lock to determine if the pidfile is valid or |
| 219 | stale so as to avoid accidentally killing a random process. |
| 220 | Something like '/usr/bin/lockf -ks -t 0 pidfile echo -n' may be used |
| 221 | to test the lock. |
| 222 | A non-zero exit code indicates that the pidfile represents a running |
| 223 | vkernel. |
| 224 | .Pp |
| 225 | An error is issued and the vkernel exits if this file cannot be opened for |
| 226 | writing or if it is already locked by an active vkernel process. |
| 227 | .It Fl r Ar file |
| 228 | Specify a R/W disk image |
| 229 | .Ar file |
| 230 | to be used by the kernel, with the first |
| 231 | .Fl r |
| 232 | option defining |
| 233 | .Li vkd0 , |
| 234 | the second one |
| 235 | .Li vkd1 , |
| 236 | and so on. |
| 237 | The first |
| 238 | .Fl r |
| 239 | or |
| 240 | .Fl c |
| 241 | option specified on the command line will be the boot disk. |
| 242 | .It Fl s |
| 243 | Boot into single-user mode. |
| 244 | .It Fl U |
| 245 | Enable writing to kernel memory and module loading. |
| 246 | By default, those are disabled for security reasons. |
| 247 | .It Fl v |
| 248 | Turn on verbose booting. |
| 249 | .El |
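The relationship between the
lbits
and
cbits
arguments of the -n option and the resulting topology can be checked with a little shell arithmetic. The following is only a sketch of the power-of-two rule stated above; the function name vkernel_topology is hypothetical and is not part of the vkernel.

```shell
#!/bin/sh
# Sketch: derive the topology implied by the lbits and cbits arguments of -n.
# vkernel_topology is a hypothetical helper, not a vkernel command.
vkernel_topology() {
    lbits=${1:-0}                 # bits for the logical (thread) ID
    cbits=${2:-0}                 # bits for the core ID
    threads=$((1 << lbits))       # threads per core
    cores=$((1 << cbits))         # cores per package
    echo "threads/core=$threads cores/package=$cores cpus/package=$((threads * cores))"
}

vkernel_topology 1 2   # prints threads/core=2 cores/package=4 cpus/package=8
```

Under this reading, an argument such as -n 8:1:2 would describe 8 logical CPUs arranged as 4 cores with 2 threads each.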
| 250 | .Sh DEVICES |
| 251 | A number of virtual device drivers exist to supplement the virtual kernel. |
| 252 | .Ss Disk device |
| 253 | The |
| 254 | .Nm vkd |
| 255 | driver allows for up to 16 |
| 256 | .Xr vn 4 |
| 257 | based disk devices. |
| 258 | The root device will be |
| 259 | .Li vkd0 |
| 260 | (see |
| 261 | .Sx EXAMPLES |
| 262 | for further information on how to prepare a root image). |
| 263 | .Ss CD-ROM device |
| 264 | The |
| 265 | .Nm vcd |
| 266 | driver allows for up to 16 virtual CD-ROM devices. |
| 267 | This is essentially a read-only |
| 268 | .Nm vkd |
| 269 | device with a block size of 2048. |
| 270 | .Ss Network interface |
| 271 | The |
| 272 | .Nm vke |
| 273 | driver supports up to 16 virtual network interfaces which are associated with |
| 274 | .Xr tap 4 |
| 275 | devices on the host. |
| 276 | For each |
| 277 | .Nm vke |
| 278 | device, the per-interface read-only |
| 279 | .Xr sysctl 3 |
| 280 | variable |
| 281 | .Va hw.vke Ns Em X Ns Va .tap_unit |
| 282 | holds the unit number of the associated |
| 283 | .Xr tap 4 |
| 284 | device. |
| 285 | .Sh SIGNALS |
| 286 | The virtual kernel only enables |
| 287 | .Dv SIGQUIT |
| 288 | and |
| 289 | .Dv SIGTERM |
| 290 | while operating in regular console mode. |
| 291 | Sending |
| 292 | .Ql \&^\e |
| 293 | .Pq Dv SIGQUIT |
| 294 | to the virtual kernel causes the virtual kernel to enter its internal |
| 295 | .Xr ddb 4 |
| 296 | debugger and re-enable all other terminal signals. |
| 297 | Sending |
| 298 | .Dv SIGTERM |
| 299 | to the virtual kernel triggers a clean shutdown by passing a |
| 300 | .Dv SIGUSR2 |
| 301 | to the virtual kernel's |
| 302 | .Xr init 8 |
| 303 | process. |
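Combining the SIGTERM behaviour with the pidfile written by the -p option, a host-side script could shut a vkernel down roughly as follows. This is a minimal sketch: the function name vkernel_stop and the overridable LOCKF variable are hypothetical, and it assumes the pidfile contains nothing but the process ID. The lockf invocation is the one suggested in the description of the -p option.

```shell
#!/bin/sh
# Sketch: cleanly stop a vkernel through its pidfile.
# LOCKF is overridable for illustration; the default is the lockf(1) path
# quoted in the description of the -p option.
LOCKF=${LOCKF:-/usr/bin/lockf}

# vkernel_stop is a hypothetical helper, not shipped with the system.
vkernel_stop() {
    pidfile=$1
    [ -r "$pidfile" ] || return 2       # no pidfile to inspect

    # If lockf succeeds, nobody holds the lock, so the pidfile is stale.
    if "$LOCKF" -ks -t 0 "$pidfile" echo -n; then
        return 1
    fi

    # The lock is held, so the recorded pid belongs to a running vkernel.
    # Assumption: the pidfile contains only the process ID.
    pid=$(cat "$pidfile")
    kill -TERM "$pid"                   # triggers the clean shutdown
}
```

A caller might invoke it as vkernel_stop /var/vkernel/vkernel.pid (the path is only an example) and treat a return value of 1 as a stale pidfile.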
| 304 | .Sh DEBUGGING |
| 305 | It is possible to debug the virtual kernel's process directly with gdb. |
| 306 | It is recommended that you do a |
| 307 | .Ql handle SIGSEGV noprint |
| 308 | to ignore page faults processed by the virtual kernel itself and |
| 309 | .Ql handle SIGUSR1 noprint |
| 310 | to ignore signals used for simulating inter-processor interrupts (SMP build |
| 311 | only). |
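The two handle commands can also be passed on the gdb command line when attaching to a running vkernel. The sketch below assumes a gdb that supports the -p and -ex options; the function name vkernel_gdb and the overridable GDB variable are hypothetical.

```shell
#!/bin/sh
# Sketch: attach gdb to a running vkernel with the recommended signal handling.
# vkernel_gdb is a hypothetical helper; GDB may be overridden to select
# a different debugger binary.
vkernel_gdb() {
    pid=$1
    ${GDB:-gdb} -p "$pid" \
        -ex 'handle SIGSEGV noprint' \
        -ex 'handle SIGUSR1 noprint'
}
```

The process ID would typically be taken from the pidfile written by the -p option.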
| 312 | .Sh PROFILING |
| 313 | To compile a vkernel with profiling support, the |
| 314 | .Va CONFIGARGS |
| 315 | variable needs to be used to pass |
| 316 | .Fl p |
| 317 | to |
| 318 | .Xr config 8 . |
| 319 | .Bd -literal |
| 320 | cd /usr/src |
| 321 | make -DNO_MODULES CONFIGARGS=-p buildkernel KERNCONF=VKERNEL |
| 322 | .Ed |
| 323 | .Sh FILES |
| 324 | .Bl -tag -width ".It Pa /sys/config/VKERNEL" -compact |
| 325 | .It Pa /sys/config/VKERNEL |
| 326 | .It Pa /sys/config/VKERNEL64 |
| 327 | .El |
| 328 | .Pp |
| 329 | Per-architecture |
| 330 | .Nm |
| 331 | configuration files, for |
| 332 | .Xr config 8 . |
| 333 | .Sh CONFIGURATION FILES |
| 334 | Your virtual kernel is a complete |
| 335 | .Dx |
| 336 | system, but you might not want to run all the services a normal kernel runs. |
| 337 | Here is what a typical virtual kernel's |
| 338 | .Pa /etc/rc.conf |
| 339 | file looks like, with some additional possibilities commented out. |
| 340 | .Bd -literal |
| 341 | hostname="vkernel" |
| 342 | network_interfaces="lo0 vke0" |
| 343 | ifconfig_vke0="DHCP" |
| 344 | sendmail_enable="NO" |
| 345 | #syslog_enable="NO" |
| 346 | blanktime="NO" |
| 347 | .Ed |
| 348 | .Sh DISKLESS OPERATION |
| 349 | To boot a |
| 350 | .Nm |
| 351 | from an NFS root, a number of tunables need to be set: |
| 352 | .Bl -tag -width indent |
| 353 | .It Va boot.netif.ip |
| 354 | IP address to be set in the vkernel interface. |
| 355 | .It Va boot.netif.netmask |
| 356 | Netmask for the IP to be set. |
| 357 | .It Va boot.netif.name |
| 358 | Network interface name inside the vkernel. |
| 359 | .It Va boot.nfsroot.server |
| 360 | Host running |
| 361 | .Xr nfsd 8 . |
| 362 | .It Va boot.nfsroot.path |
| 363 | Path on the host where the world and distribution |
| 364 | targets have been installed. |
| 365 | .El |
| 366 | .Pp |
| 367 | An example of how to boot a diskless |
| 368 | .Nm |
| 369 | is given in the |
| 370 | .Sx EXAMPLES |
| 371 | section. |
| 372 | .Sh EXAMPLES |
| 373 | A couple of steps are necessary in order to prepare the system to build and |
| 374 | run a virtual kernel. |
| 375 | .Ss Setting up the filesystem |
| 376 | The |
| 377 | .Nm |
| 378 | architecture needs a number of files which reside in |
| 379 | .Pa /var/vkernel . |
| 380 | Since these files tend to get rather big and the |
| 381 | .Pa /var |
| 382 | partition is usually of limited size, we recommend creating the |
| 383 | directory in the |
| 384 | .Pa /home |
| 385 | partition with a link to it in |
| 386 | .Pa /var : |
| 387 | .Bd -literal |
| 388 | mkdir -p /home/var.vkernel/boot |
| 389 | ln -s /home/var.vkernel /var/vkernel |
| 390 | .Ed |
| 391 | .Pp |
| 392 | Next, a filesystem image to be used by the virtual kernel has to be |
| 393 | created and populated (assuming world has been built previously). |
| 394 | If the image is created on a UFS filesystem you might want to pre-zero it. |
| 395 | On a HAMMER filesystem you should just truncate-extend to the image size |
| 396 | as HAMMER does not re-use data blocks already present in the file. |
| 397 | .Bd -literal |
| 398 | vnconfig -c -S 2g -T vn0 /var/vkernel/rootimg.01 |
| 399 | disklabel -r -w vn0s0 auto |
| 400 | disklabel -e vn0s0 # add `a' partition with fstype `4.2BSD' |
| 401 | newfs /dev/vn0s0a |
| 402 | mount /dev/vn0s0a /mnt |
| 403 | cd /usr/src |
| 404 | make installworld DESTDIR=/mnt |
| 405 | cd etc |
| 406 | make distribution DESTDIR=/mnt |
| 407 | echo '/dev/vkd0s0a / ufs rw 1 1' >/mnt/etc/fstab |
| 408 | echo 'proc /proc procfs rw 0 0' >>/mnt/etc/fstab |
| 409 | .Ed |
| 410 | .Pp |
| 411 | Edit |
| 412 | .Pa /mnt/etc/ttys |
| 413 | and replace the |
| 414 | .Li console |
| 415 | entry with the following line and turn off all other gettys. |
| 416 | .Bd -literal |
| 417 | console "/usr/libexec/getty Pc" cons25 on secure |
| 418 | .Ed |
| 419 | .Pp |
| 420 | Replace |
| 421 | .Li \&Pc |
| 422 | with |
| 423 | .Li al.Pc |
| 424 | if you would like to automatically log in as root. |
| 425 | .Pp |
| 426 | Then, unmount the disk. |
| 427 | .Bd -literal |
| 428 | umount /mnt |
| 429 | vnconfig -u vn0 |
| 430 | .Ed |
| 431 | .Ss Compiling the virtual kernel |
| 432 | In order to compile a virtual kernel use the |
| 433 | .Li VKERNEL |
| 434 | kernel configuration file residing in |
| 435 | .Pa /sys/config |
| 436 | (or a configuration file derived from it): |
| 437 | .Bd -literal |
| 438 | cd /usr/src |
| 439 | make -DNO_MODULES buildkernel KERNCONF=VKERNEL |
| 440 | make -DNO_MODULES installkernel KERNCONF=VKERNEL DESTDIR=/var/vkernel |
| 441 | .Ed |
| 442 | .Ss Enabling virtual kernel operation |
| 443 | A special |
| 444 | .Xr sysctl 8 , |
| 445 | .Va vm.vkernel_enable , |
| 446 | must be set to enable |
| 447 | .Nm |
| 448 | operation: |
| 449 | .Bd -literal |
| 450 | sysctl vm.vkernel_enable=1 |
| 451 | .Ed |
| 452 | .Ss Configuring the network on the host system |
| 453 | In order to access a network interface of the host system from the |
| 454 | .Nm , |
| 455 | you must add the interface to a |
| 456 | .Xr bridge 4 |
| 457 | device which will then be passed to the |
| 458 | .Fl I |
| 459 | option: |
| 460 | .Bd -literal |
| 461 | kldload if_bridge.ko |
| 462 | kldload if_tap.ko |
| 463 | ifconfig bridge0 create |
| 464 | ifconfig bridge0 addm re0 # assuming re0 is the host's interface |
| 465 | ifconfig bridge0 up |
| 466 | .Ed |
| 467 | .Ss Running the kernel |
| 468 | Finally, the virtual kernel can be run: |
| 469 | .Bd -literal |
| 470 | cd /var/vkernel |
| 471 | \&./boot/kernel/kernel -m 64m -r rootimg.01 -I auto:bridge0 |
| 472 | .Ed |
| 473 | .Pp |
| 474 | You can issue the |
| 475 | .Xr reboot 8 , |
| 476 | .Xr halt 8 , |
| 477 | or |
| 478 | .Xr shutdown 8 |
| 479 | commands from inside a virtual kernel. |
| 480 | After doing a clean shutdown the |
| 481 | .Xr reboot 8 |
| 482 | command will re-exec the virtual kernel binary while the other two will |
| 483 | cause the virtual kernel to exit. |
| 484 | .Ss Diskless operation |
| 485 | Booting a |
| 486 | .Nm |
| 487 | with a |
| 488 | .Xr vknetd 8 |
| 489 | network configuration: |
| 490 | .Bd -literal |
| 491 | \&./boot/kernel/kernel -m 64m -i memimg.0000 -I /var/run/vknet \e |
| 492 | -e boot.netif.ip=172.1.0.4 \e |
| 493 | -e boot.netif.netmask=255.255.0.0 \e |
| 494 | -e boot.netif.name=vke0 \e |
| 495 | -e boot.nfsroot.server=172.1.0.1 \e |
| 496 | -e boot.nfsroot.path=/home/vkernel/vkdiskless |
| 497 | .Ed |
| 498 | .Sh BUILDING THE WORLD UNDER A VKERNEL |
| 499 | The virtual kernel platform does not have all the header files expected |
| 500 | by a world build, so the easiest thing to do right now is to specify a |
| 501 | pc32 (in a 32 bit vkernel) or pc64 (in a 64 bit vkernel) target when |
| 502 | building the world under a virtual kernel, like this: |
| 503 | .Bd -literal |
| 504 | vkernel# make MACHINE_PLATFORM=pc32 buildworld |
| 505 | vkernel# make MACHINE_PLATFORM=pc32 installworld |
| 506 | .Ed |
| 507 | .Sh SEE ALSO |
| 508 | .Xr vknet 1 , |
| 509 | .Xr bridge 4 , |
| 510 | .Xr tap 4 , |
| 511 | .Xr vn 4 , |
| 512 | .Xr sysctl.conf 5 , |
| 513 | .Xr build 7 , |
| 514 | .Xr config 8 , |
| 515 | .Xr disklabel 8 , |
| 516 | .Xr ifconfig 8 , |
| 517 | .Xr vknetd 8 , |
| 518 | .Xr vnconfig 8 |
| 519 | .Rs |
| 520 | .%A Aggelos Economopoulos |
| 521 | .%D March 2007 |
| 522 | .%T "A Peek at the DragonFly Virtual Kernel" |
| 523 | .Re |
| 524 | .Sh HISTORY |
| 525 | Virtual kernels were introduced in |
| 526 | .Dx 1.7 . |
| 527 | .Sh AUTHORS |
| 528 | .An -nosplit |
| 529 | .An Matt Dillon |
| 530 | thought up and implemented the |
| 531 | .Nm |
| 532 | architecture and wrote the |
| 533 | .Nm vkd |
| 534 | device driver. |
| 535 | .An Sepherosa Ziehau |
| 536 | wrote the |
| 537 | .Nm vke |
| 538 | device driver. |
| 539 | This manual page was written by |
| 540 | .An Sascha Wildner . |