# Debug the DragonFly kernel

This chapter introduces how to obtain a crash dump after a kernel panic and how to extract information from the dump that is useful to the developers.

[[!toc levels=3 ]]
***Contributed by Matthias Schmidt***

## Configure your system

Normally, a crash dump is saved to your swap partition after a crash. On the next boot, [savecore(8)](http://leaf.dragonflybsd.org/cgi/web-man?command=savecore&section=8) extracts the dump from that partition and stores it in `/var/crash`. Because `/var` is often a relatively small partition, the dump may not be saved if it is larger than the remaining free space.

To avoid this problem, you can change the following settings in `/etc/rc.conf`:

[[!table header=no data="""
> `dumpdev` | | Indicates the device (usually a swap partition) to which a crash dump should be written in the event of a system crash.
> `dumpdir` | | savecore(8) saves the crash dump and a copy of the kernel to the directory specified by the `dumpdir` variable. The default value is `/var/crash`. You can set this to a directory on another partition with more free space so that the dump is saved safely.
"""]]

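As a rough sketch, the two settings might look like this in `/etc/rc.conf`. The device name is taken from the swapinfo example on this page, and `/usr/crash` is only an assumed directory on a larger partition; substitute your own values.

```shell
# /etc/rc.conf (fragment; both values are examples, not defaults)
dumpdev="/dev/ad0s1b"   # swap partition that receives the crash dump
dumpdir="/usr/crash"    # directory where savecore(8) stores the dump
```
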
If you are unsure which device is your swap partition, use [swapinfo(8)](http://leaf.dragonflybsd.org/cgi/web-man?command=swapinfo&section=8) or look into `/etc/fstab`:

    # swapinfo
    Device          1K-blocks     Used    Avail Capacity  Type
    /dev/ad0s1b       1048448        0  1048448     0%    Interleaved

    # cat /etc/fstab | grep swap
    /dev/ad0s1b             none            swap    sw              0       0

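If you want only the device name, for example to paste it into `dumpdev`, the fstab lookup can be narrowed down with awk. This is a minimal sketch; `swap_devices` is a hypothetical helper name, and it assumes the usual fstab layout with the file system type in the third column. Call it as `swap_devices` to read `/etc/fstab`, or pass another file as the first argument.

```shell
# Print the device column of every swap entry in an fstab-style file.
# Usage: swap_devices [file]   (defaults to /etc/fstab)
swap_devices() {
    # Skip comment lines, keep entries whose third field is "swap"
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}
```
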
### Enable debugging options in your custom kernel config

If you run a custom kernel, add the following line to your kernel configuration file to compile the kernel with debugging symbols:

    makeoptions     DEBUG=-g        #Build kernel with gdb(1) debug symbols

If you want additional support for the interactive kernel debugger [ddb(4)](http://leaf.dragonflybsd.org/cgi/web-man?command=ddb&section=4) and invariant debugging, also add these lines:

    # Debugging for Development
    options         DDB
    options         DDB_TRACE
    options         INVARIANTS

The default GENERIC kernel already has debugging enabled, so you don't have to change anything there.

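To double-check a custom configuration file before building, a small grep-based check can report which of the settings named above are present. This is only a sketch: `check_debug_config` is a hypothetical helper, the path of the configuration file is supplied by you, and the patterns assume one setting per line as in the examples above.

```shell
# Report which debugging settings from this page appear in a kernel
# configuration file. Usage: check_debug_config <config-file>
check_debug_config() {
    conf=$1
    if grep -Eq '^[[:space:]]*makeoptions[[:space:]]+DEBUG=-g' "$conf"; then
        echo "DEBUG=-g: present"
    else
        echo "DEBUG=-g: missing"
    fi
    for opt in DDB DDB_TRACE INVARIANTS; do
        # Match "options <name>" followed by whitespace or end of line
        if grep -Eq "^[[:space:]]*options[[:space:]]+${opt}([[:space:]]|\$)" "$conf"; then
            echo "$opt: present"
        else
            echo "$opt: missing"
        fi
    done
}
```
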
## What does a crash look like?

The easy answer: your system stopped working. The more detailed one: your system encountered a [panic(9)](http://leaf.dragonflybsd.org/cgi/web-man?command=panic&section=9) and dropped into [ddb(4)](http://leaf.dragonflybsd.org/cgi/web-man?command=ddb&section=4), the interactive kernel debugger.

The output during a crash might look like this:

    Fatal trap 12: page fault while in kernel mode
    fault virtual address   = 0xd0686f55
    fault code              = supervisor read, page not present
    instruction pointer     = 0x8:0xc02ddb9a
    stack pointer           = 0x10:0xcec0fb18
    frame pointer           = 0x10:0xcec0fb18
    code segment            = base 0x0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, def32 1, gran 1
    processor eflags        = interrupt enabled, resume, IOPL 0
    current process         = 50725 (sysctl)
    current thread          = pri 6

    panic: from debugger

Before your machine reboots, a crash dump is written to your swap partition (if you have one and haven't disabled crash dumps). Writing the dump to disk takes some time, depending on your machine and the amount of RAM installed. It might look like this:

    dumping to dev #ad/0x20001, blockno 1049088
    dump 511 510 509 508 507 506 505
    [...]
    26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 succeeded

Now your machine reboots, checks its file systems and finally extracts the crash dump from the swap partition to your `dumpdir` (see the `rc.conf` settings above). If your `/var` partition is too small, you'll see an error similar to the following:

    savecore: reboot after panic: from debugger
    savecore: no dump, not enough free space on device (231420 available, need 541840)

If this happens, you have to extract the crash dump yourself. The next section explains how to do this.

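Before picking a target directory, it can help to compare its free space with the size savecore asked for. The sketch below is not part of savecore(8). `free_kb` and `has_room` are hypothetical helpers, `df -Pk` is assumed to be available for POSIX-style output in 1K blocks, and the figures in the savecore message are assumed to be in kilobytes as well.

```shell
# Print the free space, in 1K blocks, of the file system holding a directory.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if <available> blocks are enough for a dump needing <needed> blocks.
# Usage: has_room <available> <needed>
has_room() {
    [ "$1" -ge "$2" ]
}

# Example with the two figures from the savecore message above
if has_room 231420 541840; then
    echo "enough space"
else
    echo "not enough space"
fi
# prints "not enough space"
```

For a real check, something like `has_room "$(free_kb /usr/crash)" 541840` would compare the directory's free space with the required size.
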
### Extract a crash dump manually

You can use [savecore(8)](http://leaf.dragonflybsd.org/cgi/web-man?command=savecore&section=8) to copy your currently running kernel and the associated crash dump to a directory of your choice (this example uses `/usr/crash`):

    # mkdir -p /usr/crash
    # chmod 700 /usr/crash
    # savecore /usr/crash/
    [...]

This takes some time, depending on the speed of your machine. See the savecore(8) man page for further options.

### Upload the crash dump

If you don't have the ability or skills to debug a crash yourself, please upload the complete contents of your crash directory to a publicly available location (HTTP or FTP web space, or your leaf account) and send a detailed bug report to the bugs@dragonflybsd.org list. If possible, please tar and compress (gzip, bzip2) the directory to save disk space and bandwidth.

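One possible way to pack the directory is a gzip-compressed tarball. The helper below is a sketch; `pack_crash_dir` and the output path are assumptions, and bzip2 would work equally well. A typical call might be `pack_crash_dir /usr/crash /tmp/crash.tar.gz`.

```shell
# Pack a crash directory into a gzip-compressed tarball.
# Usage: pack_crash_dir <crash-dir> <output.tar.gz>
pack_crash_dir() {
    dir=$1
    out=$2
    # -C switches to the parent directory so the archive stores the
    # relative path (e.g. crash/vmcore.0) instead of an absolute one
    tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")"
}
```
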
## Debug the crash dump with kgdb

The [kgdb(1)](http://leaf.dragonflybsd.org/cgi/web-man?command=kgdb&section=1) utility is a debugger based on gdb(1) that allows debugging of kernel core files.

### kgdb extensions

To get some handy helper commands, execute the following command before starting kgdb:

    source /usr/src/test/debug/gdb.kernel

This gives you several new commands such as `kldstat` (displays all loaded modules) or `psx` (displays all running processes).

Start kgdb as follows:

    # cd /usr/crash
    # ls -l
    -rw-r--r--  1 root  wheel     2B Jan  7 17:07 bounds
    -rw-r--r--  1 root  wheel    17M Jan  7 17:08 kernel.0
    -rw-------  1 root  wheel   512M Jan  7 17:08 vmcore.0
    # kgdb kernel.0 vmcore.0

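When several dumps have accumulated, the `bounds` file can help to find the newest pair. The sketch below assumes that `bounds` holds the number savecore will use for the next dump, so the most recent dump is that number minus one; this matches the listing above, where the files end in `.0`. `latest_dump` is a hypothetical helper name, and it could be used as `kgdb $(latest_dump /usr/crash)`.

```shell
# Print the kernel and vmcore file names of the most recent dump in a
# crash directory, assuming "bounds" stores the number of the NEXT dump.
# Usage: latest_dump <crash-dir>
latest_dump() {
    dir=$1
    next=$(cat "$dir/bounds") || return 1
    [ "$next" -ge 1 ] || return 1        # no dump has been saved yet
    n=$((next - 1))
    echo "$dir/kernel.$n $dir/vmcore.$n"
}
```
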
kgdb(1) shows the panic message on startup. The first thing to do is to obtain a ***backtrace*** with the ***bt*** command:

    Unread portion of the kernel message buffer:

    Fatal trap 12: page fault while in kernel mode
    fault virtual address   = 0xd0686f55
    fault code              = supervisor read, page not present
    instruction pointer     = 0x8:0xc02ddb9a
    stack pointer           = 0x10:0xcec0fb18
    frame pointer           = 0x10:0xcec0fb18
    code segment            = base 0x0, limit 0xfffff, type 0x1b
    current process         = 50725 (sysctl)
    current thread          = pri 6

    panic: from debugger

    Fatal trap 3: breakpoint instruction fault while in kernel mode
    instruction pointer     = 0x8:0xc03136a4
    stack pointer           = 0x10:0xcec0f92c
    frame pointer           = 0x10:0xcec0f934
    code segment            = base 0x0, limit 0xfffff, type 0x1b
                            = DPL 0, pres 1, def32 1, gran 1
    processor eflags        = interrupt enabled, IOPL 0
    current process         = 50725 (sysctl)
    current thread          = pri 6

    panic: from debugger
    Uptime: 3h57m22s

    dumping to dev #ad/0x20001, blockno 1049088
    dump 511 510 509 508 507 506 505 504 503 502 501 500 499 498
    [...]
    40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

    GNU gdb 6.2.1
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i386-dragonfly".
    (kgdb) bt
    #0 dumpsys () at thread.h:83
    #1 0xc01c4e1b in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:375
    #2 0xc01c4f3c in panic (fmt=Variable "fmt" is not available.
    ) at /usr/src/sys/kern/kern_shutdown.c:800
    #3 0xc0149be5 in db_panic (addr=Could not find the frame base for "db_panic".
    ) at /usr/src/sys/ddb/db_command.c:447
    #4 0xc014a250 in db_command_loop () at /usr/src/sys/ddb/db_command.c:343
    #5 0xc014c7bc in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:71
    #6 0xc03137f7 in kdb_trap (type=12, code=0, regs=0xcec0fad0) at /usr/src/sys/platform/pc32/i386/db_interface.c:148
    #7 0xc032384b in trap_fatal (frame=0xcec0fad0, eva=Variable "eva" is not available.
    ) at /usr/src/sys/platform/pc32/i386/trap.c:1091
    #8 0xc03239b0 in trap_pfault (frame=0xcec0fad0, usermode=0, eva=3496505173)
    at /usr/src/sys/platform/pc32/i386/trap.c:997
    #9 0xc03241a0 in trap (frame=0xcec0fad0) at /usr/src/sys/platform/pc32/i386/trap.c:680
    #10 0xc0314506 in calltrap () at /usr/src/sys/platform/pc32/i386/exception.s:783
    #11 0xc02ddb9a in strlen (str=0xd0686f55 <Address 0xd0686f55 out of bounds>) at /usr/src/sys/libkern/strlen.c:41
    #12 0xc02c2153 in sysctl_vm_zone (oidp=0xc03b42a0, arg1=0x0, arg2=0, req=0xcec0fc08) at /usr/src/sys/vm/vm_zone.c:447
    #13 0xc01cf935 in sysctl_root (oidp=Variable "oidp" is not available.
    ) at /usr/src/sys/kern/kern_sysctl.c:1193
    #14 0xc01cfa27 in userland_sysctl (name=0xcec0fc90, namelen=2, old=0x0, oldlenp=0xbfbfe8f0, inkernel=0, new=0x0,
    newlen=0, retval=0xcec0fc8c) at /usr/src/sys/kern/kern_sysctl.c:1268
    #15 0xc01cfc28 in sys___sysctl (uap=0xcec0fcf0) at /usr/src/sys/kern/kern_sysctl.c:1211
    #16 0xc0323ccb in syscall2 (frame=0xcec0fd40) at /usr/src/sys/platform/pc32/i386/trap.c:1339
    #17 0xc03145a5 in Xint0x80_syscall () at /usr/src/sys/platform/pc32/i386/exception.s:872
    #18 0x08055d38 in ?? ()
    #19 0xbfbfe86c in ?? ()
    #20 0x0000002f in ?? ()
    #21 0x00000000 in ?? ()
    #22 0x00000000 in ?? ()
    #23 0x00000000 in ?? ()
    #24 0x00000000 in ?? ()
    #25 0x13c4b000 in ?? ()
    #26 0x00000001 in ?? ()
    #27 0xc03c2bf8 in intr_info_ary ()
    #28 0xcec0f8d4 in ?? ()
    #29 0xcec0f8c4 in ?? ()
    #30 0xc8076300 in ?? ()
    #31 0xc01cac5a in lwkt_preempt (ntd=0x2, critpri=Cannot access memory at address 0xbfbfe8a4
    ) at /usr/src/sys/kern/lwkt_thread.c:893
    Previous frame inner to this frame (corrupt stack?)

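The backtrace shows the faulting address in two notations: `eva=3496505173` in frame 8 is decimal, while `str=0xd0686f55` in frame 11 and the `fault virtual address` line are hexadecimal. A quick conversion with plain shell `printf` (not a kgdb feature) shows that they are the same address, which suggests that `strlen` was called with a bad pointer:

```shell
# Convert the decimal eva value from frame 8 to hexadecimal
printf '0x%x\n' 3496505173
# prints 0xd0686f55
```
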
kgdb(1) lets you look into specific frames, display the contents of variables and list the source code (if your kernel was compiled with `-g`):

    (kgdb) f 13
    #13 0xc01cf935 in sysctl_root (oidp=Variable "oidp" is not available.
    ) at /usr/src/sys/kern/kern_sysctl.c:1193
    1193            error = oid->oid_handler(oid, oid->oid_arg1, oid->oid_arg2,
    (kgdb) l
    1188
    1189            if ((oid->oid_kind & CTLTYPE) == CTLTYPE_NODE)
    1190                    error = oid->oid_handler(oid, (int *)arg1 + indx, arg2 - indx,
    1191                        req);
    1192            else
    1193                    error = oid->oid_handler(oid, oid->oid_arg1, oid->oid_arg2,
    1194                        req);
    1195            return (error);
    1196    }
    1197
    (kgdb) p *oid
    $1 = {oid_parent = 0xc03cbda8, oid_link = {sle_next = 0x0}, oid_number = 283, oid_kind = -2147483645, oid_arg1 = 0x0,
    oid_arg2 = 0, oid_name = 0xc03616ad "zone", oid_handler = 0xc02c20fa <sysctl_vm_zone>, oid_fmt = 0xc036a56f "A",
    oid_refcnt = 0, oid_descr = 0xc036906a "Zone Info"}

## Further Information

To get more information about how to use a debugger, look here:

* [Man page of kgdb(1)](http://leaf.dragonflybsd.org/cgi/web-man?command=kgdb&section=1)
* [Man page of gdb(1)](http://leaf.dragonflybsd.org/cgi/web-man?command=gdb&section=1)
* [How to retrieve symbols from kernel modules](http://leaf.dragonflybsd.org/mailarchive/kernel/2005-11/msg00065.html)
* [FreeBSD Developers Handbook](http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/book.html#KERNELDEBUG)
* [GDB Manual](http://sourceware.org/gdb/documentation/)
* [Debug tutorial from Greg Lehey](http://www.lemis.com/grog/Papers/Debug-tutorial/tutorial.pdf)