# Introduction
The purpose of this document is to introduce the reader to vkernel debugging.
The vkernel architecture allows us to run DragonFly kernels in userland. These virtual
kernels can be panicked or otherwise abused without affecting the host operating system.

To make things a bit more interesting, we will use a real-life example.

# Once upon a time
... I wrote a simple program that used the AIO interface. As it turned out, we don't support
this feature, but at that point I didn't know.

    [beket@sadness ~]$ gcc t_aio.c -o t_aio -Wall -ansi -pedantic
    [beket@sadness ~]$ ./t_aio 
    aio_read: Function not implemented
    [beket@sadness ~]$ 
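
The source of t_aio.c is not shown here, so the following is only a hypothetical sketch of the kind of call that can produce this message. The helper name `aio_read_once` is made up for illustration. It queues a single POSIX `aio_read()` request and returns the `errno` value on failure; on a kernel without AIO support that value would be `ENOSYS`, whose message text is "Function not implemented".

```c
#include <sys/types.h>
#include <aio.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper, not the original t_aio.c.
 * Queue one asynchronous read of up to len bytes from offset 0 of fd
 * and wait for it to finish.
 * Returns 0 and stores the byte count in *nread on success,
 * otherwise returns the errno value reported by the AIO interface.
 */
static int
aio_read_once(int fd, void *buf, size_t len, ssize_t *nread)
{
	struct aiocb cb;
	const struct aiocb *list[1];
	int error;

	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_offset = 0;

	/* On a system without AIO support this fails with ENOSYS. */
	if (aio_read(&cb) == -1)
		return errno;

	/* Block until the request is no longer in progress. */
	list[0] = &cb;
	while ((error = aio_error(&cb)) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);	/* may return early on EINTR */

	if (error != 0)
		return error;

	*nread = aio_return(&cb);
	return 0;
}
```

A caller that assigns a non-zero return value to `errno` and then calls `perror("aio_read")` would print a line in the same format as the output above.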

Running ktrace on the process to see with my own eyes what was going on seemed like a good idea.
Here comes the fun: I misread the ktrace(1) man page and typed:

    [beket@sadness ~]$ ktrace -c ./t_aio

And the system hung.

(My intention was to trace the system calls of t_aio, but what I typed would actually disable all traces from all processes to the t_aio file.)

# Set up a vkernel
To set up a vkernel, please consult this [man page](http://leaf.dragonflybsd.org/cgi/web-man?command=vkernel&section=ANY).
It's very straightforward.

# Reproduce the problem
We boot into our vkernel:

    # cd /var/kernel
    # ./boot/kernel -m 64m -r rootimg.01 -I auto:bridge0
    [...]
    login: root
    #

And then try to reproduce the system freeze:

    # ktrace -c ./t_aio

    Fatal trap 12: page fault while in kernel mode
    mp_lock = 00000001; cpuid = 1
    fault virtual address   = 0x0
    fault code              = supervisor read, page not present
    instruction pointer     = 0x1f:0x80aca52
    stack pointer           = 0x10:0x5709d914
    frame pointer           = 0x10:0x5709dbe0
    processor eflags        = interrupt enabled, resume, IOPL = 0
    current process         = 692 (ktrace)
    current thread          = pri 6 
     <- SMP: XXX
    kernel: type 12 trap, code=4
    
    CPU1 stopping CPUs: 0x00000001
     stopped
    Stopped at      0x80aca52:      movl    0(%eax),%eax
    db> 

This db> prompt is from ddb(4), the interactive kernel debugger.
The

    fault virtual address   = 0x0

field is indicative of a NULL pointer dereference inside the kernel.
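
Why address 0x0 exactly? The faulting instruction shown by ddb, `movl 0(%eax),%eax`, reads memory at the address in `%eax` plus a displacement of 0. When a structure member is read through a pointer, the address touched is the pointer value plus the member's offset, so a NULL pointer combined with a member at offset 0 faults at 0x0. The sketch below illustrates only that arithmetic; `struct example_node` and `member_address` are hypothetical names and do not come from the kernel sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, used only to illustrate the arithmetic. */
struct example_node {
	void	*first;		/* offset 0 */
	int	 second;	/* offset of at least sizeof(void *) */
};

/*
 * Address that is touched when a member located member_offset bytes
 * into a structure is read through the pointer value base.
 */
static uintptr_t
member_address(uintptr_t base, size_t member_offset)
{
	return base + member_offset;
}
```

With `base` equal to 0, `member_address(0, offsetof(struct example_node, first))` is 0, while the same call for `second` gives a small non-zero address. A small non-zero fault address is therefore also a common sign of a NULL structure pointer.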

Let's get a trace of what went wrong:

    db> trace
    ktrdestroy(57082700,5709dc5c,0,57082700,5709dca0) at 0x80aca52
    allproc_scan(80aca14,5709dc5c,be,2,0) at 0x80b2e91
    sys_ktrace(5709dca0,6,0,0,57082700) at 0x80acffe
    syscall2(5709dd40,6,57082700,0,0) at 0x8214b6d
    user_trap(5709dd40,570940e8,8214185,0,8215462) at 0x8214d9c
    go_user(5709dd38,0,0,7b,0) at 0x82151ac
    db> 

# Gdb
Quoting from vkernel(7):

It is possible to directly gdb the virtual kernel's process.  It is recommended that you do a `handle SIGSEGV noprint` to ignore page faults processed by the virtual kernel itself and `handle SIGUSR1 noprint` to ignore signals used for simulating inter-processor interrupts (SMP build only).

You can add these two commands to your ~/.gdbinit to save yourself from typing them again and again.

    [beket@sadness ~]$ cat ~/.gdbinit
    handle SIGSEGV noprint
    handle SIGUSR1 noprint

So we are going to attach to the vkernel process:

    # ps aux | grep kernel
    root  25408  0.0  2.3 1053376 17772  p0  IL+   8:32PM   0:06.51 ./boot/kernel -m 64m -r rootimg.01 -I auto:bridge0
    # gdb kernel 25408
    GNU gdb 6.7.1
    [...]

Let's get a trace from inside gdb:

    (gdb) bt
    #0  0x282d4c10 in sigsuspend () from /usr/lib/libc.so.6
    #1  0x28287eb2 in sigsuspend () from /usr/lib/libthread_xu.so.2
    #2  0x0821530a in stopsig (nada=24, info=0x40407d2c, ctxp=0x40407a4c) at /usr/src/sys/platform/vkernel/i386/exception.c:112
    #3  <signal handler called>
    #4  0x282d4690 in umtx_sleep () from /usr/lib/libc.so.6
    #5  0x08213bde in cpu_idle () at /usr/src/sys/platform/vkernel/i386/cpu_regs.c:722
    #6  0x00000000 in ?? ()
    (gdb) 

Why does it differ from ddb's trace?
Well, when the vkernel is sitting at a db> prompt, all vkernel threads representing virtual CPUs, except the one handling the db> prompt itself, will be suspended in stopsig(). The backtrace only sees one of the N threads.

We need to do better this time. Let's break into the kernel _before_ it crashes. sys_ktrace() seems like a good candidate.

    # gdb kernel 25532
    GNU gdb 6.7.1
    [...]
    (gdb) break sys_ktrace
    Breakpoint 1 at 0x80acf43: file ./machine/thread.h, line 83.
    (gdb) 

Next we type 'c' at the gdb prompt to resume vkernel execution:

    (gdb) c
    Continuing.

Now we go to our vkernel and type the offending command:

    # ktrace -c ./t_aio

Gdb stops the execution of the vkernel and a message pops up in the gdb buffer:

    Breakpoint 1, sys_ktrace (uap=0x573e2ca0) at ./machine/thread.h:83
    83          __asm ("movl %%fs:globaldata,%0" : "=r" (gd) : "m"(__mycpu__dummy));
    (gdb) 

We navigate through the source code with the 'step' and 'next' gdb commands. They behave identically, except that 'step' descends into function calls while 'next' steps over them. When we meet this call:

    276                     allproc_scan(ktrace_clear_callback, &info);

we 'step' inside it. allproc_scan() iterates through the process list and applies ktrace_clear_callback() to each process.
Later we see this:

    347                     if (p->p_tracenode->kn_vp == info->tracenode->kn_vp) {

Here p is a pointer to the current process:

    (gdb) print p
    $1 = (struct proc *) 0x57098c00

Let's see if this process is traced:

    (gdb) print p->p_tracenode
    $2 = (struct ktrace_node *) 0x0
    (gdb) 

Oops. This process has no trace node, so p->p_tracenode is NULL. The code will try to read p->p_tracenode->kn_vp through that NULL pointer and crash. This is the zero fault virtual address we saw before.
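
To make the failure mode concrete, here is a minimal userland model of the scan and its callback, with a NULL check added before the comparison. It is a sketch under stated assumptions, not the actual DragonFly code or the fix that was committed. The structures are reduced to the fields seen in the gdb session, and `struct clear_info`, `clear_callback` and `scan_procs` are hypothetical names standing in for the real info structure, ktrace_clear_callback() and allproc_scan().

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields used here. */
struct vnode { int v_id; };
struct ktrace_node { struct vnode *kn_vp; };
struct proc { struct ktrace_node *p_tracenode; };

/* Hypothetical argument block handed to the callback. */
struct clear_info {
	struct ktrace_node *tracenode;	/* trace target we want to clear */
	int matched;			/* number of processes cleared */
};

/*
 * Guarded version of the comparison seen at line 347: a process that
 * is not being traced has p_tracenode == NULL and must be skipped
 * before kn_vp is read.
 */
static int
clear_callback(struct proc *p, void *data)
{
	struct clear_info *info = data;

	if (p->p_tracenode == NULL)
		return 0;			/* not traced: nothing to clear */
	if (p->p_tracenode->kn_vp == info->tracenode->kn_vp) {
		p->p_tracenode = NULL;		/* stand-in for dropping the trace */
		info->matched++;
	}
	return 0;
}

/* Toy replacement for allproc_scan(): walks an array, not the real process list. */
static void
scan_procs(struct proc *procs, size_t n,
    int (*callback)(struct proc *, void *), void *data)
{
	size_t i;

	for (i = 0; i < n; i++)
		callback(&procs[i], data);
}
```

Without the first `if`, the very first untraced process in the array would trigger the same NULL dereference that ddb reported.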