Using the 2.6.26 Linux Kernel Debugger (KGDB) with VMware

The Linux kernel documentation on KGDB wasn’t enough on its own for me to get
the newly built-in KGDB debugger working with kernel 2.6.26 or 2.6.27.
The breakthrough for me was reading [part of Jason Wessel’s
guide](http://www.kernel.org/pub/linux/kernel/people/jwessel/kgdb/ch03s03.html).

I have two machines:

* developer – where I run gdb
* target – where the kernel is being debugged, running in VMware

Configure VMware on the developer machine

* Power down the guest (target)
* Edit the VM guest settings
* Add a serial port
* Use named pipe `/tmp/com_1` (it’s really a UNIX domain socket)
* Configure it to “Yield CPU on poll” (under Advanced)
* Install ‘socat’, if not already installed

Configure and Compile the kernel on the developer or the target machine

* Get kernel 2.6.26 or newer
* `make menuconfig` # or `make gconfig`
* Under Kernel Hacking:
    * enable KGDB
    * enable the Magic SysRq key
    * enable “Compile the kernel with debug info”
* Build kernel: `make`
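
Before building, it can help to confirm that the menu choices above actually landed in `.config`. The function below is a minimal sketch of such a check. The symbol names (`CONFIG_KGDB`, `CONFIG_KGDB_SERIAL_CONSOLE`, `CONFIG_MAGIC_SYSRQ`, `CONFIG_DEBUG_INFO`) are my assumption of what those menu entries map to, and the function name `check_kgdb_config` is made up for illustration.

```shell
# Sketch: report whether the debugging options are enabled in a kernel .config.
# The CONFIG_* names are assumed to match the menu entries listed above.
check_kgdb_config() {
    config="${1:-.config}"
    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 2
    fi
    missing=0
    for opt in CONFIG_KGDB CONFIG_KGDB_SERIAL_CONSOLE CONFIG_MAGIC_SYSRQ CONFIG_DEBUG_INFO; do
        # Accept built-in (=y) or module (=m); "# ... is not set" lines do not match.
        if grep -q "^${opt}=[ym]" "$config"; then
            echo "ok: $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it from the top of the kernel tree as `check_kgdb_config .config`. A non-zero exit status means at least one option still needs to be switched on.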

Configure target

* Enable the Magic SysRq key on the target:
    * Edit `/etc/sysctl.conf` and set `kernel.sysrq = 1`
    * or run `sysctl -w kernel.sysrq=1` # this doesn’t survive a reboot
* Install the developer kernel
    * On the developer machine:
      `rsync -av --exclude .git ./ root@target.host.name:/mnt/work/linux-2.6.26`
    * On the target, a Red Hat-based system (modules first, so the initrd built by `make install` can find them):
      `make modules_install`
      `make install`
* Edit `/boot/grub/grub.conf` and set `timeout=15`
* Boot into the newly installed kernel
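
The `sysctl.conf` edit above can be scripted so that it is safe to run more than once. This is only a sketch: the function name `enable_sysrq_in` is hypothetical, and it assumes the file uses the usual `key = value` lines.

```shell
# Sketch: make sure a sysctl.conf-style file contains "kernel.sysrq = 1".
# An existing kernel.sysrq line is rewritten; otherwise one is appended.
enable_sysrq_in() {
    conf="$1"
    if grep -Eq '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf"; then
        # Rewrite via a temporary copy, then write back in place so the
        # original file keeps its owner and permissions.
        sed 's/^[[:space:]]*kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = 1/' "$conf" > "$conf.tmp" &&
            cat "$conf.tmp" > "$conf" &&
            rm -f "$conf.tmp"
    else
        printf 'kernel.sysrq = 1\n' >> "$conf"
    fi
}
```

On the target this would be called as `enable_sysrq_in /etc/sysctl.conf` (as root), followed by `sysctl -p` to load the new value without rebooting.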

Start debugging

* On target:
  `echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc`
* On developer (the last two commands are typed at the gdb prompt):
  `socat -d -d /tmp/com_1 PTY:` # note which pty is allocated; /dev/pts/1 in my case
  `gdb vmlinux`
  `set remotebaud 115200` # newer gdb releases call this `set serial baud 115200`
  `target remote /dev/pts/1`
* On target, do one of the following:
    * `echo "g" > /proc/sysrq-trigger`
    * Type ALT-SysRq-G
* Ready, get set, go! Go back to the developer machine and use gdb to set
  breakpoints, continue, etc.
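
Reading the pty name off the screen by eye gets tedious after a few sessions. With `-d -d`, socat logs a notice containing `PTY is /dev/pts/N` on stderr, so the name can be pulled out of a saved log. The helper below is a small sketch of that idea. The name `socat_pty` and the log file name in the usage note are made up for illustration.

```shell
# Sketch: print the first pty path mentioned in socat's "-d -d" log (read on stdin).
socat_pty() {
    sed -n 's|.*PTY is \(/dev/[^[:space:]]*\).*|\1|p' | head -n 1
}
```

One way to use it, assuming socat is started in the background with `socat -d -d /tmp/com_1 PTY: 2> socat.log &` and given a moment to start up, is `gdb -ex "target remote $(socat_pty < socat.log)" vmlinux`.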

I set up debugging because I wanted to understand how the kernel behaves
when loading a module. It turned out that the module load failed under the
debugger: sitting at a breakpoint delayed execution long enough that the load
had timed out by the time I stepped through the code. Using printk turned out
to work better.