Using the 2.6.26 Linux Kernel Debugger (KGDB) with VMware

Tuesday, November 18th, 2008

Reading the Linux kernel documentation on KGDB wasn’t enough for me to get the newly built-in KGDB kernel debugger working with version 2.6.26 or 2.6.27. The breakthrough was reading part of Jason Wessel’s guide.

I have two machines:

  • developer – where I run gdb
  • target – where the kernel is being debugged, running in VMware

Configure VMware on the developer machine

  • Power down the guest (target)
  • Edit the VM guest settings
  • Add a serial port
    • Use named pipe /tmp/com_1 (it’s really a UNIX domain socket)
    • Configure it to “Yield CPU on poll” (under Advanced)
  • Install ‘socat’, if not already installed
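If you would rather script the serial port than click through the GUI, something like the following might work. It is only a sketch: the serial0.* key names are my assumption about what the VMware GUI writes into the guest's .vmx file, and the add_serial_pipe helper is a made-up name, so compare against a .vmx that the GUI produced before trusting it.

```shell
#!/bin/sh
# Append a named-pipe serial port definition to a VMware .vmx file.
# Assumption: these serial0.* keys mirror what the GUI writes for
# "named pipe" plus "Yield CPU on poll"; verify against your own .vmx.
add_serial_pipe() {
    vmx_file=$1    # path to the guest's .vmx file (guest must be powered off)
    pipe_path=$2   # e.g. /tmp/com_1

    # Refuse to add a second serial0 definition.
    if grep -q '^serial0\.present' "$vmx_file" 2>/dev/null; then
        echo "serial0 already configured in $vmx_file" >&2
        return 1
    fi

    cat >> "$vmx_file" <<EOF
serial0.present = "TRUE"
serial0.fileType = "pipe"
serial0.fileName = "$pipe_path"
serial0.pipe.endPoint = "server"
serial0.yieldOnMsrRead = "TRUE"
EOF
}

# Example (hypothetical path):
#   add_serial_pipe "$HOME/vmware/target/target.vmx" /tmp/com_1
```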

Configure and Compile the kernel on the developer or the target machine

  • Get kernel 2.6.26 or newer
  • make menuconfig # or make gconfig
  • Under Kernel Hacking:
    • enable KGDB
    • enable the Magic SysRq key
    • enable “Compile the kernel with debug info”
  • Build kernel: make
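Before kicking off a long build it can be worth checking that the options actually landed in .config. Here is a rough sketch; the CONFIG_* names are what I believe the menu entries above map to in 2.6.26-era kernels (KGDB_SERIAL_CONSOLE being the kgdboc driver), and check_kgdb_config is just a name I picked.

```shell
#!/bin/sh
# Report which of the options used in this post are missing from a kernel .config.
# Assumption: these CONFIG_* symbols correspond to the Kernel Hacking entries
# listed above; names may differ in other kernel versions.
check_kgdb_config() {
    config_file=${1:-.config}
    missing=0
    for opt in CONFIG_KGDB CONFIG_KGDB_SERIAL_CONSOLE CONFIG_MAGIC_SYSRQ CONFIG_DEBUG_INFO; do
        # Accept built-in (y) or module (m).
        if grep -q "^${opt}=[ym]$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "MISSING  $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example, run from the top of the kernel tree:
#   check_kgdb_config .config && make
```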

Configure target

  • Enable Magic SysRq key on target:
    • Edit /etc/sysctl.conf and set kernel.sysrq = 1
    • or run sysctl -w kernel.sysrq=1 # this doesn’t survive a reboot
  • Install developer kernel
    • On the developer machine: rsync -av --exclude .git ./ root@target.host.name:/mnt/work/linux-2.6.26
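    • On the target (a Red Hat based system), in /mnt/work/linux-2.6.26: make modules_install && make install # modules first, so the initrd built by make install can include them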
  • Edit /boot/grub/grub.conf and set timeout=15
  • Boot into the newly installed kernel
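The sysctl.conf edit can be made repeatable with a small helper, sketched below. The set_sysrq name is hypothetical, and the script assumes the file uses the usual "key = value" lines.

```shell
#!/bin/sh
# Ensure "kernel.sysrq = 1" is present exactly once in a sysctl.conf-style file.
set_sysrq() {
    conf=${1:-/etc/sysctl.conf}
    if grep -q '^[[:space:]]*kernel\.sysrq[[:space:]]*=' "$conf"; then
        # Rewrite the existing setting via a temp file, keeping the original inode.
        tmp=$(mktemp)
        sed 's/^[[:space:]]*kernel\.sysrq[[:space:]]*=.*/kernel.sysrq = 1/' "$conf" > "$tmp"
        cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        # No existing setting: append one.
        echo 'kernel.sysrq = 1' >> "$conf"
    fi
}

# Example (as root on the target):
#   set_sysrq /etc/sysctl.conf && sysctl -p
```

Running sysctl -p afterwards reloads the file, so the setting takes effect without a reboot.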

Start debugging

  • On target: echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
  • On developer:
    • socat -d -d /tmp/com_1 PTY: # note which pty is allocated (/dev/pts/1 in my case)
    • gdb vmlinux
    • (gdb) set remotebaud 115200
    • (gdb) target remote /dev/pts/1
  • On target, do one of the following:
    • echo "g" > /proc/sysrq-trigger
    • Type ALT-SysRq-G
  • Ready, get set, go! Go back to developer machine and use gdb to set breakpoints, continue, etc.
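The developer-side steps can be glued together so the pty does not have to be copied by hand. This is a sketch under a couple of assumptions: that socat announces the pty on stderr with a notice line containing "PTY is /dev/pts/N", and that your gdb still accepts set remotebaud (newer gdb releases spell it set serial baud). The function names and the /tmp paths in the example are made up.

```shell
#!/bin/sh
# Extract the pty path from socat's "-d -d" log output (read on stdin).
# Assumption: socat logs a notice line containing "PTY is /dev/pts/N".
pty_from_socat_log() {
    sed -n 's|.* PTY is \(/dev/[^ ]*\).*|\1|p' | head -n 1
}

# Print the gdb commands from the steps above for a given pty.
gdb_commands() {
    printf 'set remotebaud 115200\ntarget remote %s\n' "$1"
}

# Example session on the developer machine (log and script paths are hypothetical):
#   socat -d -d /tmp/com_1 PTY: 2> /tmp/socat.log &
#   sleep 1
#   pty=$(pty_from_socat_log < /tmp/socat.log)
#   gdb_commands "$pty" > /tmp/kgdb.gdb
#   gdb -x /tmp/kgdb.gdb vmlinux
```

gdb will sit at "target remote" until the target drops into KGDB, so the SysRq-G step on the target still has to happen.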

I set up debugging because I wanted to understand the kernel’s behavior when loading a module. It turned out that the module load failed under the debugger: the time I spent stepping through the code delayed execution long enough to trigger a timeout in the load. Using printk worked better.