Welcome to 2006 - I'm now on Twitter

After a long period of persistently avoiding social media, I have now moved (somewhat) into the Web 2.0 age and am now on Twitter, where I semi-regularly dispense 140 character packages of fun and frivolity.

So, if you like seeing pictures of cats dressed up as people, feel free to follow me - there's a button over on the side panel, or you can look me up by my Twitter name @SM_Bradshaw.


My Python gdb Extensions

Introduction

If you started to learn reverse engineering and exploit development on 32 bit Windows systems as I did, you were probably very unimpressed when you first attempted to try out your skills on *nix machines and started (trying to) use gdb.  I know I was.

Gdb is quite powerful, but it seems to be focused more on debugging applications with source and debug symbols.  While it's certainly possible to debug applications while only having access to the stripped binary, a lot of gdb's frequently used features aren't that useful in that situation.  Gdb wasn't designed with a focus on reverse engineering in mind, and neither were a lot of the various gdb GUI front ends.  Things that are simple in OllyDbg, such as getting an immediate view of the stack, disassembly and register values every time the program stops, or searching for a particular value through all of program memory, are just painful.

That's why the last time I had to reverse engineer an application on Mac OSX, I wrote some extensions for gdb using the Python API that was added to gdb in version 7.  These extensions consisted of a few new gdb commands, as well as some nifty hooks that enabled me to get the necessary information out of the program in a way that I was familiar with.

The additional commands I added provide new ways to view register values, program memory, the stack, to do assembly level stepping and to search program memory.  There are also hooks that allow certain program data to be sent to fifos every time program execution pauses.  In practice, this allows me to have output like this:

gdb lookin like Olly!


What you see above is a terminal window with multiple panes.  The main pane, on the bottom left, is running gdb, with a program currently in the paused state.  The other panes are running the tail command on different fifo pipes, and showing output from gdb relating to the program's disassembly, registers and stack, including some contextual information such as any strings or hex data that might be referenced by the various registers or stack entries.

The idea was to get the break output from gdb as close as possible to the information provided by OllyDbg without having to code up an entire GUI front end, and after having used the interface for a few weeks I think it's a reasonable facsimile. It's not possible to right click on anything as you can in OllyDbg, obviously, but most of that stuff can be done from the gdb command line.  This does however allow you to see at a glance most of the important stuff that Olly can show you when you're spending a while stepping through a program waiting for something interesting to happen.

Compatibility


This code has been tested on Mac OSX Mountain Lion and Linux, and works for both 32 and 64 bit processes.  I've tested this using Python 2.7 and the MacPorts version of gdb on OSX, and the Ubuntu 13.04 repository version of gdb on Linux, but other versions of gdb over 7 and Pythons over 2.5 SHOULD work too.  Probably. The Python gdb extensions seem to have 'evolved' a little bit since they were first introduced in gdb 7 – to the extent that some of the early examples of code you can find for creating gdb extensions in Python don't actually work any more.  Your version of gdb also needs to have been configured with the '--with-python' option, which I believe applies to most versions you will get in the various packaging systems around.

Most of the commands in this set of extensions should run on any system that supports the right version of gdb and Python, with the exception of the search commands (which require the use of some OS specific memory listing commands) and the register display commands (because these use hard coded register name values, which I have only provided for x86 and amd64 processors).   The stack printer also assumes a little endian byte order, so if the processor you are using is big endian this won't display right either.
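To show why the byte order matters for the stack view, here is a minimal sketch in plain Python (not the code from the extensions; the function name decode_stack_words and its parameters are made up for illustration). It decodes raw stack bytes into pointer sized words, and the same eight bytes give a very different value when read as big endian.

```python
import struct

def decode_stack_words(raw, word_size=8, little_endian=True):
    """Split raw stack bytes into integer words of word_size bytes each.

    Hypothetical helper for illustration only; the real extensions may
    decode the stack differently.
    """
    if word_size not in (4, 8):
        raise ValueError("word_size must be 4 or 8")
    order = "<" if little_endian else ">"      # struct byte-order prefix
    code = "Q" if word_size == 8 else "I"      # unsigned 64 or 32 bit word
    fmt = order + code
    last = len(raw) - word_size + 1            # ignore a trailing partial word
    return [struct.unpack_from(fmt, raw, off)[0]
            for off in range(0, last, word_size)]

# The same 8 bytes, read with both byte orders.
raw = b"\x0c\x0e\x00\x00\x01\x00\x00\x00"
print(hex(decode_stack_words(raw)[0]))
print(hex(decode_stack_words(raw, little_endian=False)[0]))
```

Read as little endian, those bytes are the address 0x100000e0c; read as big endian they become a meaningless value, which is why the stack view would look wrong on a big endian processor.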

Note: If you're running this on OSX, please note that the Xcode Command Line tools version of gdb WILL NOT work; it's too old and doesn't support the Python GDB extensions.  Install MacPorts and their version of gdb.

Supported Commands


Here's a list of the commands included:
  • save breakpoints - save breakpoints to a file.  These breakpoints can then be read back into a new session using the built in gdb 'source' command.  This is essentially a fixed version of a broken bit of example code I found on the 'Net.
  • readhexmemory - prints in hex format a given number of bytes from a given address in memory.
  • readhexdumpmemory - prints out a given number of lines (16 bytes per line) of a hex dump of memory starting at a given address. Each line contains the hex value for each byte as well as its ASCII representation, in a similar fashion as you would see in a Hex editor.
  • readstring - prints out any string located at a given address in memory.  Works with ASCII and basic Unicode strings.
  • searchstring - searches allocated program memory for a given string, and prints out all discovered instances along with their associated memory addresses.  Works for ASCII and simple Unicode strings.  Only Linux and Mac OSX are supported.  Warnings may appear when certain sections of memory cannot be searched – these are nothing to worry about.
  • searchbinary - searches allocated program memory for binary data, provided in hex format, e.g. 414243ae. Only Linux and Mac OSX are supported. Warnings may appear when certain sections of memory cannot be searched – these are nothing to worry about.
  • printstack - prints out a given number of lines of the stack view, including the address, the value stored in memory at that address and, if the stack entry could be interpreted as a pointer, any data (string if possible, hex if not) stored at that address.  If the number of lines is omitted, a value of 10 is assumed.
  • printregisters – prints out the registers, along with contextual information of the data located at the memory address stored by each register.
  • printdisassembly – prints out the disassembly for the code located at the instruction pointer register.
  • fifodisplay – registers a number of handlers to automatically run the printstack, printregisters and printdisassembly commands each time the program pauses, and to send the output to fifo pipes created in the current working directory named stack, registers and disassembly.  Combined with a multiple paned terminal window, this can give you contextual information similar to OllyDbg. Use the Stop parameter to disable the handlers.
  • no – Does an assembly level step operation that will step over “call” function calls and won't break on new threads.
  • ns – Does an assembly level step operation that will step into “call” operations and won't break on new threads.
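
As an illustration of the kind of output readhexdumpmemory produces (16 bytes per line, hex values followed by their ASCII representation), here is a minimal sketch of a hex dump formatter in plain Python. This is not the code from the extensions; the function name hexdump_lines, its parameters and the exact column layout are assumptions made for the example.

```python
def hexdump_lines(data, base_address=0, width=16):
    """Format bytes as hex dump lines: address, hex bytes, ASCII column.

    Hypothetical helper for illustration; non-printable bytes are shown
    as '.' in the ASCII column, as most hex editors do.
    """
    data = bytearray(data)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("{:02x}".format(b) for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows.
        lines.append("{:#018x}  {:<{w}}  {}".format(
            base_address + offset, hex_part, ascii_part, w=width * 3 - 1))
    return lines

for line in hexdump_lines(b"/bin/ls\x00ABC\xae", base_address=0x100004900):
    print(line)
```

Inside gdb, the bytes would come from the debugged process's memory rather than from a literal as they do here.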

Usage


You can load these extensions by copying the Python file somewhere where your instance of gdb can see it and using the gdb source command to read it.  You should ideally load the extensions after attaching your file in gdb for debugging, because the load process runs a function to determine the architecture of the file being debugged, and future commands may fail if this is not successful.  You'll get an error message if there are any problems with this setup function, or a success message if everything is OK.  If you get an error message just load the file again after fixing the problem.

Once the file is loaded correctly the commands listed above can be run like any other gdb command, including support for tab command completion and support for the inbuilt gdb help system.
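
For anyone curious how a Python file can add commands that plug into gdb's help system, here is a minimal sketch of the general pattern in gdb's Python API. It is not the code from the extensions themselves: the command name hexpeek and the helper format_hex are made up for illustration, and the command is only registered when the file is actually loaded inside gdb.

```python
try:
    import gdb  # only importable inside gdb's embedded Python interpreter
except ImportError:
    gdb = None  # lets the formatting helper be tried outside gdb

def format_hex(data):
    """Return the bytes in data as space separated two digit hex values."""
    return " ".join("{:02x}".format(b) for b in bytearray(data))

if gdb is not None:
    class HexPeek(gdb.Command):
        """Print COUNT bytes of memory at ADDRESS as hex.

        Usage: hexpeek ADDRESS COUNT
        """
        # gdb shows the class docstring above as the command's help text.

        def __init__(self):
            super(HexPeek, self).__init__("hexpeek", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            argv = gdb.string_to_argv(arg)
            if len(argv) != 2:
                raise gdb.GdbError("usage: hexpeek ADDRESS COUNT")
            # parse_and_eval accepts expressions such as $rsp or 0x1000.
            address = int(gdb.parse_and_eval(argv[0]))
            count = int(gdb.parse_and_eval(argv[1]))
            memory = gdb.selected_inferior().read_memory(address, count)
            gdb.write(format_hex(memory) + "\n")

    HexPeek()  # instantiating the class registers the command with gdb
```

Loaded with 'source', a file like this would make 'hexpeek $rsp 16' and 'help hexpeek' available in the session, assuming a gdb build with Python support.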

The fifodisplay command, which was responsible for the screenshot shown above, is a little more fiddly to set up however, so I will give a brief example of how to do so below.

UPDATE: The usage of fifodisplay has changed a little since the 1.10 update to the extensions.  Instead of requiring you to run the tail commands to read the fifos the first time the program pauses, running fifodisplay will now immediately prompt you to set up each fifo listener individually.  The fifo files are also now stored in /tmp/ and are cleaned up by the extensions if you run 'fifodisplay stop' or if you exit gdb or stop debugging the target.

To summarise, this is what you need to do to set it up:

Have a terminal program that supports multiple panes.  I'm using iTerm2 in the screenshot shown above for the Mac, and on Linux I use Terminator (the Gnome version).  Set up four different panes, all with the same present working directory, and run gdb in the bottom left pane.

Attach to the program you want to run in gdb, and load the extensions file into gdb using 'source' and run the program to a breakpoint.  Now run the fifodisplay command and make the program trigger a stop event in gdb.  The simplest way to do this is to step ahead by one assembly instruction ('stepi') or continue execution until you hit another breakpoint you have configured.

Now, in order, do a tail -f on the 'stack' file in the bottom right window, the 'registers' file in the top right window, and the 'disassembly' file in the top left window.  What you are doing here is displaying the content from fifo pipe files with these names that are created in the present working directory by the gdb event handlers created by the fifodisplay command. These files are created in this order, the first time the gdb instance stops after the fifodisplay command is run.  Gdb will appear to freeze up while it waits for you to run each of these tail commands, because it's waiting for each of the fifo pipes to be read from before command execution can continue.  This will eventually time out if you take too long however – if this happens just 'stepi' once more and you can continue on from where you left off opening the fifo files.
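
The apparent freeze is standard fifo behaviour rather than anything specific to gdb: opening a fifo for writing blocks until something opens it for reading. Here is a minimal sketch in plain Python that demonstrates this on Linux or OSX. The function name demo_fifo_blocking, the 0.2 second delay and the text written are all made up for illustration, and a thread stands in for the tail command.

```python
import os
import tempfile
import threading
import time

def demo_fifo_blocking(reader_delay=0.2):
    """Show that a fifo writer blocks until a reader opens the pipe.

    Returns (seconds the writer waited in open(), text the reader got).
    """
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "stack")  # same name the handlers use
    os.mkfifo(path)
    received = []

    def reader():
        time.sleep(reader_delay)           # the pause before 'tail' starts
        with open(path) as fifo:
            received.append(fifo.read())   # reads until the writer closes

    thread = threading.Thread(target=reader)
    start = time.time()
    thread.start()
    with open(path, "w") as fifo:          # blocks here until reader opens
        waited = time.time() - start
        fifo.write("example stack line\n")
    thread.join()
    os.remove(path)
    os.rmdir(workdir)
    return waited, received[0]

waited, text = demo_fifo_blocking()
print("writer waited about %.2f seconds" % waited)
print(text, end="")
```

The writer cannot get past its open() call until the reader thread wakes up, which is the same wait you see in gdb until each tail command is started.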

When you want to disconnect the handlers, kill each of your tail commands and either run 'fifodisplay Stop' (case sensitive) in gdb, or quit the gdb session.  The fifo pipe files will not be deleted automatically, so on subsequent attempts you'll just be able to start your listening tail commands whenever you wish.  When you want to get rid of the fifo files, just delete them like any other file.

Usage Example


Here's a quick example of me using the fifodisplay command during a gdb session debugging /bin/ls.

First I set up a new terminal session with four panes. Then, in the bottom left hand pane I run gdb. I'm using the MacPorts version of gdb (the binary for this is called ggdb).
stephenbradshaw@lion:~$ ggdb /bin/ls
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin12.3.0".
For bug reporting instructions, please see:
...
Reading symbols from /bin/ls...(no debugging symbols found)...done.
(gdb)


Within this running session of gdb, I then run commands to find the entry point of the program and set a breakpoint there
(gdb) info file
Symbols from "/bin/ls".
Local exec file:
`/bin/ls', file type mach-o-x86-64.
Entry point: 0x100000e0c
0x0000000100000aa8 - 0x0000000100003e7b is .text
0x0000000100003e7c - 0x000000010000403e is __TEXT.__stubs
0x0000000100004040 - 0x000000010000433e is __TEXT.__stub_helper
0x0000000100004340 - 0x00000001000044c8 is .const
0x00000001000044c8 - 0x0000000100004900 is .cstring
0x0000000100004900 - 0x0000000100004998 is __TEXT.__unwind_info
0x0000000100004998 - 0x0000000100004ff8 is .eh_frame
0x0000000100005000 - 0x0000000100005028 is __DATA.__got
0x0000000100005028 - 0x0000000100005038 is __DATA.__nl_symbol_ptr
0x0000000100005038 - 0x0000000100005290 is __DATA.__la_symbol_ptr
0x0000000100005290 - 0x00000001000052b8 is .data
0x00000001000052c0 - 0x00000001000054f0 is .const_data
0x00000001000054f0 - 0x000000010000557c is __DATA.__common
0x0000000100005580 - 0x0000000100005640 is .bss
(gdb) break *0x100000e0c
Breakpoint 1 at 0x100000e0c


Now I run the program until it hits the first breakpoint:
(gdb) r
Starting program: /bin/ls

Breakpoint 1, 0x0000000100000e0c in _mh_execute_header ()


Now I load the extensions and run the fifodisplay command:
(gdb) source sb-gdb-extensions.py
gdb extensions 1.00 loaded
(gdb) fifodisplay


Now I step ahead by one assembly instruction. Execution will now seem to freeze in the debugger as it waits for each fifo pipe to be read from:
(gdb) stepi

At this point I ran the following commands, in this order, in the other panes of my terminal window:

Bottom right pane:
stephenbradshaw@lion:~$ tail -f stack


Top right pane:
stephenbradshaw@lion:~$ tail -f registers


Top left pane:
stephenbradshaw@lion:~$ tail -f disassembly


Now in the bottom left pane you should see the stepi command finish in gdb.
0x0000000100000e0d in _mh_execute_header ()
(gdb)


Now you should be able to continue your debugging session as normal, and every time program execution stops in gdb, the stack, register and disassembly information will be updated in each of the associated terminal panes.

Download

The latest version of the extensions can be found here.