Insomni'hack 2019: phpain

I participated in the Insomni'hack CTF 2019 with some colleagues. One of the challenges that we solved was the phpain challenge. I'd like to describe it here and explain how I solved it. This was one of the easiest challenges, and most teams solved it.
We were given an IP address and the source code of the page. With this information you can try to solve it yourself.
I downloaded the source code, which was PHP, and found this obfuscated code when I opened it in Notepad:
View of source code in Notepad
So as you can see, this is nicely obfuscated code. As I was working on Windows, I suspected that the line endings were not being handled correctly, so I opened it in Visual Studio:
Opening the obfuscated file in Visual Studio
My first idea was to manually de-obfuscate this. But given the size of the file, that idea was quickly discarded. Also, I wanted to try out how this works, because I was not used to PHP. The first statement probably takes the uninitialized variable $_ and adds one, so it probably has the value of 1. I wanted to try this out, so I used an online PHP sandbox tool:
one sandbox tool
But unfortunately this just resulted in an error message "undefined variable" and no result. I tried another tool:
another sandbox tool
but unfortunately the same there. It looks like I cannot configure this to ignore such warnings. On the webserver this can be configured though. For this write-up, I looked again and found another tool that works:
Working PHP sandbox tool
But as I didn't find this during the CTF, I continued by using a PHP installation that I already had on my machine from an earlier challenge.
So I used this sample file:
sample file
and ran it from the command line:
running the sample file
I did get various warnings that I didn't know how to turn off, but I did get the correct result. So that means I could start playing around with the code.
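The sample file is only shown as a screenshot, so here is a minimal sketch of such an experiment. It assumes that the first statement is a plain increment of the never-assigned variable `$_`, and the `error_reporting(0)` call is my own addition to hide the warnings mentioned above:

```php
<?php
// Assumption: hide notices and warnings for this experiment only.
error_reporting(0);

// $_ has never been assigned; PHP treats it as null, and incrementing null yields 1.
$_++;

var_dump($_); // prints int(1)
```

The same setting can probably also be passed on the command line with `php -d error_reporting=0 sample.php`, where `sample.php` is a placeholder file name.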


Looking at the code a little closer, I found near the middle of the file the following:
separation; two files
So it seems that there are actually two concatenated PHP files. The first one defines these strange underscore variables and the second contains the actual logic. So it would be best to find out what these variables contain before going into the second part. I added a print (echo) statement for the assignment:
middle deobfuscated
This printed "flag" onto the console, so I knew "flag" was in the variable "$_____". I replaced this with static code. This also makes it clear that an input parameter is fetched with GET. I renamed this to $_inputflag in order to play around with it without having to pass data.


The next part was a comparison:
comparison 0
So it checks the first character of the given input. If the check succeeds, it does a lot of things and increments a counter variable by 10, otherwise it decrements it by 10.


Afterwards there's another block of code that does pretty much the same, but for the second character of the given input:
comparison 1
So that's probably how the input is validated! So I wrote a quick loop (using Google to find the exact PHP statements) and got something like this:
loop for character 0
This worked and gave me the first character as 'N'. I tried a few more but I quickly found that this is not very efficient, because this continued for 87 characters. Here's the last block:
comparison 86
So I tried to automate this, and made many mistakes along the way. My first idea was the following:
- set $_inputflag = "....." (like in screenshot "loop for character 0" above)
- for each character in $_inputflag
  - for each ASCII character
    - run through all 87 comparison blocks


At the end, use the character that yields the highest value of the counter variable, which is increased or decreased depending on the comparison result. As this loops through all 87 characters, it should return the correct input at the end. Unfortunately this did not work and just returned gibberish.


After these 87 comparison blocks, I noticed that there was a comparison of this counter value against 870:
comparison of counter value
I didn't get 870 though, only 850. Debugging this a bit further, I found that for one block I investigated, exactly one specific character gave -10 (false) and all others gave +10 (true). So I concluded that in some cases the true and false branches of the if/else were swapped, that probably more than 87 input characters are used, and that in some cases exactly the character that results in -10 is the correct one.
So I wrote some code to count the character statistics per block. Say we first try 'a' and get a total score of 550; I store this with a weight of 1. If the next round (trying 'b') also returns a total score of 550, I set the weight of 550 to 2, and so on. I only stored the first two weights. For a normal block I would get something like 1 character with a total score of 550 and 254 characters with a total score of 530. Then I took the character with weight 1 (no matter whether it was the true or the false case) and assumed it was the correct one.
Unfortunately this also didn't work. I only got a total score of 760 or something, far less than the required 870. So I concluded that the number of blocks is indeed 87 and that we have to reach 870 as the total score, so the first approach was basically correct, but something was missing.
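The weight counting described above can be sketched roughly as follows. This is not the code from the screenshots: `totalScore()` is a hypothetical stand-in for one run over all comparison blocks, and the scores 530 and 550 as well as the character 'x' are made up for illustration.

```php
<?php
// Hypothetical stand-in: total score after all blocks when the tested
// position holds $char. Exactly one character scores differently.
function totalScore(string $char): int
{
    $base = 530; // made-up score shared by all other characters
    return $char === 'x' ? $base + 20 : $base;
}

// Total score for every possible byte value at the tested position.
$scores = [];
for ($code = 0; $code < 256; $code++) {
    $scores[$code] = totalScore(chr($code));
}

// Weight = number of characters that produced each total score.
$weights = array_count_values($scores);

// Keep the characters whose total score occurred exactly once.
$outliers = [];
foreach ($scores as $code => $score) {
    if ($weights[$score] === 1) {
        $outliers[] = chr($code);
    }
}

print_r($outliers); // contains only 'x'
```

Note that this only finds the odd one out. It cannot tell whether that character is the one that passes the check or the one that fails it.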
I then checked whether I found exactly one matching character per block. If yes, I took it as the solution character; if not, I set a question mark. This worked, but resulted in a total of only 850 (86 blocks at +10 and one at -10), not matching the 870 comparison at the end. So I checked which block didn't match and found one (block 42). Investigating this manually showed that exactly two characters match there, not one. I printed out all 256 characters with a yes/no and found the two matching characters: 82 and 83, i.e. 'R' and 'S'.
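As a rough illustration of this "exactly one match" rule, here is a small sketch. In the real file the blocks are inline obfuscated code and not callables, so the `$blocks` array below is a hypothetical model with three made-up checks, the last one ambiguous in the same way as block 42:

```php
<?php
// Hypothetical model of the comparison blocks: each closure returns true
// when the character at its position passes the check.
$blocks = [
    function (string $c): bool { return $c === 'A'; },
    function (string $c): bool { return $c === 'B'; },
    function (string $c): bool { return $c === 'R' || $c === 'S'; }, // ambiguous
];

$recovered = '';
foreach ($blocks as $block) {
    // Collect every byte value that passes this block.
    $matches = [];
    for ($code = 0; $code < 256; $code++) {
        if ($block(chr($code))) {
            $matches[] = chr($code);
        }
    }
    // Accept the character only if it is the single match, else mark it unknown.
    $recovered .= count($matches) === 1 ? $matches[0] : '?';
}

echo $recovered, "\n"; // prints AB?
```

An ambiguous position like the third one has to be settled by hand, which is what I did for block 42.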
So this is the final code at the top before the 87 blocks:
final code, top
There is an additional loop that calculates just the first character; ignore that one.
And this is the code at the end after the 87 comparison blocks:
final code, bottom
There is a big block that checks some statistics for block 42 and finally sets character 83. It then prints the found character and at the end prints the solution. Additionally I added the score comparison:
final code, score comparison
When we run this, we now get:
final code, run
So we get the partial input flags:
INS{?0_t???_?l?cKl1s??HerE?1?_?????u?H_?u?Rs1N?_???h??-C?3?l????-???a????f0r_??be?r?:o?
INS{?0_t???_?l?cKl1s??HerE?1?_?????u?H_?u?Ss1N?_???h??-C?3?l????-???a????f0r_??be?r?:o?

(depending on whether we use 'R' or 'S' in block 42).
As there is some output that says "submit it on the challenge's web endpoint to get the real flag", we submit this value to the given IP address as a GET parameter ("?flag=I??{…" in the URL) and then get the complete flag (this works with both of the above):
INS{G0_t3lL_Bl4cKl1s_tHerE_1s_t00_muCH_gu3Ss1NG_iN_h1S-Ch3llS.In-xchaNg3_f0r_a_be3r?:o}
"Go tell Blacklis there is too much guessing in his chells in exchange for a beer."
Not sure what that means exactly (maybe Bl4cKl1s was the author), but that was the solution.
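For the submission step, the partial flag contains characters such as '{' and '?' that are better URL-encoded. A minimal sketch of building the request URL is below. The host is a placeholder, and the `http` scheme and root path are assumptions, as the write-up only mentions an IP address and the `flag` GET parameter:

```php
<?php
// Placeholder: the challenge address is not part of this write-up.
$host = 'CHALLENGE_IP';

// One of the two partial flags recovered above (the 'S' variant).
$partial = 'INS{?0_t???_?l?cKl1s??HerE?1?_?????u?H_?u?Ss1N?_???h??-C?3?l????-???a????f0r_??be?r?:o?';

// http_build_query() URL-encodes the value, so '{' becomes %7B and '?' becomes %3F.
$url = 'http://' . $host . '/?' . http_build_query(['flag' => $partial]);

echo $url, "\n";
```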
If I find some more time, maybe I'll manually de-obfuscate the code in order to get some more insight.
