OhMyRE!

Summary

The RE program from this challenge runs in a remote server and requires a password and username to output the flag. A valid password for the program is one that meets certain criteria or constraints (specific length, subsequences that are equal to a constant, etc), and thus consists of a combinatory problem with many different passwords that would match. The username is a unique hardcoded string that goes through a simple linear transformation routine.

This problem category is part of the SMT and CSP family of problems, and are commonly found in reverse engineering challenges as sub-problems. Solving these can prove to be quite hard regarding computing and runtime resources. That’s why SAT/SMT/CSP engines can be used in order to solve these sub-problems in an efficient and flexible manner.

The following blog post goes through the different steps to solve this specific challenge and shows one specific approach using Z3¹, the theorem prover from Microsoft Research.

Note: As this challenge only got 2 solves, I tried to go a little verbose in this post to explain some things in detail. If you’re an experienced reverse engineer just skip to part 2 for the z3 script.

Overview

Oh My RE! Me pidieron completar esto con mis datos, pero no funciona :(
– chall description

This challenge started with a simple ELF binary for x86 64-bits that is not stripped and with all modern protections enabled.

Output for file and checksec

Looking at the main function in IDA’s decompiler we can go fairly quickly through the major parts of the program and roughly divide it in 3 blocks.

Decompilation of main function

Tip: Always remember to comment chunks of code and name your variables in IDA/Ghidra/Binja!

Part 2 contains the most interesting part of the challenge. Rest are standard functionalities in CrackMe’s.

I. Transform Password Input

First chunk of decompiled code

Consists in:

Unbuffering stdin, stdout and stderr on setup() using setvbuf() (irrelevant to the challenge)
Getting 1000 bytes of input from the user and store it to s (no funny BOF business with this buffer). Note that fgets() reads until it finds either a null-byte (\x00) or a newline (\n, \x0a)
Get length of user input using strcspn() (index in buffer for char \n) and null-terminate string
Perform some transformation over the user input in function change using and a hardcoded byte (\x52) and the string length

change()

Looking at it closely under any decompiler after renaming vars, one can realize that it’s just a function that performs a xor operation byte-to-byte over the input string (s) using the given hardcoded parameter (xor_key = 0x52), stopping after reaching the string length.

Function change decompiled

That for-loop can be rewritten as the following C code:

1
2


for (int i = 0; i < len_limit; i++)
	s[i] = s[i] ^ xor_key;

II. Password Condition Check

As if my comments are not spoiler-ish enough… let’s break down the main dish on the menu.

Second chunk of decompiled code. Many if-conditions

fail()

The condition we want to avoid. Just prints a custom message to the screen and finishes the program execution through exit.

Decompilation for 2-liner function named fail

check()

After some renaming, the decompiled function looks simple enough: go char by char through some length (n_limit) of the input string, adding the raw numerical values of these chars to a sum variable. Finally, check if the sum is a multiple of a factor (n % factor == 0), passed as an argument.

Decompiled function named check. Nothing more than a for-loop at first look

However this has some caveats and it’s the reason why I encourage new reversers to not just depend only on the decompiler, but to check the ASM and other sources of information from the disassembler too.

sum is a 4-byte signed integer, and chars are only 1-byte. So how is the up-casting for the chars being done at the processor level?
How’s sum and factor handled for division and remaindering at the processor level?

Assembly for function named check. Instructions movsx and movzx are emphasized

Shows that the char (1-byte) from our input string is transformed into a 4-byte unsigned integer via zero-extending the char value (repeating zeros over the 24 most-significant bits) and storing it to eax (4-byte register). This is done through movzx . E.g. 0x61 => 0x00000061.
Now, get the lowest byte from eax (our original char) and sign-extend it back to eax (repeating ones over the 24 most-significant bits). This is done through movsx. E.g. 0x61 => 0xffffff61.
sum (a 4-byte signed int) is converted to a 8-byte long (quadword) using cdq (convert doubleword to quadword) with sign-extension. Then the idiv instruction is used to divide a signed long with factor being the divisor, with the results being the quotient in eax and remainder in edx.

Why is this relevant?

Because sum will wrap-around over and over again until settling in a value that will be handled as a signed int. Very different from just adding 1-byte chars as unsigned ints and then checking for multiplicity.

Not over yet

Another thing to take into consideration now that we know how check() works, is that the decompiler is being tricked to show the input string (s) at different indexes (s[4] and s[8]) as other variables.

Stack view of local variables for main function. RBP marks the stack frame start at a high-address

I named them as s_4 = &s[4] and s_8 = &s[8].

---
title: Memory representation of our variables
---
block-beta
	columns 32
	space:4 s_4:4 s_8:24
	s:32

Note that the ASM (the ground truth) shows these individual variables that overlap s in the stack, at s[2] and s[13], instead of s[4] and s[8].

If-conditional tree shown as a Control Flow Graph (CFG)

SMT, SAT and CSP

Going back to the original image, it should now become apparent that I summarized these check conditions in a python-esque way in the comments I left on the IDA decompiler window, for each if-condition.

If-conditionals from before

These can be seen as “constraints” that should be met by the password we supply. A very naive approach would be to bruteforce all possible values for a char (0-255) for the 4-5 char chunks in each check condition, also taking into account that our post-transformation password should follow s[2] == 'Q', s[13] == '4' AND that there should be no null-bytes (\x00) in-between, otherwise fgets would stop ingesting our input.

However, the search space for that would quickly become unmanageable if the password length increases (in an hypothetical program that specifies password_len == n) or if the chunk length gets increased too, which would mean a longer runtime in order to get one working solution. E.g. the naive approach would have you calculating $256^4 = 2^{32}$ possible combinations for one 4-char chunk.

Besides that, writing a program that solves that small feat through bruteforce, that also has to be drastically changed through each challenge variation does seem really unpractical, boring and dull to program. Implementing a heuristic-based search algorithm² would be a more flexible and smarter approach, but still lengthy and error-prone to do for yourself each time.

Backtracking algorithm in simple form

Figure credits goes to³

That’s how CSP and SAT/SMT solvers get introduced for tackling this family of problems. Note that they still make use of heuristic-based searches underneath, however we only have to worry about understanding the problem, modeling it (partly or completely) and then just translating it to a programming language.

One of my favorites is Z3¹ from Microsoft, which is the one I usually use for small tasks through the python bindings. Another one is SAGE⁴, but it’s complete enough and works for mathematical analysis too. Other instrumentation frameworks for symbolic execution, similar to Z3, that can also perform concolic execution, taint analysis and more advanced stuff are KLEE⁵, Triton⁶ and one of the most known in the CTF scene: Angr⁷ (most of these still use Z3 under-the-hood).

Z3 Script

Anyway, here goes my python-3 script for generating one of the many possible passwords that satisfy the constraints for this challenge.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62


#!/usr/bin/env python3
import z3

PASS_LEN = 16
MODS = [3, 8, 5, 3]

def change(inp, xor_key):
	mod_inp = [0 for i in range(len(inp))]
	for i in range(len(inp)):
		mod_inp[i] = inp[i] ^ xor_key

	return mod_inp

def check(inp, mod):
	mod_z3 = z3.BitVecVal(mod, 32)
	zero = z3.BitVecVal(0, 32)
	c = zero
	
	for i in range(len(inp)):
		c += z3.SignExt(n=24, a=inp[i])
		
	return ((c % mod_z3) == zero)


if __name__ == "__main__":
	# solver and C-styled flag array
	s = z3.Solver()
	flag_array = []

	# create list of bitvector chars
	for i in range(PASS_LEN):
		flag_array.append(z3.BitVec(f"c{i}", 8))

	# automagically create constraint trees
	flag = change(flag_array, 0x52)

	# set final constraints
	s.add(flag[2] == ord('Q'))
	s.add(flag[13] == ord('4'))
	s.add(check(flag[0:4], MODS[0]) == True)
	s.add(check(flag[4:9], MODS[1]) == True)
	s.add(check(flag[8:12], MODS[2]) == True)
	s.add(check(flag[12:16], MODS[3]) == True)

	# no null bytes in our input
	for i in range(PASS_LEN):
		s.add(flag_array[i] != 0)

	# routine boilerplate code
	if s.check() == z3.sat:
		m = s.model()
		flag_str = ""

		for i in range(PASS_LEN):
			flag_chr = r"\x" + hex(m[flag_array[i]].as_long())[2:].zfill(2)
			flag_str += flag_chr

		print(f"password: \"{flag_str}\"")

	else:
		print("[!] Not solvable!")
		exit(127)

Seeing it in action, it took $\approx1/5$ of a second to get one valid solution for the password.

Measuring runtime performance for my python-interpreted script with command time

III. Check Username Input and Get Flag

After progressing from the password check, the program asks for a 100-byte long username using fgets, calculates the username length through the same strcspn method as before, and then evaluates the username through the return value of the function compare_with_target(). If the ret value is 1, then the flag will be printed through succeed().

Third chunk of decompiled code inside main function

succeed()

Just prints out the contents of the file flag.txt.

Decompiled code for function named succeed

compare_with_target()

Does a sanity check in length vs the expected, harcoded username, and then goes through the given username input string, while comparing your input in a byte-for-byte fashion with the hardcoded correct username with a small extra: it adds the char 0x02 to each char of your input before check.

Last check to comply with before getting our flag. Hardcoded username requires a small transformation

A simple function in a python interpreter is enough for doing the inverse operation in order to get the expected username.

A bunch of lines in IPython to transform the hardcoded username

Username Generation - Code Snippet

A small script that contains the next lines can do the trick.

1
2
3
4


user, username_str = "", b"elb4rt0pwn"
for i in range(len(username_str)):
     user += chr(username_str[i] - 2)
print(f"[+] User: {user}")

Getting the Flag

Then, sending one of the many valid 16-byte-long passwords along with the transformed username using your favorite python library (e.g. pwntools, socket) will get you the flag.

Sending password, username and getting the flag in one script!

1

q4{preparand0_l0s_r3_par4_q42025!!!!}

Extra: Angr

As a short bonus for this post, I tried to solve this challenge after the CTF using Angr with the less possible amount of information from reverse engineering the program. Turns out, I couldn’t make Angr get a solution for both symbolic inputs from stdin on the same run, but they can be found using separate states that start and end from different addresses: one that stops before the 2nd call to fgets and one that starts from there until reaching the winning function. Also, I had to manually set the BitVectorSymbol length for the password (to 16 bytes), as fgets reads 1000 bytes from stdin and thus my Angr script ran some hours before I killed it.

So the minimum of RE required to make a script with Angr for this problem should be:

Get the max length for the secret (e.g. password, username)
Get the address where you want to start (e.g. main function), along with the address of the end of a certain state (e.g. before the 2nd fgets) to solve for one secret.
Also include addresses for functions to avoid (saves time and resources to the sym engine as those paths are instantly pruned), such as fail()

The script that achieves this is left as an exercise for the reader! (RTBM hehe)

Solving for initial password

Solving for username

https://github.com/Z3Prover/z3 ↩︎ ↩︎
like backtracking. ↩︎
Predicting process performance: A white‐box approach based on process models - Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/Backtracking-algorithm-taken-from-47_fig2_33204163 ↩︎
https://www.sagemath.org/ ↩︎
http://klee-se.org/docs/ ↩︎
https://triton-library.github.io/ ↩︎
https://docs.angr.io/en/latest/quickstart.html ↩︎

Write-up: OhMyRe!

This post goes through the steps in solving the challenge "OhMyRe!" from the Q4 mini-CTF (2024). A detailed walkthrough of the main SMT problem related to the program is shown, and a solution using Z3 is presented shortly after.