HITCON 2023 – The Blade
Beginner-friendly writeup for a nifty Rust reversing challenge.
This writeup is also intended for beginners. Anytime there's a set of questions, feel free to pause, challenge yourself, and think through them. :) If you want to follow along, you can grab the challenge binary here.
I'll be mainly using ghidra as my decompiler, along with GDB + GEF. For those unfamiliar with GDB, you may find my recently posted cheatsheet helpful.
My first Rust rev solve! Though in hindsight, not much Rust knowledge was needed.
Description
A Rust tool for executing shellcode in a seccomp environment. Your goal is to pass the hidden flag checker concealed in the binary.
Author: wxrdnx
40/683 solves.
Writeup
Running the Server
Let’s start by running the binary. We can get a feel by navigating the program with help
and other commands.
Turns out we’re given a C2 (Command and Control) interface which sends shellcode. Imagine we control a compromised machine. By running a malicious shellcode, we can trigger a reverse shell to our server, so that we can easily send more commands from the server.
Anyhow, we can start the server with:
We’re told to run some shellcode on the “client”. By default, this starts a connection to localhost:4444
.
- A simple alternative is to run
nc localhost 4444
(on a separate shell). This will initiate a connection to the server, but it won’t have the same effect as the shellcode. - To run the shellcode, we can compile a simple C program containing the shellcode and execute it.
After running, then what? The commands don’t seem to reveal much, and at this point it’s a bit guessy. Time to turn to a decompiler.
Identifying Interest Points
- What command triggers the flag checker?
- Where is the flag processed?
- How is the flag processed?
According to the description, the program contains a flag checker. So presumably we pass the flag as input at some point. But where and how?
By running strings
, we find an interesting set of strings: ?flag
, ?help
, ?exit
, ?quit
. This pattern can’t be a coincidence.
In your favourite decompiler, do a search for the bytes ?flag
. If you can’t find it, try playing with endian settings. This should lead you to seccomp_shell::shell::prompt()
.
Under a condition checking for flag, we’re led to seccomp_shell::shell::verify()
.
Although strings shows ?flag
as the full string, the actual string is just flag
. Questionable, no? This is because the byte before flag
happens to be \x22
(i.e. ?
). strings
doesn’t know better, because it doesn’t actually disassemble the program.
So how is the flag actually processed? This requires a careful study of verify()
, with a touch of dynamic analysis and experimentation.
Like most flag checkers, it turns out we just pass the flag as input (alongside the flag
command).
And like most flag checkers, we’re immediately hit with “Incorrect”.
We also get some hints about the flag's length...
...by glancing at the start of verify()
...
...64.
When we try sending a flag 64-bytes long, we get something on our other shell. We're not immediately hit with an "Incorrect".
Reversing the Encryption
Time to play the UNO reverse card on this binary!
- There are 3 parts to the encryption. What addresses do they begin and end?
- What is each part doing?
Let’s recognise some high level patterns.
It’s easy to be intimidated by the multitude of loops; but really, half the loops are the same, just wearing different clothes.
There are the 3 parts to the encryption:
- Part 1 (
112d10
–113017
). - Part 2 (
113020
–11309c
). - Part 3 (
11310e
–1133a5
).
The procedure is roughly:
1. Reversing the Permutation
Address: 112d10
– 113017
.
Eight loops, doing pretty much the same thing. Let’s focus on the first one.
(Some variables are renamed for clarity.)
Does this look familiar?
It’s good ol’ swap! (Though slightly unrolled.) There are eight of these loops, the only difference being the memcpy
source.
So how do we reverse this? Generally, there are two approaches to take:
- Static analysis. This means we reverse the binary by looking at the bytecode, assembly, decompiler, etc. We don't run or emulate anything.
- Dynamic analysis. In this approach, we observe the program's behaviour by running it. Common tools are
gdb
,strace
, andltrace
.
Approaching this statically seems faster, but integer semantics may get lost in translation. Our conclusion may be unstable.
Thus, for fun (and practice), we'll go the dynamic route. Let's insert some breakpoints, input our flag of 64 unique characters (e.g. Base64 alphabet), grab the permuted string, and construct a mapping.
In GDB:
All that's left is to match the characters.
First part done!
2. Constructing An Inverse Map
Address: 113020
– 11309c
.
Part 2 Decompile
If we reverse this statically, it seems like we get a one-to-one mapping of sorts... almost a bijection... but perhaps our implementation is wrong? Better check it with dynamic analysis. 🫠
By "mapping", I mean values are transformed and mapped from one value to another. For example, a
(0x61) is mapped to u
(0x75), while b
(0x62) is mapped to q
(0x71).
Like before, we could reverse this part statically... but overflow and types are tricky to get right. So we'll go dynamic again!
Let's start with some basic GDB analysis:
- What breakpoints should we add to check the result of one map iteration?
- What memory location should we examine?
- (And after all that, can you derive the mapping function?)
Oh where to go?
We'll break after the loop, at 11309e
.
As for memory location, the code still modifies __ptr
, so we'll read from the same location.
Two things I'd like to point out:
Previously, we used
x/s
to print a string from memory. This time, I usedx/16wx
to print bytes, since some mapped bytes aren't printable.1There's another problem... previously, we input the bytes through
flag <bytes>
, and this is great if we're using printable chars. But what about non-printable chars? There are different solutions around this:- Employ GDB input tricks.
- Track cycles of characters.
- Break before mapping, and hard-code
__ptr
by modifying its memory. Since the goal is to discover the remaining mappings, we'll hard-code__ptr
with the rest of the bytes outside our Base64 alphabet.
Y'know what? Let's go with the last option!
Finished gathering data? Let's analyse it!
And just like that, we obtained a reverse mapping, thanks to it being a bijection!
What is a bijection?
A bijection is a function where...
- each input maps to a unique output. (injective)
- each possible output is mapped from a corresponding input. Specifically, every value in the function's range has a mapping. (surjective)
This characteristic is crucial as it guarantees an invertible operation.
3. Cracking the Shellcode
Address: 11310e
– 1133a5
.
The final part. Subtle, but delectable.
What are the 255 bytes copied into the Rust
vec
?What is the purpose of the data loaded at
11310e
?
The 255 bytes loaded into a vec
? Guess what? That also happens to be a shellcode!
- Follow-up question: if this is shellcode, what does it do? how is it run?
We can disassemble it with pwntools.disasm
to get the following ASM.
Small caveat: you'll want to set context.arch = 'amd64'
for disasm
to interpret the shellcode correctly. In the disassembly, we see our two points of insertion (0xdeadbeef
) treated as values, so amd64
is probably the right choice.
I've included the juiciest part above (with some annotations). Essentially, we perform several reversible operations (add, xor, ror, not) on 4 bytes of input; and the result is checked against 4 bytes of static data. Finally, it sets rax = 1
if correct, and rax = 0
if false.
But are we actually comparing 0xdeadbeef
. Nope—we're comparing something else instead.
- What are the two
0xdeadbeef
being replaced with?
The Guts: What does the shellcode load?
Here is some (simplified) code which overwrites 0xdeadbeef
values at instruction 113168
. Remember, vec
is the shellcode and __ptr
is our permuted + mapped input.
Does 0xA7
, 0x51
, 0x68
, and 0x52
look familiar? 🙃 Check the wall of bytes in 11310e
.
Effectively, the Rust performs a little surgery on shellcode before using it.
But how is it used? Leafing around the decompiled code, you may notice Rust IO write and read functions... those seem sus...
By now, you probably know what the shellcode does: flag-checking. But there are a few more things we need to reverse...
In the calculations, three mystery values (r12
, r13
, r14
) are used. These were computed in the preceding shellcode. To efficiently obtain our flag input from the expected output, we need to find out what these mystery values are.
In case you'd like to have a stab at dissecting the assembly, the full (unblemished) shellcode is in the box below. Try to figure out what r12
, r13
, and r14
are!
- If you're stuck, try using GDB on our shellcode program (main.c from Running the Server) with watchpoints.
- Important: the shellcode we saw when starting the server is different from the shellcode we're reversing here! The former acts as a client, receiving commands and executing them. The latter is a flag-checker payload that is sent to the client over the network.
- Here's a useful list of Linux x86-64 syscalls: filippo.io: Linux Syscall Table.
Full Shellcode
Have fun! :)
It turns out r12
and r13
are just the first 4 bytes of /bin/sh
and /etc/passwd
, which is respectively \x7fELF
and root
. These correspond to 0x464c457f
and 0x746f6f72
in little endian. And r14
is just 0x31f3831f
, computed with a bit of arithmetic. Armed with these 3 values, we can now reverse the encryption.
Tying it All Together
All that's left is to tie the three parts together.
Debugging Our Mess
Some small tips on debugging.
Sometimes the solution is simple and straightforward. But occasionally, we make programming mistakes or misunderstand the problem. Debugging code can be painful.
Prove inverse function holds. Sounds mathy, but the basic principle is to check if our reversed output equals our input. In math terms:
f(g(x)) = g(f(x)) = x
, whereg
is the inverse off
. If it's not equal, clearly we messed up somewhere.- This is useful for the second part (mapping), because our domain is small, just 0 - 255.
- For example, with shellcode encryption, we can do a forward pass, to make sure we're getting the same data.
Use
assert
s. Great for intermediate checks.Verify assumptions with dynamic analysis. For example, I had falsely assumed in the shellcode that
r12 == 0
, since seemed to be pushed by the assembly. But as we found out,r12 == L"FLE\x7f"
. However, be wary of the observer effect, where the program changes behaviour when observed. I haven't seen this much in CTFs, but it's certain to be out there...
Final Remarks
It's easy to miss things out. I know I did. Lots of hair was lost until I realised I left out the shellcode. I also tried to search the shellcode on ExploitDB, but no luck, because it was hand-spun.
But overall, a sweet challenge. And one that left me with nice rave music to power me through Saturday.
Solve Script
# HITCON 2023 – The Blade | |
# Solve Script by TrebledJ | |
# WriteUp: https://trebledj.github.io/posts/hitcon-2023-the-blade | |
import string | |
############### | |
### Part 1. ### | |
############### | |
p = string.ascii_letters + string.digits + '+/' | |
assert len(p) == 64 | |
# print(p) | |
permuted = 'Rp5v+AZmM8XWy1sgNhTB/oCzYVdPrGn6KD3Q9lke4qtFxHb0uUOcS2jIEJfL7aiw'[:64] | |
perm = [0] * 64 # Permutation . | |
rperm = [0] * 64 # Reverse permutation. | |
for i, c in enumerate(p): | |
j = permuted.index(c) | |
perm[i] = j | |
rperm[j] = i | |
############### | |
### Part 2. ### | |
############### | |
# Values obtained with `x/16wx 0x5555555d63e0`. | |
# Values before mapping. | |
dst = """ | |
0x5555555d63e0: 0x76357052 0x6d5a412b 0x5758384d 0x67733179 | |
0x5555555d63f0: 0x4254684e 0x7a436f2f 0x50645659 0x366e4772 | |
0x5555555d6400: 0x5133444b 0x656b6c39 0x46747134 0x30624878 | |
0x5555555d6410: 0x634f5575 0x496a3253 0x4c664a45 0x77696137 | |
0x5555555d63e0: 0x04050607 0x00010203 0x0c0d0e0f 0x08090a0b | |
0x5555555d63f0: 0x14151617 0x10111213 0x1c1d1e1f 0x18191a1b | |
0x5555555d6400: 0x24252627 0x20212223 0x2d2e3a3b 0x28292a2c | |
0x5555555d6410: 0x405b5c5d 0x3c3d3e3f 0x7c7d7e7f 0x5e5f607b | |
0x5555555d63e0: 0x84858687 0x80818283 0x8c8d8e8f 0x88898a8b | |
0x5555555d63f0: 0x94959697 0x90919293 0x9c9d9e9f 0x98999a9b | |
0x5555555d6400: 0xa4a5a6a7 0xa0a1a2a3 0xacadaeaf 0xa8a9aaab | |
0x5555555d6410: 0xb4b5b6b7 0xb0b1b2b3 0xbcbdbebf 0xb8b9babb | |
0x5555555d63e0: 0xc4c5c6c7 0xc0c1c2c3 0xcccdcecf 0xc8c9cacb | |
0x5555555d63f0: 0xd4d5d6d7 0xd0d1d2d3 0xdcdddedf 0xd8d9dadb | |
0x5555555d6400: 0xe4e5e6e7 0xe0e1e2e3 0xecedeeef 0xe8e9eaeb | |
0x5555555d6410: 0xf4f5f6f7 0xf0f1f2f3 0xfcfdfeff 0xf8f9fafb | |
""" | |
# Values after mapping. | |
mpd = """ | |
0x5555555d63e0: 0x2e616c58 0xe2cb3269 0xa002e0b3 0xc16b1c86 | |
0x5555555d63f0: 0xd2799cec 0x74d9c29e 0x9f043b0c 0xed14031e | |
0x5555555d6400: 0xca978fa2 0x39a4d8da 0xaf7e645b 0x0f71930b | |
0x5555555d6410: 0x0a81fd99 0x3aef66b7 0xe1ff00ee 0x09ab75ad | |
0x5555555d63e0: 0x51158ddb 0xfb7b4ebb 0xaab260eb 0xb0aca58e | |
0x5555555d63f0: 0x2bc6a635 0x635cde42 0xbd24b1e3 0x3043d65f | |
0x5555555d6400: 0x7c6d8b17 0x8ca7d52a 0x59a92706 0x9d83fe10 | |
0x5555555d6410: 0x41a880c0 0x25dc5ee7 0xc42d4ff9 0x164d2f6a | |
0x5555555d63e0: 0x896f5de8 0xfa94b61f 0x88c57f77 0xeab55a65 | |
0x5555555d63f0: 0x3ff44847 0x11cff11b 0xe9626eb4 0x12e4badf | |
0x5555555d6400: 0x4b28d7d1 0x96cd1353 0x2c9b2944 0x33b8e67a | |
0x5555555d6410: 0x31d3b940 0x52f720f2 0x1a01a192 0xd034f554 | |
0x5555555d63e0: 0xddbc19f3 0xfc8507be 0x4c7dc7d4 0x36f67298 | |
0x5555555d63f0: 0x8ae55046 0x454a9ac3 0xc90e3c57 0xcc687667 | |
0x5555555d6400: 0x840d90a3 0xf022bf26 0x91058770 0xae3d1dc8 | |
0x5555555d6410: 0x553e3723 0x08732149 0x389578f8 0x1856ce82 | |
""" | |
def mkbytes(s): | |
return [int(bs[2*(i+1):2*(i+2)], 16) for l in s.strip().split('\n') if l.strip() for bs in l.split(': ')[1].split() for i in range(4)] | |
dbytes = mkbytes(dst) | |
mbytes = mkbytes(mpd) | |
mp = [0]*256 # Value map. | |
rmp = [0]*256 # Reverse value map. | |
for a, b in zip(dbytes, mbytes): | |
mp[a] = b | |
rmp[b] = a | |
# Assert bijection. | |
assert len(mp) == len(rmp) == 256 | |
def apply_rmp(bs): | |
"""Applies one iteration of the reverse map""" | |
return bytes([rmp[b] for b in bs]) | |
def apply_rperm(bs): | |
"""Applies one iteration of reverse permutation""" | |
out = [0]*64 | |
for i, j in enumerate(perm): | |
out[i] = bs[j] | |
return bytes(out) | |
def apply_mp(bs): | |
"""Applies one iteration of the forward map""" | |
return bytes([mp[b] for b in bs]) | |
def apply_perm(bs): | |
"""Applies one iteration of forward permutation""" | |
out = [0]*64 | |
for i, j in enumerate(perm): | |
out[j] = bs[i] | |
return bytes(out) | |
############### | |
### Part 3. ### | |
############### | |
""" | |
cb: b8 ef be ad de mov eax, 0xdeadbeef | |
d0: 44 01 e0 add eax, r12d | |
d3: 44 31 e8 xor eax, r13d | |
d6: c1 c8 0b ror eax, 0xb | |
d9: f7 d0 not eax | |
db: 44 31 f0 xor eax, r14d ; eax = ~(roll(0xb, (0xDEADBEEF + r12) ^ r13)) ^ r14 | |
de: 3d ef be ad de cmp eax, 0x526851A7 | |
""" | |
def ror(x, n): | |
left = x >> n | |
right = (x & (0xFFFFFFFF >> (32 - n))) << (32 - n) | |
return right | left | |
def neg(x): | |
assert x >= 0 | |
return int(''.join('01'[c == '0'] for c in f'{x:032b}'), 2) | |
def shelldec(b): | |
"""Reverse shellcode encryption (decryption).""" | |
assert b >= 0 | |
return ((ror(neg(b ^ r14), 32 - 0xb) ^ r13) - r12) % 2**32 | |
r12 = 0x464c457f | |
r13 = 0x746f6f72 | |
r14 = 0x31f3831f | |
byteorder = 'little' | |
encrypted = [0x526851a7, 0x31ff2785, 0xc7d28788, 0x523f23d3, 0xaf1f1055, 0x5c94f027, 0x797a3fcd, 0xe7f02f9f, 0x3c86f045, 0x6deab0f9, 0x91f74290, 0x7c9a3aed, 0xdc846b01, 0x743c86c, 0xdff7085c, 0xa4aee3eb,] | |
decrypted = [shelldec(u) for u in encrypted] | |
def shellenc(a): | |
"""Forward shellcode encryption.""" | |
assert a >= 0 | |
u32 = lambda x: x & 0xFFFFFFFF | |
return neg(ror(u32(a + r12) ^ r13, 0xb)) ^ r14 | |
assert encrypted == [shellenc(v) for v in decrypted] | |
#################### | |
### All Together ### | |
#################### | |
bs = b''.join(u.to_bytes(4, byteorder) for u in decrypted) | |
for i in range(256): | |
bs = apply_rmp(bs) | |
bs = apply_rperm(bs) | |
assert bs == b'hitcon{<https://soundcloud.com/monstercat/noisestorm-crab-rave>}' | |
print(bs.decode()) |
Flag
(I don't know what this music was. It was my first time listening to it. And it's epic! So thank you, wxrdnx, for introducing this nice rave and meme.)
Footnotes
A better specifier would be
x/64bd
. This displays 64d
ecimalb
ytes, and is more convenient to parse in Python. But I chosex/16wx
since I didn't know better at the time and caused myself extra pain. 🥲 ↩︎
Comments are back! Privacy-focused, without ads, bloatware 🤮, and trackers. Be one of the first to contribute to the discussion — I'd love to hear your thoughts.