HKCERT CTF 2023 – Decompetition: Vitamin C++
A beginner-friendly writeup to reverse-engineering C++ a lá decompetition. Years of complex shenanigans condensed!
Oh boy, another C++ reverse challenge. :rubs_hands_in_delight:
Decompetition: Vitamin C++ is a reverse engineering challenge in this year's HKCERT CTF, an annual online capture-the-flag competition hosted in Hong Kong. The format is slightly different from usual rev chals in that we’re required to derive the source code of a binary. This really tests our understanding of how the language is compiled into machine code.
So whether you’re a first-timer or a veteran at reversing C++, this is a fun(?) way to dive deep into or review neat aspects of the language.
If you want to follow along, you can grab the challenge here: GitHub (Backup). I’ll also be relying on Ghidra as my decompiler, because I am poor want to support open-source.
Description
Author: harrier
4/5 stars ⭐️. 5/311 solves.
So let's learn reverse with Decompetition!1 The goal is simple: try to recover the original source code as much as possible, while understand the code logic deeply to get the internal flag! Only with two of those together, you will win this flag.
STL is used everywhere, so it would be nice to be able to reverse them!
Note there is an internal flag with flag format
internal{}
. Please do not submit this directly to the platform.g++ version: g++ (Debian 12.2.0-14) 12.2.0
And a note on testing:
If you want to run this locally, you can install all the prerequisite library with
pip
, and runpython compiler trie.disasm
.
Thus, to get the flag, we need to:
- Obtain the source code (97.5% similarity).
- Obtain the internal flag by reversing the source code.
- Submit the internal flag (not to the platform, but to the remote connection).
- Profit!
(You can see this process play out in compiler.py.)
The first step is the most challenging. Even if we have a decent understanding of the program, we still need the source code to continue.
Let’s start by analysing what we’re given and how we can approach the problem. We'll aim for 100% similarity, but go step by step.
Analysis
Unzipping our bag of goodies, we’re given:
- compiler.py. This is the backend doing all the compilation and diffing.
- Only
TrieNode
methods,wordhash
, andmain
are diffed. - Prior to compiling, our code is prefixed with some boilerplate (includes of
unordered_map
,string
, andiostream
).
- Only
- build.sh. Checks our code against bad patterns, and compiles the program.
- trie. This is the binary file of our target source code. Open this in your favourite decompiler.
- trie.disasm. This is the disassembly used by compiler.py for diffing.
- flag.txt. Read and printed by compiler.py after submitting the internal flag.
- A bunch of other Python files to make things work.
Is there a way we can simplify the process?
Inline assembly with asm
, attribute
trickery, and macros are disallowed.
A quick Google search for TrieNodes resulted in disappointment. The mix()
function is especially unique, as tries generally just do insert/search. So we can probably conclude: the implementation was either hand-spun or modified substantially.
It appears the most productive approach is to tackle the problem head on.
But hey, it’s just a simple lil’ trie, not a friggin standard template container or Boost/Qt library. We can do this!
Trie Me
Let’s briefly review tries. What is a trie?
Tries, or prefix trees, are data structures commonly used to efficiently store and retrieve strings. They are particularly useful for tasks like autocomplete or spell checking. The key idea behind tries is that each node in the tree represents a prefix of a string, and the edges represent the characters that can follow that prefix.
- In terms of matching an exact string, the complexity is similar to a hashmap:
O(n)
insert/search time, w.r.t. the length of the string. But a hashmap is typically faster as it requires fewer operations. - The power of tries comes with alphabetical ordering and prefix search (which is why they’re useful for autocomplete). Hashmaps can't do this.
Setting Up the Structure
Demangling Names
Let’s start by demangling functions! This way, we’ll roughly know its name and signature—big clues, from a functional viewpoint.
In C++, function and class names are mangled. So instead of using sensible names like TrieNode::mix
, std::string::substr
, and std::endl
, the compiler stores hellish sequences like _ZN8TrieNode3mixEc
, _ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6substrEmm
, and _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
. (Yes, endl
is a function.)
Why do C++ compilers behave this way?
This has to do with function overloading. For example:
With function overloading, names are reused. Now if the names are the same, how can the linker find and call the right function? The compiler solves this by encoding a function’s signature into its name, so that all names are unique. (We don't have this problem in plain old C, because function overloading isn’t even a concept!)
To demangle these cryptic monstrosities, we can throw them into online tools (e.g. demangler.com) or just use a C++-enabled decompiler (e.g. Ghidra) which automatically demangles names.
Classy Types
When discussing C++, it’s hard to avoid the topic of classes. These supercharged C-structs are the basis of any object-oriented program.
Looking at the demangled function names, we can identify the TrieNode
class. What next?
There are two parts to reversing a class:
- Methods/functions. How does the class behave? What does it do?
- These are easy to find due to the prefix (e.g.
TrieNode::
). Reversing their content is a different question, which we'll address in upcoming sections.
- These are easy to find due to the prefix (e.g.
- Members. What comprises a class? What is its state?
- This is a tricky question to answer, as variable names are usually stripped. Careful analysis of reads/writes is required (xrefs are useful!).
- Is it set to only 0 or 1? And used in conditions? Probably a boolean.
- Is it compared to other numbers a lot and used near loops? Probably an integer representing size.
- By peeking at the
TrieNode
constructor, we figure out thatTrieNode
has three members.- An unordered map (aka hashmap) from chars to nodes. Size: 0x38 bytes. As this resembles the edges of the node, we'll call variable this
next_node
. - A bool. Size: 1 byte.
- Another bool. Size: 1 byte.
- An unordered map (aka hashmap) from chars to nodes. Size: 0x38 bytes. As this resembles the edges of the node, we'll call variable this
- This is a tricky question to answer, as variable names are usually stripped. Careful analysis of reads/writes is required (xrefs are useful!).
Overall, our structure should resemble:
On Struct vs. Class
Structs are public by default. Classes are private by default.
Public/private are concepts which fall under OOP encapsulation. With encapsulation, we bundle data and only expose certain API methods for public users, whilst hiding implementation. With a cyber analogy, this is not unlike exposing certain ports (HTTP/HTTPS) on a machine, and protecting other ports with a firewall.
I chose to use struct
here because I'm lazy and want to make members public.2 Some of them are accessed directly in main
anyway.
Read more:
Reversing
Plagiarise a Decompiler
Now that we have a basic structure set up, it's time to dig deeper. We need to go from binary to source code. Hmm… that sounds like a job for… a decompiler!
So let’s start with that! We can plagiarise copy output from Ghidra and rewrite it to make programmatic sense.
Exercise: Reverse the wordhash
function.
Solution
This is a simple hash function which xors all characters in a string. It's not a very effective hasher, because it's prone to collisions (also it's not const-correct). But eh, this is just for a CTF challenge.
Implement the Data Structure
Time to implement the core of the program: the TrieNode class. As before, we can refer to Ghidra's output.
Exercise: Reverse TrieNode::insert
, TrieNode::search
, and TrieNode::mix
.
- We can make good use of Ghidra's Rename, Retype, and Edit Function Signature tools to clean up the code.
- Ghidra sometimes loads incorrect function signatures (e.g. for
operator[]
). You may wish to edit the signature so that it displays arguments properly.3
I'll leave the first two functions as an exercise for the reader. :)
mix()
seems to be a total oddball, as tries don't usually have such a function.
TrieNode::mix
: Possible Solution
Now if you run this through the compiler diff, it should respond with some lines:
Subtle. But there is a good reason why this occurs. We'll look at this in more detail later.
On Various Features
Common, but worth mentioning.
On auto
auto
is a special keyword introduced in C++11 typically used to tell the compiler: "figure out this type for me".
It has seen wide adoption and growing support in the standard (more features for auto
are added each standard). Now (C++20) it's used widely in template parameters and lambda parameters.
On Iterators
The previous solution for mix()
made use of iterators. These are commonly used by the STL, providing a generic interface for iterating over containers.
We generally start with .begin()
and iterate with prefix increment (++it
) until we hit the .end()
iterator. With iterators, we can apply generic algorithms on generic containers.
On Unordered Map
You may be wondering: why std::unordered_map
? Why not std::map
? Why type 10 extra keystrokes?
The reason is time complexity.
std::map
is a binary search tree, givingO(log n)
search time on average (wheren
is the number of entries).std::unordered_map
is a hashmap, givingO(1)
search time on average. Takes more space though.
As the value of n increases, the number of operations required for std::map
will increase at a faster rate compared to std::unordered_map
. This is because std::unordered_map
is not affected by the number of entries in the map (except in the case of rehashing); hence, the constant time complexity, O(1)
.
In the interest of performance, it's typical to opt for unordered_map
.
But in our case, since a node only has 256 possible edges (char
), the potential speed boost is limited, and the choice between map
and unordered_map
is debatable. ¯\_(ツ)_/¯
On Scoping
An object in C++ has a constructor and destructor, functions that run at the beginning and end of its lifetime. The object's scope affects the placement of its constructor and destructor.4
Let’s look at some examples:
Do you C the difference?
Things become complicated when we further consider lvalues and rvalues (think: variables and temporaries).
Complicated Examples
I don't intend to cover every single possible case. But yes, C++ is extremely nuanced in this regard. (See also: classes, special member functions, move semantics.)
The point is: object scoping is all reflected at assembly level. We can get a good understanding where an object is declared by paying attention to its constructors and destructors.
This applies to classes, such as STL containers. Primitives (int, char, pointers) don't have constructors/destructors, so it’s trickier to tell where they're instantiated. It's even trickier with heavy optimisations.
Exercise: Reverse the main
function.
- Use the position of constructors and destructors to determine the scope of various strings.
- Beware backslashes in the inserted strings.
Possible Solution
From here on, we'll continue making incremental adjustments to improve our score, while learning various C++ nuances and features.
On Structured Bindings
Here we take a detour to peek at build.sh. And something sparkly catches our eye:
Our code is compiled with C++17—what an oddly specific standard!
One cool feature introduced by this standard is structured bindings, which is as close as we can get to Python iterable unpacking.
Since it
is an iterator over key-value pairs, we can dereference, then bind (unpack) the pair to ch
and node
.
One telltale sign of structured bindings is in the second loop of TrieNode::mix()
. Notice how the first item of the pair (ch2 = std::get<0>(pair);
) is read but never used.
Another giveaway is that std::get
is rarely used to access map pairs, unless in generic code. The idiomatic ways are:
std::pair
members (through iterator):it->first
,it->second
std::pair
members (through pair):- structured bindings (since C++17)
N.B. With optimisations, these indicators would be less obvious. Thankfully the program wasn't compiled with optimisations.
On Const Ref
We're still short of our target though. Some diff lines stand out:
- Looks like we declared 16-bytes of extra stack variables.
- Local variables are stored on the stack, which allocates memory by a simple
sub
instruction. - Larger subtraction = more stack memory allocated.
- Local variables are stored on the stack, which allocates memory by a simple
- It also looks like we called the wrong overload. The mangled names — simplified for readability — translate to:
We can fix both these issues by qualifying our binding as const&
.
With auto
, our binding was creating new char
and TrieNode*
copies. (Hence, the extra 16 bytes.) With const auto&
, we take a constant reference.
- Constant: meaning we only read the value. No modifications. This fixes the second issue.
- Reference: meaning we refer (point) to the original objects instead of copying them. This fixes the first issue.
Using const-refs is good practice for maintainability and performance (imagine copying a 64-byte struct each iteration 🤮).
On For-Range Loops
The astute may notice the above can be refactored slightly with the help of range-based for
-loops. These were introduced in C++11, and are like Python for
-in
loops, but less powerful.
Example
In fact, range-based for
-loops are syntactic sugar for the plain loops we all know and love.
translates to...
(See more: cppreference.)
On Control Flow
Decompiler output may not accurately present the control flow of the original program. Changing:
- the order of control flow, and
- which statement is used
may lead us closer to 100%.
Would it make more sense to use a switch
instead of an if
in a certain place?
Other Useful Tips
- Use Godbolt’s Compiler Explorer to play around with disassembly output. It helps with analysing small details such as variable declaration.
- Remember to set x86-64 gcc 12.2 and
-std=c++17
.
- Remember to set x86-64 gcc 12.2 and
- Two good sources for standard library documentation are cppreference (high quality) and cplusplus.com (beginner-friendly).
- Version control is incredibly useful for tracking incremental changes.
The Infernal Flag
Once we recover enough source code, it's time to find the internal flag. This is probably the least interesting part (for me), so I'll keep it short.
Notice:
- In
main
, a bunch of garbage strings are inserted into the trie. - Afterwards, mixing is turned on (
node->should_mix = true
), so that the nextnode->insert()
will jumble the trie...
Now it's time to take a really close look at mix()
.
- What is jumbled? All the strings.
- How? All chars are xor'ed with a char (the
wordhash
of a string). - How many possible chars are there? 256.
- Two words: brute force.
Maybe one of the strings just happens to be the internal flag xor'ed. Who knows?
After getting ≥ 97.5% similarity and finding the internal flag, submit both to the platform and profit!
Final Remarks
I'm sure the chal is called Vitamin C++ because it's designed to make us (mentally) stronger. Every time you trie harder, you lose a brain cell but strengthen a neuron. Indeed, we covered quite a lot of concepts today:
- Language Features:
auto
, structured bindings, for-range loops, const-ref. - Library Features: iterators, unordered map.
- Unordered map is preferred for performance in lookup.
- Scoping (and lvalue-rvalueness) affects position of constructors/destructors. (Very good takeaway for C++ reversing.)
- Pay attention to groups of
sub
andmov
instructions to check if we declared too little/many stack variables. - Ghidra is pretty powerful.
- C++ is fun.
Lots of tuning was involved; but the various tricks employed above netted us a first blood, so I can't complain. Despite a couple lines of janky const-uncorrect code, it was a nice challenge.
Also, who doesn't like a good pun hidden in a challenge?
Solve Sauce
(Files not embedded, as they're a bit big.)
- rev.cpp: Fully reversed (100% similarity) source.
- solve.py: For cracking the internal flag.
- send.py: Driver program to test rev.cpp.
Flag
Footnotes
Decompetition is a reverse-engineering CTF held irregularly. ↩︎
In proper engineering, we would hide the implementation behind
private
, sonext_node
should be a private variable. But since this is a CTF, proper engineering comes second. 😛 ↩︎Or just read the assembly and follow the call conventions (thiscall for member functions, fastcall for everything else). ↩︎
This also depends on optimisations, and whether the object contains any other classes. In some cases, constructors or destructors may be inlined or optimised away. ↩︎
Comments are back! Privacy-focused, without ads, bloatware, and trackers. Be one of the first to contribute to the discussion — I'd love to hear your thoughts.