Linux Heap Exploitation internals

Posted on Jan 14, 2025

Table of contents:

LIFO AND FIFO

LIFO (Last-In, First-Out) and FIFO (First-In, First-Out) are two fundamental methods for managing data structures, each with distinct characteristics and applications.

LIFO (Last-In, First-Out):

Structure: Commonly implemented using a stack, where the most recently added item is the first to be removed.
Analogy: Think of a stack of plates; you add and remove plates from the top.
Applications:
- Memory Management: Utilized in function call management and recursive processes, where the last called function is the first to return.
- Undo Functionality: In applications like text editors, the last action performed is the first to be undone.
- Browser History: The most recently visited page is the first to be revisited when navigating backward.

FIFO (First-In, First-Out):

Structure: Typically implemented using a queue, where the earliest added item is the first to be removed.
Analogy: Similar to a line of people waiting; the first person in line is the first to be served.
Applications:
- Task Scheduling: Operating systems often use FIFO queues to manage processes, ensuring the earliest tasks are handled first.
- Print Spooling: Print jobs are managed in the order they are received, with the first job submitted being the first to print.
- Data Buffers: In networking, data packets are processed in the order they arrive to maintain sequence integrity.

Circular, singly linked-list and doubly linked-list

A circular singly linked list is a data structure where each node contains data and a single reference to the next node, with the last node pointing back to the first, forming a continuous loop.

In contrast, a circular doubly linked list has nodes with two references: one to the next node and another to the previous node, allowing traversal in both directions, with the last node linking back to the first and vice versa. These structures are useful in scenarios requiring cyclic traversal, such as implementing round-robin scheduling or buffering systems.

GLIBC

GLIBC, or the GNU C Library, is a core component of the GNU operating system and is used by many other Unix-like operating systems, including Linux. It provides the system’s C library, which is a collection of standard functions and macros that programs written in the C programming language can use to perform common tasks, such as input/output operations, memory management, and string manipulation.

In essence, GLIBC acts as an intermediary between the application software and the kernel, facilitating communication and ensuring that programs can run smoothly on the system. It’s a crucial part of the software stack that allows applications to function correctly on Unix-like systems.

Malloc

malloc stands for “memory allocation.” It’s a function in the C programming language that dynamically allocates a block of memory on the heap. When you call malloc, it reserves a specified number of bytes and returns a pointer to the beginning of the allocated memory block. This allows you to create data structures whose size can change during the execution of a program.

Malloc Internals

Chunks are the basic units of memory that malloc works with. They are usually parts of the heap but can also be created separately using mmap(). A chunk includes a size field and the user data.

When a program uses a pointer to access a chunk’s user data, malloc internally considers the chunk to start 8 bytes before this program pointer. This 8-byte area contains the chunk’s size field, which includes the size of the user data plus the size of the size field itself. For example, if the user data is 24 bytes, the size field will be 32 bytes (0x20 in hexadecimal). The smallest usable chunk size is 32 bytes (0x20), though malloc uses internal chunks of 16 bytes (0x10) for specific purposes.

Chunk sizes increase in 16-byte steps: after 0x20 comes 0x30, then 0x40, and so on. The least-significant four bits (or nibble) of the size field are used for flags that indicate the chunk’s state:

PREV_INUSE (P bit): If set, it means the previous chunk is in use; if clear, the previous chunk is free.
IS_MMAPPED (M bit): If set, it indicates the chunk was allocated using mmap().
NON_MAIN_ARENA (A bit): If set, it signifies the chunk does not belong to the main arena.

These flags help malloc manage memory allocation and deallocation efficiently. When a program requests memory using malloc, it receives a pointer to the chunk’s user data—the portion of memory available for the program’s use. Internally, malloc manages chunks, which can be in one of two states: allocated (in use) or free (available for allocation).

When a chunk is free, malloc repurposes up to five 8-byte segments (quadwords) of its user data to store metadata for memory management. This metadata may even overlap with the succeeding chunk. The specific use of these quadwords depends on the bin (a data structure used by malloc to manage free chunks) the chunk belongs to:

First quadword: Used as a forward pointer (fd) in all bins to link to the next free chunk (pointer points to the first two quadwords of the chunk)
Second quadword: Used as a backward pointer (bk) in doubly linked lists, such as the unsorted bin or smallbins, to link to the previous free chunk.
Third and fourth quadwords: Used as fd_nextsize and bk_nextsize pointers, respectively, in large bins to manage chunks by size.

This repurposing of user data in free chunks allows malloc to efficiently manage and organize free memory blocks.

In bins that support consolidation , when a chunk is freed, (help: imagine three chunks side-by-side and you free first two chunks, on the seond chunk’s) the last 8 bytes (quadword) of its user data are repurposed as the prev_size field. This field records the size of the freed chunk, similar to the size field but without any flags. malloc treats the prev_size field as part of the next (succeeding) chunk. When the prev_size field is present, the succeeding chunk’s PREV_INUSE flag is cleared, indicating that the previous chunk is free.

In simple words, When you have three adjacent memory chunks and free the first two, the second chunk’s last 8 bytes (quadword) are repurposed as the prev_size field. This field records the size of the preceding free chunk and is considered part of the third chunk’s metadata. Additionally, the PREV_INUSE flag bit in the third chunk’s metadata is cleared to zero, indicating that the previous chunk is free.

In GLIBC versions 2.29 and later, when a chunk is freed and added to a tcache bin, the second 8-byte segment (quadword) of its user data is repurposed as a “key” field. This key is a pointer to the tcache_perthread_struct and serves as a safeguard against double-free vulnerabilities by allowing the allocator to detect if a chunk has been freed more than once.

The tcache_entry structure, which represents a chunk in the tcache, is defined as follows:

typedef struct tcache_entry {
    struct tcache_entry *next; // Pointer to the next free chunk
    struct tcache_perthread_struct *key; // Pointer to the tcache structure
} tcache_entry;

When a chunk is added to the tcache via the tcache_put() function, the key field is set to point to the current thread’s tcache structure:

static __always_inline void tcache_put(mchunkptr chunk, size_t tc_idx) {
    tcache_entry *e = (tcache_entry *) chunk2mem(chunk);
    e->key = tcache;
    e->next = tcache->entries[tc_idx];
    tcache->entries[tc_idx] = e;
    ++(tcache->counts[tc_idx]);
}

This mechanism enables the allocator to detect double-free errors by checking if a chunk’s key field matches the current tcache pointer during subsequent free operations. If a match is found, it indicates that the chunk is already in the tcache, suggesting a double-free attempt, and appropriate action can be taken to prevent exploitation.