CTF-PWN glibc Source Reading [1]: Finding the Heap Structure Definitions in libc (2.31-0ubuntu9.16)

The source code can be downloaded here.

Open malloc/malloc.c.

Around line 980 we find this code:

// M_MMAP_MAX is the mallopt() parameter number (-4) used to set the
// maximum number of mmapped chunks; it is not the limit itself
#define M_MMAP_MAX             -4

// Default limit on the number of mmapped chunks, unless already defined
#ifndef DEFAULT_MMAP_MAX
#define DEFAULT_MMAP_MAX       (65536)
#endif

// Pull in malloc.h, which declares the public allocation interface
#include <malloc.h>

// If RETURN_ADDRESS is not defined, fall back to a macro that yields NULL
#ifndef RETURN_ADDRESS
#define RETURN_ADDRESS(X_) (NULL)
#endif

/* Forward declarations of structs and types */

// Forward declaration only; the full definition appears later in this file
struct malloc_chunk;
// mchunkptr is a pointer to struct malloc_chunk
typedef struct malloc_chunk* mchunkptr;

/* Internal routines */

// Allocate memory from the given arena
static void*  _int_malloc(mstate, size_t);
// Free a chunk back to the given arena
static void     _int_free(mstate, mchunkptr, int);
// Resize a chunk
static void*  _int_realloc(mstate, mchunkptr, INTERNAL_SIZE_T, INTERNAL_SIZE_T);
// Aligned allocation
static void*  _int_memalign(mstate, size_t, size_t);
// Intermediate layer shared by the aligned-allocation entry points
static void*  _mid_memalign(size_t, size_t, void *);

// Print an allocator error message; this function never returns
static void malloc_printerr(const char *str) __attribute__ ((noreturn));

// Debug-hook helper: instrument a chunk with overrun-detection bytes
// and return the user pointer
static void* mem2mem_check(void *p, size_t sz);
// Debug-hook helper: check that the top chunk is sane
static void top_check(void);
// Release an mmapped chunk with munmap
static void munmap_chunk(mchunkptr p);
// Only when the system supports mremap: resize an mmapped chunk
#if HAVE_MREMAP
static mchunkptr mremap_chunk(mchunkptr p, size_t new_size);
#endif

// Checking (debug-hook) versions of malloc, free, realloc and memalign
static void*   malloc_check(size_t sz, const void *caller);
static void      free_check(void* mem, const void *caller);
static void*   realloc_check(void* oldmem, size_t bytes, const void *caller);
static void*   memalign_check(size_t alignment, size_t bytes, const void *caller);

Next, look at the malloc_chunk struct:

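// Chunk metadata used to manage blocks of memory on the heap
struct malloc_chunk {
  INTERNAL_SIZE_T mchunk_prev_size;  // Size of the previous chunk (only valid if that chunk is free)
  INTERNAL_SIZE_T mchunk_size;       // Size of this chunk in bytes, including the header overhead

  // Doubly-linked list pointers, used only while the chunk is free
  struct malloc_chunk* fd;           // Forward pointer to the next free chunk in the list
  struct malloc_chunk* bk;           // Back pointer to the previous free chunk in the list

  // Used only for large chunks: links between chunks of different sizes
  // in the size-sorted list
  struct malloc_chunk* fd_nextsize;
  struct malloc_chunk* bk_nextsize;
};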
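To make the field layout concrete, here is a minimal sketch that mirrors the struct in a local replica and asks the compiler for the offsets. The names chunk_replica, header_bytes and min_chunk_bytes are made up for this example, and it assumes INTERNAL_SIZE_T is size_t (the glibc default). It only illustrates the layout and does not touch glibc's real heap.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: INTERNAL_SIZE_T is size_t, as in a default glibc build. */
typedef size_t INTERNAL_SIZE_T;

/* Local replica of struct malloc_chunk (hypothetical name). */
struct chunk_replica {
    INTERNAL_SIZE_T mchunk_prev_size;
    INTERNAL_SIZE_T mchunk_size;
    struct chunk_replica *fd;
    struct chunk_replica *bk;
    struct chunk_replica *fd_nextsize;
    struct chunk_replica *bk_nextsize;
};

/* Bytes between the start of a chunk and the user data.
   User data overlays fd, so this is the offset of fd. */
static size_t header_bytes(void) {
    size_t off = offsetof(struct chunk_replica, fd);
    assert(off == 2 * sizeof(INTERNAL_SIZE_T));
    return off;
}

/* Smallest chunk that can still hold fd and bk once it is freed.
   glibc defines MIN_CHUNK_SIZE the same way, as the offset of fd_nextsize. */
static size_t min_chunk_bytes(void) {
    return offsetof(struct chunk_replica, fd_nextsize);
}
```

On x86-64 these evaluate to 16 and 32 bytes, which is why the user pointer sits 0x10 bytes past the chunk pointer in a debugger.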

Further down there is a comment describing the heap layout:

/*
   malloc_chunk details:

    (The following includes lightly edited explanations by Colin Plumb.)

    Chunks of memory are maintained using a `boundary tag' method as
    described in e.g., Knuth or Standish.  (See the paper by Paul
    Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
    survey of such techniques.)  Sizes of free chunks are stored both
    in the front of each chunk and at the end.  This makes
    consolidating fragmented chunks into bigger chunks very fast.  The
    size fields also hold bits representing whether chunks are free or
    in use.

    An allocated chunk looks like this:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                     |A|M|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_size() bytes)                      .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             (size of chunk, but used for application data)    |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|1|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    Where "chunk" is the front of the chunk for the purpose of most of
    the malloc code, but "mem" is the pointer that is returned to the
    user.  "Nextchunk" is the beginning of the next contiguous chunk.

    Chunks always begin on even word boundaries, so the mem portion
    (which is returned to the user) is also on an even word boundary, and
    thus at least double-word aligned.

    Free chunks are stored in circular doubly-linked lists, and look like this:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                     |A|0|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|0|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    The P (PREV_INUSE) bit, stored in the unused low-order bit of the
    chunk size (which is always a multiple of two words), is an in-use
    bit for the *previous* chunk.  If that bit is *clear*, then the
    word before the current chunk size contains the previous chunk
    size, and can be used to find the front of the previous chunk.
    The very first chunk allocated always has this bit set,
    preventing access to non-existent (or non-owned) memory. If
    prev_inuse is set for any given chunk, then you CANNOT determine
    the size of the previous chunk, and might even get a memory
    addressing fault when trying to do so.

    The A (NON_MAIN_ARENA) bit is cleared for chunks on the initial,
    main arena, described by the main_arena variable.  When additional
    threads are spawned, each thread receives its own arena (up to a
    configurable limit, after which arenas are reused for multiple
    threads), and the chunks in these arenas have the A bit set.  To
    find the arena for a chunk on such a non-main arena, heap_for_ptr
    performs a bit mask operation and indirection through the ar_ptr
    member of the per-heap header heap_info (see arena.c).

    Note that the `foot' of the current chunk is actually represented
    as the prev_size of the NEXT chunk. This makes it easier to
    deal with alignments etc but can be very confusing when trying
    to extend or adapt this code.

    The three exceptions to all this are:

     1. The special chunk `top' doesn't bother using the
        trailing size field since there is no next contiguous chunk
        that would have to index off it. After initialization, `top'
        is forced to always exist.  If it would become less than
        MINSIZE bytes long, it is replenished.

     2. Chunks allocated via mmap, which have the second-lowest-order
        bit M (IS_MMAPPED) set in their size fields.  Because they are
        allocated one-by-one, each must contain its own trailing size
        field.  If the M bit is set, the other bits are ignored
        (because mmapped chunks are neither in an arena, nor adjacent
        to a freed chunk).  The M bit is also used for chunks which
        originally came from a dumped heap via malloc_set_state in
        hooks.c.

     3. Chunks in fastbins are treated as allocated chunks from the
        point of view of the chunk allocator.  They are consolidated
        with their neighbors only in bulk, in malloc_consolidate.
*/

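Reorganized, the comment says the following.

malloc_chunk details

(The following includes explanations lightly edited by Colin Plumb.)

Chunks of memory are maintained using a boundary tag method, as described by Knuth or Standish. (For a survey of such techniques, see Paul Wilson's paper: ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps.) The size of a free chunk is stored both at the front of the chunk and at its end, which makes consolidating fragmented chunks into bigger ones very fast. The size fields also hold bits that record whether chunks are free or in use.

Allocated chunk layout

An allocated chunk looks like this: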


Position	Contents
chunk->	Size of previous chunk, if unallocated (P clear)
	Size of chunk, in bytes |A|M|P|
mem->	User data starts here
	(malloc_usable_size() bytes)
nextchunk->	(size of chunk, but used for application data)
	Size of next chunk, in bytes |A|0|1|

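Here chunk is the front of the chunk as far as most of the malloc code is concerned, while mem is the pointer returned to the user. nextchunk is the beginning of the next contiguous chunk. Chunks always begin on even word boundaries, so the mem portion returned to the user is also on an even word boundary and is therefore at least double-word aligned.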
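The relationship between chunk and mem can be sketched as two pointer conversions. The helper names below are invented for this example; glibc implements the same idea with its chunk2mem and mem2chunk macros. The sketch assumes the header is two size_t words, which is the default configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the size fields are size_t, so the header is 2 * SIZE_SZ bytes. */
#define SIZE_SZ (sizeof(size_t))

/* chunk pointer -> user pointer: skip prev_size and size. */
static void *chunk_to_mem(void *chunk) {
    return (char *)chunk + 2 * SIZE_SZ;
}

/* user pointer -> chunk pointer: step back over the header. */
static void *mem_to_chunk(void *mem) {
    return (char *)mem - 2 * SIZE_SZ;
}

/* True when an address is aligned to a double word (2 * SIZE_SZ). */
static int is_double_word_aligned(const void *p) {
    return ((uintptr_t)p & (2 * SIZE_SZ - 1)) == 0;
}
```

Because the header is exactly two words long, a double-word-aligned chunk always yields a double-word-aligned mem, which matches the alignment claim in the comment.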

Free chunk layout

Free chunks are stored in circular doubly-linked lists and look like this:


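Position	Contents
chunk->	Size of previous chunk, if unallocated (P clear)
`head:'	Size of chunk, in bytes |A|0|P|
mem->	Forward pointer to next chunk in list
	Back pointer to previous chunk in list
	Unused space (may be 0 bytes long)
nextchunk->
`foot:'	Size of chunk, in bytes
	Size of next chunk, in bytes |A|0|0|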
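The circular doubly-linked list can be illustrated with a small stand-alone sketch. It is a simplification: the struct and function names are hypothetical, and the list head is a full node here, whereas glibc overlays its bin heads on an array. The consistency check in bin_unlink is modeled on the check glibc performs before unlinking a chunk, but it returns an error code instead of aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-chunk node: the header plus the two list pointers. */
struct node {
    size_t prev_size;
    size_t size;
    struct node *fd;   /* next chunk in the list */
    struct node *bk;   /* previous chunk in the list */
};

/* An empty circular list points at itself in both directions. */
static void bin_init(struct node *bin) {
    bin->fd = bin;
    bin->bk = bin;
}

static int bin_is_empty(const struct node *bin) {
    return bin->fd == bin;
}

/* Insert chunk c directly after the list head. */
static void bin_insert_front(struct node *bin, struct node *c) {
    c->fd = bin->fd;
    c->bk = bin;
    bin->fd->bk = c;
    bin->fd = c;
}

/* Remove c from its list. Returns -1 without modifying anything if the
   neighbours do not point back at c, 0 on success. */
static int bin_unlink(struct node *c) {
    if (c->fd->bk != c || c->bk->fd != c)
        return -1;
    c->fd->bk = c->bk;
    c->bk->fd = c->fd;
    return 0;
}
```

In CTF terms, the check in bin_unlink is what a forged fd/bk pair has to satisfy before an unlink will go through.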

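Key flag bits

P (PREV_INUSE) bit: stored in the unused low-order bit of the chunk size (the size is always a multiple of two words), it records whether the previous chunk is in use. If the bit is clear, the word before the current chunk's size field holds the previous chunk's size, which can be used to find the front of the previous chunk. The very first chunk allocated always has this bit set, which prevents access to non-existent (or non-owned) memory. If prev_inuse is set for a chunk, the size of the previous chunk cannot be determined, and trying to read it may even cause a memory addressing fault.
A (NON_MAIN_ARENA) bit: cleared for chunks on the initial, main arena, which is described by the main_arena variable. When additional threads are spawned, each thread receives its own arena (up to a configurable limit, after which arenas are shared among multiple threads), and chunks in these arenas have the A bit set. To find the arena for a chunk on such a non-main arena, heap_for_ptr performs a bit mask operation followed by an indirection through the ar_ptr member of the per-heap header heap_info (see arena.c).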
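Decoding a raw size field is a matter of masking the low three bits. The sketch below uses the mask values from glibc's malloc.c (PREV_INUSE = 0x1, IS_MMAPPED = 0x2, NON_MAIN_ARENA = 0x4), which are not part of the excerpt quoted above; the struct size_info and the function decode_size_field are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Flag masks as defined in glibc's malloc.c. */
#define PREV_INUSE     0x1
#define IS_MMAPPED     0x2
#define NON_MAIN_ARENA 0x4
#define SIZE_BITS      (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

/* Hypothetical container for the decoded size field. */
struct size_info {
    size_t size;          /* chunk size with the flag bits masked off */
    int prev_inuse;       /* P: previous chunk is in use */
    int is_mmapped;       /* M: chunk was obtained via mmap */
    int non_main_arena;   /* A: chunk belongs to a non-main arena */
};

/* Split a raw mchunk_size value into its size and flag bits. */
static struct size_info decode_size_field(size_t raw) {
    struct size_info info;
    info.size = raw & ~(size_t)SIZE_BITS;
    info.prev_inuse = (raw & PREV_INUSE) != 0;
    info.is_mmapped = (raw & IS_MMAPPED) != 0;
    info.non_main_arena = (raw & NON_MAIN_ARENA) != 0;
    return info;
}
```

For example, a size field of 0x91 decodes to a 0x90-byte chunk with only P set, which is the typical value seen for a chunk on the main arena whose predecessor is in use.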

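A note on the foot

The foot of the current chunk is actually stored as the prev_size of the next chunk. This makes alignment easier to deal with, but it can be very confusing when trying to extend or adapt this code.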
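The following sketch shows how the foot is used to walk between neighbouring chunks. It works on a caller-supplied buffer rather than on the real heap, and struct hdr plus all function names are hypothetical. Sizes are assumed to be multiples of two words, so the low three bits are free for flags.

```c
#include <assert.h>
#include <stddef.h>

#define PREV_INUSE 0x1
#define SIZE_BITS  0x7

/* Hypothetical minimal chunk header: just the two size words. */
struct hdr {
    size_t prev_size;   /* foot of the previous chunk, valid only if P is clear */
    size_t size;        /* size of this chunk plus flag bits */
};

static size_t chunk_size(const struct hdr *p) {
    return p->size & ~(size_t)SIZE_BITS;
}

/* The next chunk starts chunk_size(p) bytes after p. */
static struct hdr *next_chunk(struct hdr *p) {
    return (struct hdr *)((char *)p + chunk_size(p));
}

/* The previous chunk can only be located when P is clear,
   because only then does prev_size hold a valid foot. */
static struct hdr *prev_chunk(struct hdr *p) {
    assert((p->size & PREV_INUSE) == 0);
    return (struct hdr *)((char *)p - p->prev_size);
}

/* Mark p as free from its neighbour's point of view: write p's size
   into the next chunk's prev_size (the foot) and clear the next chunk's P bit. */
static void set_foot_and_clear_p(struct hdr *p) {
    struct hdr *n = next_chunk(p);
    n->prev_size = chunk_size(p);
    n->size &= ~(size_t)PREV_INUSE;
}
```

This is the mechanism that makes backward consolidation cheap: a chunk whose P bit is clear can find its free neighbour with a single subtraction.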

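The three exceptions

1. The special chunk top: because there is no next contiguous chunk that would have to index off it, top does not use the trailing size field. After initialization, top is forced to always exist. If it would become smaller than MINSIZE bytes, it is replenished.
2. Chunks allocated via mmap: these have the second-lowest-order bit M (IS_MMAPPED) set in their size fields. Because they are allocated one by one, each must contain its own trailing size field. If the M bit is set, the other bits are ignored (because mmapped chunks are neither in an arena nor adjacent to a freed chunk). The M bit is also used for chunks that originally came from a dumped heap via malloc_set_state (see hooks.c).
3. Chunks in fastbins: from the chunk allocator's point of view, chunks in fastbins are treated as allocated chunks. They are consolidated with their neighbors only in bulk, in malloc_consolidate.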