ARM Cortex-M Stack Structure and Backtracing

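1. Overview

I have recently been studying the stack layout and stack backtracing on ARM Cortex-M microcontrollers. Why bother? There are two main reasons:

Understanding processor instructions and how a program actually executes makes you a better programmer.
When a program fails, the stack contents can lead you to the point of failure, which makes bugs faster to find and locate.

My goal is to answer two questions: in what order are registers pushed when a function is called in different execution states, and how is space for local variables allocated on the stack? To answer them I read a lot of material and used a number of tools.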

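2. ELF Files and Related Tools

Before tackling these questions we need some background knowledge and a few tools. The ELF file format, and the tools that operate on ELF files, are indispensable.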

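2.1 ELF Files

First, a brief look at ELF files. When we develop ARM Cortex-M firmware with Keil, the executable produced by the build has the .axf extension. An .axf file is in fact an ELF file. ELF is the executable format used on Linux and by most embedded toolchains; Windows executables are an exception, since they use the PE format rather than ELF.

An ELF file stores the program's binary machine code, symbol table, debug information and more in a well-defined format. From the ELF file we can recover almost everything about the program.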
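To make the "well-defined format" concrete, here is a minimal sketch that decodes a few fields from the start of an ELF image using only the standard library. It assumes a 32-bit little-endian file, which is what a Cortex-M toolchain produces; the sample header bytes and the entry address are made up for illustration and do not come from a real .axf file.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_ARM = 40  # e_machine value assigned to ARM in the ELF specification


def parse_elf_header(data):
    """Decode a few fields from the start of an ELF32 little-endian image."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    if ei_class != 1 or ei_data != 1:
        raise ValueError("this sketch only handles ELF32 little-endian")
    # e_type, e_machine, e_version and e_entry follow the 16-byte e_ident
    e_type, e_machine, _e_version, e_entry = struct.unpack_from("<HHII", data, 16)
    return {"type": e_type, "machine": e_machine, "entry": e_entry}


# Hypothetical header, built by hand: ELF32, little-endian, ET_EXEC (2), ARM.
sample = (ELF_MAGIC + bytes([1, 1, 1]) + bytes(9)
          + struct.pack("<HHII", 2, EM_ARM, 1, 0x000002C1))
header = parse_elf_header(sample)
print(header)
```

For a real file, the same function could be fed the first 52 bytes read from the .axf in binary mode. Everything beyond the header (sections, symbols, DWARF) is better left to a library.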

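2.2 ELF Parsing Tools

There are many ELF parsing tools available, but I prefer to script things myself, so I use Python together with the pyelftools library (eliben/pyelftools: Parsing ELF and DWARF in Python (github.com)) to process and parse ELF files. On Linux, readelf is also a good tool.

The library ships with many useful examples, and most common tasks can be built from them with small changes. One such task, which we need here, is mapping a program address to a function name.

The library can be installed with: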

pip install pyelftools
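The address-to-function-name lookup mentioned above comes down to finding the address range that contains a given address. The sketch below shows that idea on a small hand-built table instead of real DWARF data. The two ranges are read off the disassembly listings shown later in this article; the helper name function_for_address is an assumption of this sketch, not part of pyelftools.

```python
import bisect

# (low_pc, high_pc, name), sorted by low_pc; high_pc is exclusive.
# Ranges taken from the listings in this article, not from a parsed ELF file.
FUNCTIONS = [
    (0x73FE, 0x740C, "nrf_atomic_flag_set"),
    (0x740C, 0x7422, "nrf_atomic_u32_add"),
]
STARTS = [low for low, _, _ in FUNCTIONS]


def function_for_address(addr):
    """Return the name of the function whose range contains addr, or None."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    low, high, name = FUNCTIONS[i]
    return name if low <= addr < high else None


print(function_for_address(0x7406))
```

With real firmware the table would be filled from the DW_AT_low_pc and DW_AT_high_pc attributes of each subprogram entry, which is what the full script later in this article walks through on every lookup.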

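3. How the Stack Changes on a Function Call

ARM Cortex-M processors have no dedicated register that marks the boundary of each stack frame, and to save resources the number of registers a function pushes depends on how that function is implemented. Both facts make stack backtracing harder on these processors. Even so, once we understand the pattern functions follow when they push registers, we can backtrace and decode the stack.

Note: the same C source compiles to different assembly under different compilers, so all assembly listings in this article were produced by the ARMCC compiler.
The analysis here targets bare-metal programs without an operating system; with an operating system there may be several separate stacks.

Let us look at the disassembly of a few functions and work out the pattern:

We mainly care about the PUSH instructions and any operation on SP.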

uint32_t nrf_atomic_flag_set(nrf_atomic_flag_t * p_data)
nrf_atomic_flag_set
    0x000073fe:    b510        ..      PUSH     {r4,lr}
    0x00007400:    4604        .F      MOV      r4,r0
    0x00007402:    2101        .!      MOVS     r1,#1
    0x00007404:    4620         F      MOV      r0,r4
    0x00007406:    f000f822    ..".    BL       nrf_atomic_u32_or ; 0x744e
    0x0000740a:    bd10        ..      POP      {r4,pc}

This function takes a parameter and returns a value but uses no local variables. A function of this kind pushes only two registers.
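The opcode column of the listing already contains the pushed register list. As a small illustration, the sketch below decodes the 16-bit Thumb PUSH encoding (bit pattern 1011 010 M followed by an 8-bit mask for r0-r7, where M means "also push lr"). It only covers this 16-bit form; the 32-bit PUSH used for r8 and above is not handled. The function name is an assumption of this sketch.

```python
def decode_thumb_push(halfword):
    """Return the register names of a 16-bit Thumb PUSH, or None if not a PUSH."""
    # 16-bit encoding: 1011 010 M rrrrrrrr (M = push lr, r = bitmask for r0-r7)
    if (halfword & 0xFE00) != 0xB400:
        return None
    regs = [f"r{n}" for n in range(8) if halfword & (1 << n)]
    if halfword & 0x0100:
        regs.append("lr")
    return regs


print(decode_thumb_push(0xB510))  # opcode of PUSH {r4,lr} in the listing above
```

The frame contributed by such a PUSH is simply four bytes per decoded register.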

uint32_t nrf_atomic_u32_add(nrf_atomic_u32_t * p_data, uint32_t value)
{
    uint32_t old_val;
    uint32_t new_val;

    NRF_ATOMIC_OP(add, old_val, new_val, p_data, value);
    UNUSED_PARAMETER(old_val);
    UNUSED_PARAMETER(new_val);
    return new_val;
}

nrf_atomic_u32_add
    0x0000740c:    b5f8        ..      PUSH     {r3-r7,lr}
    0x0000740e:    4604        .F      MOV      r4,r0
    0x00007410:    460d        .F      MOV      r5,r1
    0x00007412:    466a        jF      MOV      r2,sp
    0x00007414:    4629        )F      MOV      r1,r5
    0x00007416:    4620         F      MOV      r0,r4
    0x00007418:    f7f8ff31    ..1.    BL       __asm___12_nrf_atomic_c_85ca2469__nrf_atomic_internal_add ; 0x27e
    0x0000741c:    4606        .F      MOV      r6,r0
    0x0000741e:    9800        ..      LDR      r0,[sp,#0]
    0x00007420:    bdf8        ..      POP      {r3-r7,pc}

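This function takes parameters, returns a value and has two local variables. More registers are pushed this time. r4-r7 are saved so that the function can use them as working registers, and pushing r3 as well appears to serve only to reserve a 4-byte stack slot for a local variable (note the MOV r2,sp and LDR r0,[sp,#0], which pass and then read that slot).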

nrf_cli_fprintf
    0x00007f6c:    b40f        ..      PUSH     {r0-r3}
    0x00007f6e:    b57c        |.      PUSH     {r2-r6,lr}
    0x00007f70:    4604        .F      MOV      r4,r0
    0x00007f72:    460d        .F      MOV      r5,r1
    0x00007f74:    9808        ..      LDR      r0,[sp,#0x20]
    0x00007f76:    b920         .      CBNZ     r0,0x7f82 ; nrf_cli_fprintf + 22
    0x00007f78:    a120         .      ADR      r1,{pc}+0x84 ; 0x7ffc
    0x00007f7a:    f640305f    @._0    MOV      r0,#0xb5f

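A function may push more than once. Here the first PUSH {r0-r3} is most likely there because the function is variadic and spills its register arguments to the stack; in that case lr sits below those four words rather than at the top of everything the function pushed.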

nrf_cli_help_print
    0x00008024:    e92d4ff0    -..O    PUSH     {r4-r11,lr}
    0x00008028:    b089        ..      SUB      sp,sp,#0x24
    0x0000802a:    4606        .F      MOV      r6,r0
    0x0000802c:    460c        .F      MOV      r4,r1
    0x0000802e:    4691        .F      MOV      r9,r2
    0x00008030:    b926        &.      CBNZ     r6,0x803c ; nrf_cli_help_print + 24
    0x00008032:    a1b6        ..      ADR      r1,{pc}+0x2da ; 0x830c
    0x00008034:    f44f603e    O.>`    MOV      r0,#0xbe0
    0x00008038:    f7fdf92e    ....    BL       assert_nrf_callback ; 0x5298
    0x0000803c:    68b0        .h      LDR      r0,[r6,#8]
    0x0000803e:    b118        ..      CBZ      r0,0x8048 ; nrf_cli_help_print + 36

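When the local variables need more space than the registers can provide, the compiler emits a SUB instruction on the stack pointer to reserve stack space for them. This function reserves 0x24 bytes for its locals.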
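The total amount by which a prologue moves SP can be worked out from the listing text alone: four bytes per pushed register plus the immediate of any SUB on sp. The following is a rough sketch of that arithmetic, assuming listing lines in the format shown above; the function name frame_size is an assumption, and the regular expressions only cover the PUSH {...} and SUB sp,sp,#imm forms seen in this article.

```python
import re


def frame_size(prologue_lines):
    """Bytes by which the prologue lowers SP, from its PUSH and SUB sp lines."""
    size = 0
    for line in prologue_lines:
        push = re.search(r"PUSH\s+\{([^}]*)\}", line)
        if push:
            for item in push.group(1).split(","):
                item = item.strip()
                if "-" in item:
                    # A range such as r4-r11 counts every register in it
                    first, last = item.split("-")
                    size += 4 * (int(last[1:]) - int(first[1:]) + 1)
                else:
                    size += 4
            continue
        sub = re.search(r"SUB\s+sp,sp,#(\w+)", line)
        if sub:
            size += int(sub.group(1), 0)  # base 0 accepts both 0x24 and 36
    return size


help_print = [
    "0x00008024:    e92d4ff0    -..O    PUSH     {r4-r11,lr}",
    "0x00008028:    b089        ..      SUB      sp,sp,#0x24",
]
print(frame_size(help_print))
```

For nrf_cli_help_print this gives 9 registers (36 bytes) plus 0x24 bytes of locals, 72 bytes in total, so once the prologue has run the saved lr sits at SP + 68.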

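To summarize: right after entering a function, the code usually saves state with PUSH. What gets saved is normally register values, and the last register in the list is normally lr; because PUSH stores lower-numbered registers at lower addresses, lr ends up at the highest address of the pushed block. The number of registers pushed is not fixed; it depends on the function's parameters, return value and the size of its local variables. For a function whose locals do not fit in registers, the compiler subtracts from SP with a SUB instruction to reserve stack space for them.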

If all we want from the stack is the call hierarchy, we only need the lr values saved in it.
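One detail matters when using a saved lr: Cortex-M code always runs in Thumb state, so bit 0 of the saved value is set and has to be cleared before the value is looked up as a code address. A tiny sketch, with a hypothetical helper name:

```python
def return_address(saved_lr):
    """Clear the Thumb state bit to get the address execution returns to."""
    return saved_lr & ~1


# In the nrf_atomic_flag_set listing, the BL at 0x7406 is 4 bytes long, so the
# callee's saved lr would be 0x740b and the return address 0x740a.
print(hex(return_address(0x740B)))
```

The full script later in this article does the same thing by subtracting 1 from lr.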

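4. Interrupt Handlers

A special case arises when the CPU is currently executing an interrupt handler. Before entering the handler, an ARM Cortex-M processor automatically stacks R0, R1, R2, R3, R12, LR, PC and xPSR (eight words, 32 bytes, with R0 at the lowest address). This case needs special handling.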
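As an illustration of that layout, the sketch below names the eight words of a basic exception frame and checks whether an lr value is an EXC_RETURN code rather than an ordinary return address. It assumes the basic frame without floating-point context, and the sample register values are invented. The helper names are assumptions of this sketch.

```python
# Order of the automatically stacked words, lowest address first.
EXCEPTION_FRAME = ("r0", "r1", "r2", "r3", "r12", "lr", "pc", "xpsr")


def decode_exception_frame(words):
    """Map the 8 words read from the frame base to register names."""
    if len(words) != len(EXCEPTION_FRAME):
        raise ValueError("a basic exception frame has exactly 8 words")
    return dict(zip(EXCEPTION_FRAME, words))


def is_exc_return(lr):
    """EXC_RETURN codes such as 0xFFFFFFF9 have all of the top byte set."""
    return (lr & 0xFF000000) == 0xFF000000


# Invented sample data: the interrupted code was executing at 0x7402.
words = [1, 2, 3, 4, 12, 0x0000740B, 0x00007402, 0x01000000]
frame = decode_exception_frame(words)
pc_offset = EXCEPTION_FRAME.index("pc") * 4
print(hex(frame["pc"]), pc_offset)
```

The interrupted location is the stacked pc, 24 bytes above the frame base, or equivalently 8 bytes below the end of the 32-byte frame.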

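5. Stack Backtracing with a Python Script

There is a Python library (pylink) that can drive a J-Link to debug the microcontroller, which makes it easy to read the core registers and stack memory and also to control the target. I wrote the backtrace script on top of it.

The previous sections showed that by analysing the firmware's disassembly together with the current register values we can backtrace the stack on ARM Cortex-M microcontrollers. The procedure is:

1. Read the current sp, pc, lr and xPSR.
2. Use the ELF tooling to map pc to a function name.
3. Look the function name up in the firmware's asm file to get its push information.
4. Combine the push information with the current sp to find where lr is saved on the stack, then update sp.
5. Repeat steps 2-4, using the saved lr as the new pc, until the bottom of the stack is reached.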
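The loop in these steps can be tried without any hardware. The sketch below runs it against a simulated stack; every address, function name and frame size in it is hypothetical. It keeps the offset of the saved lr separate from the total frame size, which is an assumption of this sketch intended to also cover functions that push {r0-r3} before the PUSH that contains lr.

```python
# Hypothetical function table: name -> (low_pc, high_pc, frame_bytes, lr_offset)
FUNCS = {
    "main":   (0x1000, 0x1040, 8, 4),     # PUSH {r4,lr}
    "middle": (0x1040, 0x1080, 24, 20),   # PUSH {r3-r7,lr}
    "leaf":   (0x1080, 0x10C0, 72, 68),   # PUSH {r4-r11,lr}; SUB sp,sp,#0x24
}

# Simulated stack contents: address -> word. Only the saved lr slots matter.
STACK = {
    0x20000FFC: 0x00000801,  # main's saved lr: returns to startup code
    0x20000FF4: 0x00001011,  # middle's saved lr: returns into main
    0x20000FDC: 0x00001051,  # leaf's saved lr: returns into middle
}


def find_function(addr):
    """Step 2: map a code address to a function name."""
    for name, (low, high, _, _) in FUNCS.items():
        if low <= addr < high:
            return name
    return None


def backtrace(pc, sp, read_word):
    """Steps 2-5: walk from the current frame towards the bottom of the stack."""
    names = []
    while True:
        name = find_function(pc)
        if name is None:          # left the known code: stop
            break
        names.append(name)
        _, _, frame_bytes, lr_offset = FUNCS[name]   # step 3
        saved_lr = read_word(sp + lr_offset)         # step 4
        pc = saved_lr & ~1                           # clear the Thumb bit
        sp += frame_bytes
    return names


# Step 1 would read these from the target; here they are made up.
trace = backtrace(pc=0x1090, sp=0x20000F98, read_word=lambda addr: STACK[addr])
print(trace)
```

On real hardware, read_word would be replaced by a memory read through the debug probe, and the table by information extracted from the ELF file and the disassembly.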

Here is the Python implementation:

from elftools.elf.elffile import ELFFile
from elftools.dwarf.descriptions import describe_form_class
import subprocess
import argparse
import pylink
import json
import sys
import os
import re

VERSION = '1.0.0'

# Source file extensions kept when searching the project tree
SOURCE_SUFFIXES = ('.c', '.cc', '.cpp', '.h')


def decode_funcname(dwarfinfo, address):
    # Go over all DIEs in the DWARF information, looking for a subprogram
    # entry with an address range that includes the given address. Note that
    # this simplifies things by disregarding subprograms that may have
    # split address ranges.
    for CU in dwarfinfo.iter_CUs():
        for DIE in CU.iter_DIEs():
            try:
                if DIE.tag == 'DW_TAG_subprogram':
                    lowpc = DIE.attributes['DW_AT_low_pc'].value

                    # DWARF v4 in section 2.17 describes how to interpret the
                    # DW_AT_high_pc attribute based on the class of its form.
                    # For class 'address' it's taken as an absolute address
                    # (similarly to DW_AT_low_pc); for class 'constant', it's
                    # an offset from DW_AT_low_pc.
                    highpc_attr = DIE.attributes['DW_AT_high_pc']
                    highpc_attr_class = describe_form_class(highpc_attr.form)
                    if highpc_attr_class == 'address':
                        highpc = highpc_attr.value
                    elif highpc_attr_class == 'constant':
                        highpc = lowpc + highpc_attr.value
                    else:
                        print('Error: invalid DW_AT_high_pc class:',
                              highpc_attr_class)
                        continue

                    if lowpc <= address < highpc:
                        return DIE.attributes['DW_AT_name'].value
            except KeyError:
                continue
    return None


def decode_file_line(dwarfinfo, address):
    # Go over all the line programs in the DWARF information, looking for
    # one that describes the given address.
    for CU in dwarfinfo.iter_CUs():
        # First, look at line programs to find the file/line for the address
        lineprog = dwarfinfo.line_program_for_CU(CU)
        if lineprog is None:
            continue
        delta = 1 if lineprog.header.version < 5 else 0
        prevstate = None
        for entry in lineprog.get_entries():
            # We're interested in those entries where a new state is assigned
            if entry.state is None:
                continue
            # Looking for a range of addresses in two consecutive states that
            # contain the required address.
            if prevstate and prevstate.address <= address < entry.state.address:
                filename = lineprog['file_entry'][prevstate.file - delta].name
                line = prevstate.line
                return filename, line
            if entry.state.end_sequence:
                # For the state with `end_sequence`, `address` means the address
                # of the first byte after the target machine instruction
                # sequence and other information is meaningless. We clear
                # prevstate so that it's not used in the next iteration. Address
                # info is used in the above comparison to see if we need to use
                # the line information for the prevstate.
                prevstate = None
            else:
                prevstate = entry.state
    return None, None


def get_function_info_by_addr(dwarfinfo, addr, path):
    '''
    Look up the function name, source file and line number for an address,
    and print a string that Keil can jump to from the Build Output window.
    '''
    patten = re.compile('^[A-Fa-f0-9]+$')
    patten0x = re.compile('0x[0-9a-fA-F]+')
    if patten.match(addr) or patten0x.match(addr):
        funcname = decode_funcname(dwarfinfo, int(addr, base=16))
        file, line = decode_file_line(dwarfinfo, int(addr, base=16))
        filename = file.decode()
        # print(file.decode(), '(', line, ')', ':', funcname.decode())
        if path is None or path == '':
            print(filename, '(', line, ')', ':', funcname.decode())
        else:
            # Search the given path for the file and print the matches
            result = []
            for root, lists, files in os.walk(path):
                for file in files:
                    if filename in file:
                        write = os.path.join(root, file)
                        # Keep only C and C++ source and header files
                        if write.endswith(SOURCE_SUFFIXES):
                            result.append(write)
            for subpath in result:
                print(subpath, '({:d})'.format(line), ':', funcname.decode())


def funtion_push_info(asmpath, funcname):
    '''
    Get a function's push information from the asm file by function name.
    asmpath: asm file generated from the ELF file
    funcname: function name
    ret: the number of bytes the prologue lowers SP by, and the function address
    '''
    push_str = []
    # Open the asm file
    with open(asmpath, 'r') as asmf:
        # Read the file line by line and parse it
        lines = asmf.readlines()
        match_index = False
        for index in range(len(lines)):
            # Find the function
            if lines[index].strip() == funcname:
                cunter = 0
                while index < len(lines):
                    if lines[index].find('PUSH') > 0:
                        push_str.append(lines[index])
                        match_index = True
                    # A function with local variables reserves stack space for
                    # them after pushing, by subtracting a value from SP with
                    # SUB. So when we walk the stack we see the local
                    # variables first.
                    elif lines[index].find('SUB') > 0:
                        push_str.append(lines[index])
                    else:
                        if match_index:
                            break
                    index = index + 1
                    cunter = cunter + 1
                    if (not match_index) and cunter >= 2:
                        break
                break

    if len(push_str) <= 0:
        return 0, 0

    # Get the function address
    funcaddr = push_str[0].strip().split(':')[0]
    funcaddr = int(funcaddr, base=16)

    # Parse the push information
    reglist = []
    subw = 0
    for line in push_str:
        # For a PUSH line, extract the register list
        if line.find('PUSH') > 0:
            start = line.find('{')
            end = line.find('}')
            line = line[start+1:end]
            reglist.extend(line.split(','))
            continue
        sub_start = line.find('SUB')
        if sub_start > 0:
            # A SUB line does not necessarily operate on sp
            line = line[sub_start:]
            splist = line.split(',')
            # Make sure it operates on sp, then extract the offset
            if len(splist) > 1 and splist[1] == 'sp':
                offset = line.split('#')[1].rstrip()
                if offset[0:2] == '0x':
                    subw = subw + int(offset, base=16)
                else:
                    subw = subw + int(offset, base=10)

    reg_num = 0
    for reg in reglist:
        if reg.find('-') > 0:
            tmp_list = reg.split('-')
            reg_num = reg_num + (int(tmp_list[1][1:]) - int(tmp_list[0][1:]) + 1)
        else:
            reg_num = reg_num + 1

    # Note: the caller assumes lr is the topmost word of this area. That does
    # not hold for a function that pushes {r0-r3} before the PUSH holding lr.
    return (reg_num*4 + subw), funcaddr


def vector_from_asm(asmpath):
    '''
    Read the vector table from the asm file.
    '''
    vector = []
    with open(asmpath, 'r') as asmf:
        lines = asmf.readlines()
        flag = False
        for line in lines:
            if flag:
                if line.strip() == '$t':
                    break
                if line.find('DCD') > 0:
                    tmp = line.split('DCD')[1].strip()
                    vector.append(int(tmp, base=10))
            else:
                if line.strip() == 'RESET':
                    flag = True
                    continue
    return vector


def stack_backtrace(jlink, asmpath, dwarfinfo):
    '''
    Backtrace the stack.
    Registers read from the target: [sp, lr, pc, xPSR]
    '''
    # Get the vector table
    vector = vector_from_asm(asmpath)
    # sp, lr, pc, xPSR (use the jlink argument, not the global `link`)
    regs = jlink.register_read_multiple([13, 14, 15, 16])
    result = []
    # Get the current function name from pc
    funcname = decode_funcname(dwarfinfo, regs[2])
    file, line = decode_file_line(dwarfinfo, regs[2])
    result.append([regs[2], file.decode(), line, funcname.decode()])

    sp = regs[0]
    # IPSR is the low 9 bits of xPSR
    ipsr = regs[3] & 0x1FF
    # Walk the stack
    while True:
        offset, funcaddr = funtion_push_info(asmpath, funcname.decode())
        if offset == 0:
            break
        # If ipsr is greater than 0, pc is inside an interrupt handler.
        # An ARM Cortex-M processor automatically stacks
        # R0, R1, R2, R3, R12, LR, PC, xPSR
        # before entering the handler, so this case is handled separately:
        # skip the 32-byte exception frame and use the stacked PC.
        if ipsr > 0 and funcaddr == vector[ipsr]-1:
            sp = sp + offset + 32
            lr = jlink.memory_read32(sp-8, 1)[0]
            funcname = decode_funcname(dwarfinfo, lr)
            file, line = decode_file_line(dwarfinfo, lr)
        else:
            sp = sp + offset
            lr = jlink.memory_read32(sp-4, 1)[0]
            funcname = decode_funcname(dwarfinfo, lr - 1)
            file, line = decode_file_line(dwarfinfo, lr - 1)

        if not funcname:
            break
        result.append([lr-1, file.decode(), line, funcname.decode()])
    return result


if __name__ == '__main__':
    parse = argparse.ArgumentParser(
        description='Locate the function name, file name and line number for an address')
    parse.add_argument('-v', '--version', action='version', version=VERSION,
                       help='Show version and exit.')
    parse.add_argument('-addr', help='Function address')
    parse.add_argument('-json', help='Path to the JSON configuration file')
    args = parse.parse_args()

    # Check the arguments
    if args.json is None:
        parse.print_help()
        sys.exit(0)

    # Read the configuration from the JSON file
    config = None
    try:
        with open(args.json, 'r') as jf:
            config = json.load(jf)
    except (OSError, ValueError):
        parse.print_help()
        sys.exit(0)

    elfpath = config.get('Elfpath')
    projectpath = config.get('Projectpath')
    if not (elfpath and projectpath):
        parse.print_help()
        sys.exit(0)

    elfpath = projectpath + elfpath

    with open(elfpath, 'rb') as f:
        elffile = ELFFile(f)
        if not elffile.has_dwarf_info():
            print('file has no DWARF info')
        else:
            dwarfinfo = elffile.get_dwarf_info()
            mode = config.get('Mode')
            if not mode:
                sys.exit(0)

            if mode == 'funcname':
                if args.addr is not None:
                    get_function_info_by_addr(dwarfinfo, args.addr, projectpath)

            if mode == 'stack':
                keilpath = config.get('Keilpath')
                device = config.get('Device')
                if not (keilpath and device):
                    sys.exit(0)

                asmpath = projectpath + '\\firmware.asm'
                cmd = [keilpath + 'fromelf.exe', '--text', '-c', '-o', asmpath, elfpath]
                # Wait for fromelf to finish so the asm file exists before it is read
                subprocess.run(cmd, check=True)

                # Get the J-Link serial number
                obj = subprocess.Popen(['JLink'], shell=True,
                                       stdout=subprocess.PIPE, stdin=subprocess.PIPE)
                # time.sleep(10)
                obj.stdin.write('q\n'.encode())
                out, err = obj.communicate()
                sn = ''
                for line in out.splitlines():
                    text = line.decode()
                    if text.find('S/N') >= 0:
                        sn = text.split(':')[1].strip()
                        break

                # Connect to the J-Link
                link = pylink.JLink()
                link.open(sn)
                link.set_tif(pylink.enums.JLinkInterfaces.SWD)
                link.connect(device)
                ok = link.halt()

                result = stack_backtrace(link, asmpath, dwarfinfo)
                out_result = []
                for item in result:
                    filename = item[1].split('\\')[-1].rstrip()
                    for root, lists, files in os.walk(projectpath):
                        for file in files:
                            if filename in file:
                                write = os.path.join(root, file)
                                # Keep only C and C++ source and header files
                                if write.endswith(SOURCE_SUFFIXES):
                                    out_result.append([write, item])

                # Print the stack frames
                print('backtrace:')
                for subpath in out_result:
                    print(subpath[0], '({:d})'.format(subpath[1][2]), ':', subpath[1][3])

                # Let the program continue running
                link.restart()
                os.remove(asmpath)
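The script is driven by a JSON file, and the keys it expects can be read off its config.get(...) calls: Projectpath, Elfpath, Mode, Keilpath and Device. The sketch below builds one possible configuration and checks it for missing keys. All values are hypothetical examples, and the missing_keys helper is not part of the script. Because the script concatenates paths as plain strings, Elfpath is written here with a leading backslash and Keilpath with a trailing one.

```python
import json

# Hypothetical values; only the key names come from the script.
config = {
    "Projectpath": "C:\\work\\demo_project",
    "Elfpath": "\\Objects\\demo.axf",             # appended to Projectpath
    "Keilpath": "C:\\Keil_v5\\ARM\\ARMCC\\bin\\",  # 'fromelf.exe' is appended
    "Device": "nRF52832_xxAA",
    "Mode": "stack",                               # or "funcname"
}


def missing_keys(cfg):
    """List the keys the script would find missing for the selected mode."""
    needed = ["Elfpath", "Projectpath", "Mode"]
    if cfg.get("Mode") == "stack":
        needed += ["Keilpath", "Device"]
    return [key for key in needed if not cfg.get(key)]


text = json.dumps(config, indent=4)
print(text)
print(missing_keys(config))
```

Saving the printed JSON to a file and passing its path with -json selects stack mode; with Mode set to funcname, the -addr argument supplies the address to look up.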