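Source code based on: Linux 5.4
0. Preface
A computer has no notion of symbols, only 0s and 1s, but people find symbols such as function names and variable names far easier to work with. The Linux kernel records its symbol information in System.map, known as the kernel symbol table; every kernel build generates a new System.map to match. When the kernel hits an error at run time, the symbol table in System.map makes it possible to look up the variable or function name that corresponds to a given address.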
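On Android, the kernel symbol table can be viewed through two files:
- out/target/product/shift/obj/kernel/shift-5.4/System.map, generated at build time;
- /proc/kallsyms on the device, which also contains the symbols of dynamically loaded kernel modules;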
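The address-to-name lookup described in the preface can be sketched as follows. This is a minimal illustration, not how the kernel itself does it: the sample lines are copied from the System.map excerpt in this article, while the helper names `load_symbols` and `resolve` are hypothetical. It assumes each line has the form `address type name` and that an address belongs to the nearest symbol at or below it.

```python
import bisect

# A few lines in System.map format (address, type, name), taken from the
# excerpt in this article.
SAMPLE = """\
ffffffc010080000 T _text
ffffffc010080040 t pe_header
ffffffc010080044 t coff_header
ffffffc010081000 T _stext
ffffffc011380000 T _etext
"""


def load_symbols(text):
    """Parse 'address type name' lines into a list sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[1], parts[2]))
    symbols.sort(key=lambda s: s[0])
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) of the nearest symbol at or below addr, or None."""
    addresses = [s[0] for s in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # addr lies below the first known symbol
    base, _type, name = symbols[i]
    return name, addr - base


symbols = load_symbols(SAMPLE)
hit = resolve(symbols, 0xffffffc010080042)
if hit is not None:
    print("%s+0x%x" % hit)
```

With a real System.map, the same functions could be fed the file contents instead of `SAMPLE`. When several symbols share one address, this sketch simply reports the last one listed.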
1. Contents of System.map
0000000000000000 A __rela_size
0000000000000000 A _kernel_flags_le_hi32
0000000000000000 A _kernel_offset_le_hi32
...
0000000003ca6000 A _kernel_size_le_lo32
ffffffc010080000 t $x
ffffffc010080000 t __efistub__text
ffffffc010080000 t _head
ffffffc010080000 T _text //-----the code section starts at _text (ffffffc010080000)
ffffffc010080040 t pe_header
ffffffc010080044 t coff_header
...
ffffffc010081000 t $x.1778
ffffffc010081000 T __exception_text_start
ffffffc010081000 T _stext //-----_stext (ffffffc010081000), a marker symbol inside the code section
ffffffc010081000 t efi_header_end
ffffffc010081000 t sun4i_handle_irq
...
ffffffc011380000 D __entry_tramp_data_start
ffffffc011380000 d __entry_tramp_data_vectors
ffffffc011380000 D __start_rodata
ffffffc011380000 T _etext //-----the code section ends at _etext (ffffffc011380000)
ffffffc011380008 d __entry_tramp_data_this_cpu_vector //----initialized data begins
ffffffc011381000 D vdso_start
ffffffc011382000 D vdso32_start
ffffffc011382000 D vdso_end
ffffffc011383000 D vdso32_end
...
ffffffc0120beb88 r __pci_fixup_qcom_fixup_class1536 //----read-only (constant) data begins
ffffffc0120beb88 R __start_pci_fixups_early
ffffffc0120beba0 r __pci_fixup_qcom_fixup_class1537
...
ffffffc0127a3200 D __bss_start //----end of the kernel's initialized data; uninitialized data follows
ffffffc0127a3200 d __efistub__edata
ffffffc0127a3200 D _edata //----end of the kernel's initialized data; uninitialized data follows
...
ffffffc013d26000 b __efistub__end
ffffffc013d26000 B _end //----end of the kernel's uninitialized data
ffffffc013d26000 B init_pg_end
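System.map is produced by running nm on vmlinux and redirecting the output to a file. In the kernel tree this is done by scripts/mksysmap, which runs nm -n (sorted by address) and filters out a few kinds of symbols.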
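Because the marker symbols annotated in the excerpt delimit the regions of the image, subtracting their addresses gives a rough size for each region. The sketch below does that arithmetic with the four marker addresses shown above; the region labels follow this article's annotations and are an approximation, and the helper names `parse_markers` and `span` are hypothetical.

```python
# Marker symbols and addresses copied from the System.map excerpt above.
MARKERS = """\
ffffffc010080000 T _text
ffffffc011380000 T _etext
ffffffc0127a3200 D _edata
ffffffc013d26000 B _end
"""


def parse_markers(text):
    """Map each symbol name to its address as an integer."""
    table = {}
    for line in text.splitlines():
        addr, _type, name = line.split()
        table[name] = int(addr, 16)
    return table


def span(table, start, end):
    """Number of bytes between two marker symbols."""
    return table[end] - table[start]


table = parse_markers(MARKERS)
regions = [
    ("code (_text to _etext)", "_text", "_etext"),
    ("initialized and read-only data (_etext to _edata)", "_etext", "_edata"),
    ("uninitialized data (_edata to _end)", "_edata", "_end"),
]
for label, start, end in regions:
    size = span(table, start, end)
    print(f"{label}: {size:#x} bytes ({size / (1 << 20):.1f} MiB)")
```

For example, `_etext - _text` is `0x1300000` bytes, which is 19 MiB for this particular build.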
2. Contents of kallsyms
An excerpt of /proc/kallsyms:
130|shift:/ # cat /proc/kallsyms
0000000000000000 t _head
0000000000000000 T _text
0000000000000000 t pe_header
0000000000000000 t coff_header
0000000000000000 t optional_header
0000000000000000 t extra_header_fields
0000000000000000 t section_table
0000000000000000 t efi_header_end
0000000000000000 t sun4i_handle_irq
0000000000000000 T _stext
0000000000000000 T __exception_text_start
0000000000000000 t gic_handle_irq
0000000000000000 T do_undefinstr
0000000000000000 T do_cp15instr
0000000000000000 T do_sysinstr
0000000000000000 T do_mem_abort
0000000000000000 T do_el0_irq_bp_hardening
0000000000000000 T do_el0_ia_bp_hardening
0000000000000000 T do_sp_pc_abort
0000000000000000 T do_debug_exception
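/proc/kallsyms also covers the symbols of dynamically loaded kernel modules. For the kernel image itself, the build extracts the addresses of all functions the kernel uses (and of data symbols as well when CONFIG_KALLSYMS_ALL is enabled), packs them into a data blob, and links that blob into the kernel image as read-only data. The addresses in the excerpt above all read as zero because the kernel is hiding them from the reader, which is typically governed by the kernel.kptr_restrict sysctl.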
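3. Differences between the two files
Difference 1:
- /proc/kallsyms contains the symbols of dynamically loaded modules as well as the symbol table of the static code (kernel image);
- System.map contains only the symbol table of the static code (kernel image);
The running kernel may not match a given System.map, so /proc/kallsyms is the primary reference for kernel symbols, and symbol addresses should be taken from /proc/kallsyms.
Difference 2:
- System.map is a real file on the file system. Every kernel build generates a new System.map;
- /proc/kallsyms is a "proc file" created dynamically when the kernel boots. It is not a real file on disk; it is a view of kernel data that is already loaded in memory, so it is always correct for the currently running kernel;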
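The first difference can be made concrete with a small sketch. In /proc/kallsyms, a symbol that belongs to a loadable module carries an extra trailing field with the module name in square brackets, while symbols from the kernel image have no such field. The sample below is hypothetical: the first two names come from the excerpt in section 2, but `demo_init` and `[demo_mod]` are invented for illustration, as are the helper names. The second helper checks for the all-zero addresses seen in that excerpt, which usually means the kernel is hiding them from the reader.

```python
from collections import defaultdict

# Hypothetical /proc/kallsyms-style sample. "_text" and "do_mem_abort" appear
# in the article's excerpt; "demo_init" and "[demo_mod]" are made up.
SAMPLE = (
    "0000000000000000 T _text\n"
    "0000000000000000 T do_mem_abort\n"
    "0000000000000000 t demo_init\t[demo_mod]\n"
)


def split_by_owner(text):
    """Group symbol names by owner: 'vmlinux' or the bracketed module name."""
    owners = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        owner = "vmlinux"
        if len(parts) >= 4 and parts[3].startswith("[") and parts[3].endswith("]"):
            owner = parts[3][1:-1]
        owners[owner].append(parts[2])
    return dict(owners)


def addresses_hidden(text):
    """True if every parsed address is zero (addresses are being masked)."""
    addrs = [int(line.split()[0], 16) for line in text.splitlines() if line.split()]
    return bool(addrs) and all(a == 0 for a in addrs)


print(split_by_owner(SAMPLE))
print("addresses hidden:", addresses_hidden(SAMPLE))
```

On a device, the same functions could be given the text read from /proc/kallsyms; anything grouped under a name other than `vmlinux` would have no counterpart in System.map.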
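4. Symbol table contents
Each entry has three parts:
- symbol address;
- symbol type;
- symbol name;
4.1 Symbol types
A lowercase letter usually means the symbol is local; an uppercase letter means the symbol is global (external).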
type | comment |
---|---|
A | The symbol’s value is absolute, and will not be changed by further linking. |
B/b | The symbol is in the BSS data section. This section typically contains zero-initialized or uninitialized data, although the exact behavior is system dependent. |
C | The symbol is common. Common symbols are uninitialized data. When linking, multiple common symbols may appear with the same name. If the symbol is defined anywhere, the common symbols are treated as undefined references. |
D/d | The symbol is in the initialized data section. |
G/g | The symbol is in an initialized data section for small objects. Some object file formats permit more efficient access to small data objects, such as a global int variable as opposed to a large global array. |
i | For ELF format files this indicates that the symbol is an indirect function. This is a GNU extension to the standard set of ELF symbol types. It indicates a symbol which if referenced by a relocation does not evaluate to its address, but instead must be invoked at runtime. The runtime execution will then return the value to be used in the relocation. |
I | The symbol is an indirect reference to another symbol. |
N | The symbol is a debugging symbol. |
p | The symbol is in a stack unwind section. |
R/r | The symbol is in a read-only data section. |
S/s | The symbol is in an uninitialized or zero-initialized data section for small objects. |
T/t | The symbol is in the text (code) section. |
U | The symbol is undefined. |
u | The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use. |
V/v | The symbol is a weak object. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the weak symbol becomes zero with no error. On some systems, uppercase indicates that a default value has been specified. |
W/w | The symbol is a weak symbol that has not been specifically tagged as a weak object symbol. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the symbol is determined in a system-specific manner without error. On some systems, uppercase indicates that a default value has been specified. |
- | The symbol is a stabs symbol in an a.out object file. In this case, the next values printed are the stabs other field, the stabs desc field, and the stab type. Stabs symbols are used to hold debugging information. |
? | The symbol type is unknown, or object file format specific. |
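As a quick illustration of how the type column might be interpreted in a script, the sketch below decodes the paired letters that appear in the System.map excerpt earlier in this article. It covers only a subset of the table; the function name `describe` and the wording of the section labels are hypothetical. Letters outside that subset (such as U, u, i, or ?) do not follow the upper/lowercase convention, so the sketch reports no scope for them.

```python
# Section names for the paired type letters used in the article's excerpt.
SECTION_BY_LETTER = {
    "a": "absolute",
    "b": "BSS (uninitialized data)",
    "d": "initialized data",
    "r": "read-only data",
    "t": "text (code)",
}


def describe(type_char):
    """Return (scope, section) for one nm type letter.

    scope is 'global' or 'local' for the paired letters above and None for
    any other letter, which has to be looked up in the table instead.
    """
    section = SECTION_BY_LETTER.get(type_char.lower())
    if section is None:
        return None, "other (see table)"
    scope = "global" if type_char.isupper() else "local"
    return scope, section


for letter in ("T", "t", "D", "b", "U"):
    print(letter, describe(letter))
```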