X64 汇编/AT&T
在 x64 汇编语法的领域内分成两个流派: NASM 和 AT&T. 大多数国人可能更为熟悉 NASM 语法, 因为它源自 8086 CPU, 许多大学都会开设 8086 CPU 的实验课, 我也是在那是第一次接触 NASM. 后者则流行于 Unix/Linux 平台上, 是目前更为主流的语法.
对我而言, 我更为喜欢 NASM 语法, 不过 AT&T 与 NASM 区别也不大. 在两种语法之间切换并不困难. 我们先来看下典型的 AT&T 语法的样子. 我们编译如下的 C 代码:
int main() {
return 0;
}
$ gcc -S -o main.s main.c
-S
命令表示 gcc 在生成汇编代码后停止后续工作. 打开 main.s, 内容如下:
.file "main.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits
AT&T vs NASM (简要区别)
AT&T 语法和 NASM(Netwide Assembler) 语法主要区别如下:
- 语法结构: AT&T 语法使用逗号作为操作数分隔符, 操作数的顺序是目标操作数在前, 源操作数在后. 而 NASM 语法使用逗号作为分隔符, 操作数的顺序是源操作数在前, 目标操作数在后.
例如, 将寄存器 eax 的值加到寄存器 ebx 中, AT&T 语法中的指令是:
addl %eax, %ebx
而在 NASM 语法中的指令则是:
add ebx, eax
- 寄存器表示: AT&T 语法在寄存器名前使用 % 符号, 而 NASM 语法中不使用 % 符号. 例如, 表示 eax 寄存器时, AT&T 语法为 %eax, 而 NASM 语法为 eax.
- 立即数表示: AT&T 语法使用 $ 符号表示立即数, 而 NASM 语法中不使用 $ 符号. 例如, 表示立即数 10 时, AT&T 语法为 $10, 而 NASM 语法为 10.
- 操作数大小: AT&T 语法在操作码后面使用后缀字符来表示操作数的大小, 例如, b 表示字节, w 表示字, l 表示双字. 而 NASM 语法使用关键字来表示操作数的大小, 例如, BYTE 表示字节, WORD 表示字, DWORD 表示双字.
例如, 将立即数 10 存储到寄存器 eax 中, AT&T 语法为:
movl $10, %eax
而在 NASM 语法中的指令则是:
mov eax, 10
AT&T vs NASM (详细区别)
AT&T vs. NASM
There are two main forms of assembly syntax:AT&T and Intel.
AT&T syntax is used by the GNU Assembler(gas), contained in the gcc compiler suite, and is often used by Linux developers.
Of the Intel syntax assemblers, the Netwide Assembler(NASM) is the most commonly used.
The NSAM format is used by many windows assemblers and debuggers.
The two formats yield exactly the same machine language; however, there are a few differences in style and format:
The source and destination operands are reversed, and different symbols are used to mark the beginning of a comment:
NASM format CMD <dest>, <source><comment>
AT&T format CMD <source>, <dest><#comment>
AT&T format uses a % before registers; NASM does not
AT&T format uses a $ before literal values; NASM does not
AT&T handles memory reference differnetly than NASM
mov
The mov command copies data from the source to the destination. The value is not removed from the source location.
NASM Syntax NASM Example AT&T Example
mov<dest>,<source> mov eax, 51h;comment movl $51, %eax#comment
Data cannot be moved directly from memory to a segment register. Instead, you must use a general purpose register as an intermediate step;
mov eax, 1234h; sotre the value 1234 (hex) into EAX
mov cs, ax; then copy the value of AX into CS.
add and sub
The add command adds the source to the destination and stores the result in the destination.
The sub command subtracts the source form the destionation and stores the result in the destination
NASM Syntax NASM Example AT&T Example
add <dest>, <source> add eax, 51h addl $51h, %eax
sub <dest>, <source> sub eax, 51h subl $51h, %eax
push and pop
The push and pop commands push and pop items from the stack
NASM Syntax NASM Example AT&T Example
push <value> push eax pushl %eax
pop <dest> pop eax popl %eax
xor
The xor command conduts a bitwise logical "exclusive or" (XOR) function. XOR value, value to zero out or clear a register or memory location
NASM Syntax NASM Example AT&T Example
xor <dest>, <source> xor eax, eax xor %eax, %eax
The jne, je, jz, and jmp commands branch the flow of the program to another location based on the value of the eflag "zero flag."
jne/jnz jumps if the "zero flag"=0; je/jz jumps if the "zero flag"=1; and jmp always jumps.
NASM Example NASM Example AT&T Example
jnz <dest> / jne <dest> jne start jne start
jz <dest> / je <dest> jz loop jz loop
jmp <dest> jmp end jmp end
call and ret
The call command calls a procedure (not jumps to a label). The ret command is used at the end of a procedure to return the flow to the command after the call.
NASM Example NASM Example AT&T Example
call <dest> call subroutine1 call subroutine1
ret ret ret
inc and dec
The inc and dec commands increment or decrement the destination, respectively.
NASM Example NASM Example AT&T Example
inc <dest> inc eax incl %eax
dec <dest> dec eax decl %eax
lea
The lea command loads the effective address of the source into the destination
NASM Example NASM Example AT&T Example
lea <dest>, <source> lea eax, [dsi+4] leal 4(%dsi), %eax
int
The int command throws asystem interrupt signal to the processor. The common interrupt you will use is 0x80, which signals a system call to the kernel.
NASM Syntax NASM Example AT&T Example
int <val> int 0x80 int $0x80
Addressing Modes
In assembly, several methods can be used to accomplish the same thing.
In particular, there are many ways to indicate the effective address to manipulate in memory.
These options are called addressing modes.
Register: Registers hold the data to be manipulated / No memory interaction / Both registers must be the same size
NASM Example: mov ebx, edx / add al, ch
Immediate: The source operand is a numerical value / Decimal is assumed; use h for hex
NASM Example: mov eax, 1234h / mov dx, 301
Direct: The first operand is the address of memory to manipulate / It's marked with brackets.
NASM Example: mov bh, 100 / mov [4321h], bh
Register Indirect: The first operand is a regsiter in brackets that holds the address to be manipulated
NASM Example: mov [di], ecx
Based Relative: The effective address to be manipulated is calculated by using ebx or ebp plus an offset value
NASM Example: mov edx, 20[ebx]
Indexed Relative: Same as Based Relative, but edi and esi are used to hold the offset
NASM Example: mov ecx, 20[esi]
Based Indexed-Relative: The effective address is found by combining Based and Indexed Relative modes
NASM Example: mov ax, [bx][si]+1
参考
- [1] Allen Harper. Gray Hat Hacking the Ethical Hacker's Handbook.