Easy addressing on the AMD64

From Spivey's Corner

Consider this C program:

int a[10];
int g(int x) { return a[x]; }

If you've written in 32-bit assembly language for the x86, you can probably work out how the body of g could be compiled. (Let's agree to turn off PIC, even if it's the default on the OS.) According to the calling convention, the argument x arrives on the stack, so we need a load instruction, written movl, to fetch it into a register, and another load with scaling to get the element of a. Unlike a RISC machine, the x86 allows the whole address of a to appear as part of the addressing mode. In the AT&T syntax:

   movl 4(%esp), %eax        8b 44 24 04
   movl a(,%eax,4), %eax     8b 04 85 00 00 00 00
   ret                       c3

Compiling with gcc -S -O2 -fno-pic elides the frame pointer for this leaf routine. For entertainment, we can examine the machine code bytes shown at the right of each instruction. The first movl instruction has

  • an opcode of 8b for load or move-to-register,
  • a ModR/M byte of 44 = 01|000|100 = (1, EAX, 4) that means the result goes in the EAX register, there is an 8-bit displacement, and there is an SIB byte.
  • an SIB byte of 24 = 00|100|100 = (0, 4, ESP) that means the addressing mode is ESP+d8.
  • a displacement d8 of 4.
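
The field splitting used in the bullets above can be sketched in C. This is only an illustration, not a full decoder: the names fields, split_byte, has_sib and disp_size are made up for this example, and the displacement rule covers only what the ModR/M byte decides by itself in 32-bit addressing.

```c
#include <assert.h>
#include <stdint.h>

/* ModR/M and SIB bytes share one layout: a 2-bit field and two 3-bit fields.
   (Hypothetical helper type, named for this sketch only.) */
struct fields {
    unsigned top;  /* bits 7..6: mod (ModR/M) or scale (SIB) */
    unsigned mid;  /* bits 5..3: reg (ModR/M) or index (SIB) */
    unsigned low;  /* bits 2..0: r/m (ModR/M) or base (SIB)  */
};

struct fields split_byte(uint8_t b)
{
    struct fields f;
    f.top = (b >> 6) & 0x3;
    f.mid = (b >> 3) & 0x7;
    f.low = b & 0x7;
    return f;
}

/* In 32-bit addressing, r/m = 4 on a memory operand announces an SIB byte. */
int has_sib(struct fields modrm)
{
    return modrm.top != 3 && modrm.low == 4;
}

/* Displacement size in bytes, as far as the ModR/M byte alone decides it.
   The SIB special case (base = 5 with mod = 0) is not handled here. */
unsigned disp_size(struct fields modrm)
{
    assert(modrm.top != 3);  /* mod 3 names a register, not memory */
    switch (modrm.top) {
    case 1:  return 1;                       /* d8  */
    case 2:  return 4;                       /* d32 */
    default: return modrm.low == 5 ? 4 : 0;  /* mod 0: bare d32 when r/m = 5 */
    }
}
```

Applied to the bytes above, split_byte(0x44) gives (1, 0, 4) and split_byte(0x24) gives (0, 4, 4), with register number 0 standing for EAX and 4 for ESP.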

It seems a shame that eliding the frame pointer makes us use SP-relative addressing, which is awkwardly encoded.

The other movl instruction makes better use of the encoding. It has:

  • an opcode of 8b again.
  • a ModR/M byte of 04 = 00|000|100 = (0, EAX, 4) that puts the result in the EAX register, specifies no displacement by itself, and expects an SIB byte.
  • an SIB byte of 85 = 10|000|101 = (2, EAX, 5) that, together with the 0 in the ModR/M byte, means there is no base register and a 32-bit displacement, so the addressing mode is EAX<<2+d32.
  • a displacement that the linker will fill in with the address of a.
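
To see how the SIB byte turns into an address, here is a rough C model of the calculation for 32-bit addressing. The function name sib_address and the regs array are inventions for this sketch; the caller is assumed to pass the displacement already sign-extended, or 0 when the instruction has none.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address for a ModR/M + SIB memory operand (32-bit addressing).
   regs[] holds the eight general registers in encoding order
   (0 = EAX, 1 = ECX, 2 = EDX, 3 = EBX, 4 = ESP, 5 = EBP, 6 = ESI, 7 = EDI).
   disp is the displacement from the instruction, or 0 if there is none. */
uint32_t sib_address(unsigned mod, uint8_t sib,
                     const uint32_t regs[8], uint32_t disp)
{
    unsigned scale = (sib >> 6) & 0x3;
    unsigned index = (sib >> 3) & 0x7;
    unsigned base  = sib & 0x7;
    uint32_t addr  = disp;

    assert(mod < 3);                 /* mod 3 is a register operand: no SIB */

    if (!(mod == 0 && base == 5))    /* base 5 with mod 0 means "no base, d32" */
        addr += regs[base];
    if (index != 4)                  /* index 4 means "no index register" */
        addr += regs[index] << scale;
    return addr;
}
```

With EAX = 3 and a made-up address of 0x1000 for a, sib_address(0, 0x85, regs, 0x1000) comes out as 0x100c, the address of a[3]. The SP-relative load from the first instruction is the same function with mod = 1, SIB = 0x24 and a displacement of 4.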

So far, so good.

Now, what happens if we compile for the AMD64 = x86_64 = whatever? Here is the code:

    movslq %edi, %rdi
    movl a(,%rdi,4), %eax
    ret

  0:	48 63 ff             	movslq %edi,%rdi
  3:	8b 04 bd 00 00 00 00 	mov    0x0(,%rdi,4),%eax
  a:	c3                   	retq   
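
The new byte in this listing is the 48 in front of the movslq: a REX prefix, with bit layout 0100WRXB. After it, the ModR/M byte ff = 11|111|111 = (3, RDI, RDI) names two registers directly, and the SIB byte bd = 10|111|101 = (2, RDI, 5) in the load has the same shape as before. A small sketch of picking the prefix apart follows; the names rex, is_rex, decode_rex and extend_reg are made up for this example, and the test in is_rex applies only in 64-bit mode, since the same byte values are inc and dec instructions in 32-bit code.

```c
#include <assert.h>
#include <stdint.h>

/* The four flag bits of a REX prefix, 0100WRXB. */
struct rex {
    unsigned w;  /* 1 = 64-bit operand size                  */
    unsigned r;  /* high bit for the ModR/M reg field        */
    unsigned x;  /* high bit for the SIB index field         */
    unsigned b;  /* high bit for the r/m or SIB base field   */
};

/* In 64-bit mode, bytes 0x40..0x4f are REX prefixes. */
int is_rex(uint8_t byte)
{
    return (byte & 0xF0) == 0x40;
}

struct rex decode_rex(uint8_t byte)
{
    struct rex p;
    assert(is_rex(byte));
    p.w = (byte >> 3) & 1;
    p.r = (byte >> 2) & 1;
    p.x = (byte >> 1) & 1;
    p.b = byte & 1;
    return p;
}

/* Combine a REX bit with a 3-bit field to get a register number 0..15. */
unsigned extend_reg(unsigned high_bit, unsigned field)
{
    return (high_bit << 3) | field;
}
```

For 48 this gives W = 1 and R = X = B = 0: a 64-bit operand, with both register fields still referring to the low eight registers, so 7 means RDI. The load that follows needs no prefix at all, because 32-bit operands and 64-bit addresses are the defaults in 64-bit mode.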

Compiled again without -fno-pic:

g:
.LFB0:
    .cfi_startproc
    leaq a(%rip), %rax
    movslq %edi, %rdi
    movl (%rax,%rdi,4), %eax
    ret
    .cfi_endproc

  0:	48 8d 05 00 00 00 00 	lea    0x0(%rip),%rax        # 7 <g+0x7>
  7:	48 63 ff             	movslq %edi,%rdi
  a:	8b 04 b8             	mov    (%rax,%rdi,4),%eax
  d:	c3                   	retq
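
The lea in this version uses RIP-relative addressing: the 32-bit displacement is added to the address of the next instruction, not to the address of the lea itself. That is why the disassembler annotates the unrelocated instruction with 7, the offset of the movslq that follows. A minimal sketch of the calculation, with rip_target as a made-up name:

```c
#include <assert.h>
#include <stdint.h>

/* Target of a RIP-relative operand: the address of the *next* instruction
   plus the sign-extended 32-bit displacement. */
uint64_t rip_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    assert(insn_len >= 1 && insn_len <= 15);  /* x86 instructions are at most 15 bytes */
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For the lea at offset 0, which is 7 bytes long and still has a zero displacement, rip_target(0, 7, 0) is 7, agreeing with the comment in the listing. Once the linker has filled in the displacement, the same sum yields the address of a wherever the code happens to be loaded.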