x86-64 Code Models
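AMD64 usually does not allow an instruction to encode an arbitrary 64-bit constant as an immediate operand. Most instructions accept 32-bit immediates that are sign-extended to 64 bits; the main exception is mov to a register (movabs in AT&T syntax), which can carry a full 64-bit immediate.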
- 32-bit operations with register destinations implicitly perform zero extension.
- Branch instructions accept 32-bit immediate operands that are sign-extended.
This may be due to the following reasons (I could not find the exact reason documented):
- The immediate operands of instructions have not been extended to 64 bits in order to keep instruction size small; instead they remain 32-bit and are sign-extended (Source).
- With 64-bit immediates there would be no room left for a displacement, so instructions like jmp would not work (this is what I got from ChatGPT; I am not sure about it and still have to check this claim).
Several code models therefore exist, so that the compiler can choose the smallest and fastest instruction sequences that still meet a program's size and address-range requirements.
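To make the 32-bit limit concrete, the following C sketch mimics the two extension rules described above and checks whether a constant can be encoded as a sign-extended 32-bit immediate. The function names are illustrative assumptions, not part of any ABI or library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend a 32-bit immediate to 64 bits, as most AMD64 instructions do.
   Hypothetical helper name, used only for illustration. */
int64_t sign_extend_imm32(uint32_t imm) {
    /* Values with bit 31 set represent negative numbers. */
    return (imm & 0x80000000u) ? (int64_t)imm - INT64_C(0x100000000)
                               : (int64_t)imm;
}

/* Zero-extend a 32-bit result, as a 32-bit operation with a register
   destination does. */
uint64_t zero_extend_32(uint32_t value) {
    return (uint64_t)value;
}

/* True if value can be encoded as a sign-extended 32-bit immediate. */
bool fits_simm32(int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

For example, fits_simm32(0x80000000LL) is false, so that constant needs a full 64-bit immediate, while sign_extend_imm32(0x80000000u) yields 0xFFFFFFFF80000000, the start of the upper (negative) 2 GB of the address space.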
Code Models
Code models usually tell the compiler how far the code and data might be from each other. These define constraints for symbolic values that allow the compiler to generate better code.
- These models differ in addressing, code size, data size and address range. To use a particular code model with GCC, compile your program with the -mcmodel= flag (Source).
- At the instruction level, the addressing modes and instruction sequences used for mov and call change with the model.
- The System V ABI describes the following code models: Small, Medium, Large, Kernel, Small PIC, Medium PIC, Large PIC.
| Model | Address Range / Layout | Addressing | 
|---|---|---|
| Small (Default) | Code and data in lower 2 GB of address space | RIP-relative addressing for code and data (within ±2 GB). Pointers are still 64-bit | 
| Kernel | Code and data in upper (negative) 2 GB of address space | RIP-relative (same limits as small model) | 
| Medium | Code and small data in lower 2 GB. Data objects larger than -mlarge-data-threshold go in large data sections such as .ldata, .lrodata, .lbss, which may lie anywhere. | Mixed: RIP-relative for code and small data, movabs (absolute) for large data | 
| Large | Code and data anywhere in address space | Absolute addressing using movabs | 
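| Small PIC | Code and data within 2 GB of each other; load address not known until run time | RIP-relative for local symbols; GOT-based indirect addressing (@GOTPCREL) for globals | 
| Medium PIC | Code and small data within 2 GB of each other; large data can be anywhere | Code and small data: RIP-relative. Globals: via GOT (@GOTPCREL). Large local data: 64-bit @GOTOFF offset from the GOT base (%r15 in the ABI examples, %rax in the listings below) | 
| Large PIC | Code and data anywhere | All addressing via GOT/PLT: the GOT base is built from %rip plus a 64-bit offset, globals use 64-bit @GOT offsets, local data uses 64-bit @GOTOFF offsets | 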
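The ±2 GB limit in the table comes from the signed 32-bit displacement used by RIP-relative addressing. As a rough sketch, the reachability check can be written in C as below; rip_relative_disp is a hypothetical helper, and next_rip stands for the address of the instruction following the one being encoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute the 32-bit displacement for a RIP-relative access.
   Returns false when the target is out of range of a signed 32-bit
   displacement. Hypothetical helper, for illustration only. */
bool rip_relative_disp(uint64_t next_rip, uint64_t target, int32_t *disp_out) {
    /* Unsigned subtraction wraps modulo 2^64. The conversion to int64_t
       assumes the usual two's complement behaviour of mainstream compilers. */
    int64_t delta = (int64_t)(target - next_rip);
    if (delta < INT32_MIN || delta > INT32_MAX) {
        return false;  /* would need an absolute 64-bit address instead */
    }
    *disp_out = (int32_t)delta;
    return true;
}
```

When the check fails, a RIP-relative encoding is not possible and the compiler has to fall back to the absolute or GOT-based sequences used by the medium and large models.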
Addressing in Action
Let’s compile the following program on Godbolt with x86-64 gcc 15.2, using the arguments given with each listing below. We will focus on only a small part of the generated assembly.
Program (taken from the System V ABI manual):
extern int src[65536];
extern int dst[65536];
extern int *ptr;
static int lsrc[65536];
static int ldst[65536];
static int *lptr;
int main() {
    dst[0] = src[0];
    ptr = dst;
    *ptr = src[100];
    ldst[0] = lsrc[0];
    lptr = ldst;
    *lptr = lsrc[0];
}
Small, Medium and Large Model
- Argument: -mcmodel=small -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movq    $dst, ptr(%rip)
    ; *ptr = src[100]
    movq    ptr(%rip), %rax
    movl    src+400(%rip), %edx
    movl    %edx, (%rax)
    ...
    
- Argument: -mcmodel=medium -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movabsq $dst, %rax
    movq    %rax, ptr(%rip)
         
    ; *ptr = src[100]
    movq    ptr(%rip), %rax
    movabsq $src, %rdx
    movl    400(%rdx), %edx
    movl    %edx, (%rax)
    ...
- Argument: -mcmodel=large -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movabsq $ptr, %rax
    movabsq $dst, %rcx
    movq    %rcx, (%rax)
    ; *ptr = src[100]
    movabsq $ptr, %rax
    movq    (%rax), %rax
    movabsq $src, %rdx
    movl    400(%rdx), %edx
    movl    %edx, (%rax)
    ...
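In the medium listing, ptr is still addressed relative to %rip while dst and src go through movabs. Assuming GCC's documented rule that data objects larger than -mlarge-data-threshold are placed in the large data section, the split can be sketched in C as follows. is_large_data is a hypothetical helper, not a GCC API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: an object is "large" when its size in bytes exceeds the
   threshold, following the wording of the -mlarge-data-threshold
   documentation. Hypothetical helper, for illustration only. */
bool is_large_data(size_t object_size, size_t threshold) {
    return object_size > threshold;
}
```

With the threshold of 65535 used here, the arrays of 65536 ints are large data, while the 8-byte pointers ptr and lptr stay in the regular data sections and remain reachable with RIP-relative addressing.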
Small, Medium and Large PIC Model
- Argument: -mcmodel=small -fPIC -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movq    ptr@GOTPCREL(%rip), %rax
    movq    dst@GOTPCREL(%rip), %rdx
    movq    %rdx, (%rax)
    ; *ptr = src[100] 
    movq    ptr@GOTPCREL(%rip), %rax
    movq    (%rax), %rax
    movq    src@GOTPCREL(%rip), %rdx
    movl    400(%rdx), %edx
    movl    %edx, (%rax)
    ...
    ; ldst[0] = lsrc[0];
    movl    lsrc(%rip), %eax
    movl    %eax, ldst(%rip)
    ...
- Argument: -mcmodel=medium -fPIC -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movq    ptr@GOTPCREL(%rip), %rdx
    movq    dst@GOTPCREL(%rip), %rcx
    movq    %rcx, (%rdx)
    ; *ptr = src[100] 
    movq    ptr@GOTPCREL(%rip), %rdx
    movq    (%rdx), %rdx
    movq    src@GOTPCREL(%rip), %rcx
    movl    400(%rcx), %ecx
    movl    %ecx, (%rdx)
    ...
    ; ldst[0] = lsrc[0]
    movabsq $lsrc@GOTOFF, %rdx
    movl    (%rax,%rdx), %edx
    movabsq $ldst@GOTOFF, %rcx
    movl    %edx, (%rax,%rcx)
    ...
- Argument: -mcmodel=large -fPIC -mlarge-data-threshold=65535
main:
    ...
    ; ptr = dst
    movabsq $ptr@GOT, %rdx
    movq    (%rax,%rdx), %rdx
    movabsq $dst@GOT, %rcx
    movq    (%rax,%rcx), %rcx
    movq    %rcx, (%rdx)
    ; *ptr = src[100] 
    movabsq $ptr@GOT, %rdx
    movq    (%rax,%rdx), %rdx
    movq    (%rdx), %rdx
    movabsq $src@GOT, %rcx
    movq    (%rax,%rcx), %rcx
    movl    400(%rcx), %ecx
    movl    %ecx, (%rdx)
    ...
    ; ldst[0] = lsrc[0]
    movabsq $lsrc@GOTOFF, %rdx
    movl    (%rax,%rdx), %edx
    movabsq $ldst@GOTOFF, %rcx
    movl    %edx, (%rax,%rcx)
    ...
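The PIC listings use two kinds of GOT-related operands. sym@GOT (and sym@GOTPCREL) names a GOT slot that holds the symbol's address, so one extra load is needed before the data can be touched. sym@GOTOFF is the distance from the GOT base to the symbol itself, so no extra load is needed. The following C sketch models both; the function names and the byte-pointer view of the GOT are assumptions made for illustration, not how a real loader is written.

```c
#include <stdint.h>
#include <string.h>

/* Model of sym@GOT: got_base + slot_offset is the location of a GOT slot,
   and the slot contains the symbol's absolute address.
   Corresponds to: movabsq $sym@GOT, %rdx ; movq (%rax,%rdx), %rdx */
int *load_via_got(const unsigned char *got_base, int64_t slot_offset) {
    int *addr;
    memcpy(&addr, got_base + slot_offset, sizeof addr);  /* the extra load */
    return addr;
}

/* Model of sym@GOTOFF: the symbol lives at got_base + gotoff, so the
   address is formed by a single addition with no memory access.
   Corresponds to: movabsq $sym@GOTOFF, %rdx ; movl (%rax,%rdx), %edx */
int *addr_via_gotoff(unsigned char *got_base, int64_t gotoff) {
    return (int *)(got_base + gotoff);
}
```

This is why the accesses to the global src need one more movq than the accesses to the static lsrc in the listings above.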