Function Calls and The Stack
Calling Convention
The calling convention describes how parameters are passed to a function using registers and the stack. LLVM offers many calling conventions, so programmers, language developers, etc. can pick the one that best suits their needs. These different conventions enable different optimisations (see Optimisations).
To be compliant with the System V ABI, all global functions must follow its calling convention. Local functions (those not reachable from other compilation units, such as functions declared static in C) can follow their own conventions. These optimisations are usually done by the compiler itself and the programmer doesn’t have to do much.
A function call usually involves the following steps:
- Calculate the values of the arguments.
- Allocate a stack frame.
- Push the arguments to the registers and the stack. For each argument, we must determine the following:
  - Whether the argument should be pushed to the stack or to registers. For this we have to classify the parameters and then merge the classes if an argument consists of multiple sub-classes.
  - If it goes into registers, the number of General Purpose Registers (GPRs) and SSE registers required to hold the argument.
  - The order of the registers that have to be used.
Classification of Parameters
The specification defines the following classes:
| INTEGER | Integral types that fit into one of general purpose registers (GPRs). |
| SSE | Types that fit into vector registers. |
| SSEUP | Types that fit into vector registers and can be passed and returned in their upper bytes. |
| NO_CLASS | Used as an initialiser in the algorithms, and for padding and empty structures and unions. |
| MEMORY | Types that will be passed/returned in memory via the stack. |
Other classes, such as X87, X87UP and COMPLEX_X87, also exist.
Classification Algorithm
- The size of each argument is rounded up to a multiple of eightbytes (8 bytes). This helps keep the stack aligned to 8 bytes (see the Stack section for more information).
- The following table denotes the classification based on the type of the argument:
| _Bool, char, short, int, long, long long, pointer | INTEGER |
| _Float16, float, double, _Decimal32, _Decimal64, __m64 | SSE |
| __float128, etc. | Least significant eightbyte: SSE; others: SSEUP |
| __int128 | Treated as a struct of two consecutive INTEGER eightbytes, except that it must be stored on a 16-byte boundary. |
| struct, arrays, unions | If the size of the object is more than 8 eightbytes (8 * 64 bits) or it has unaligned fields, then MEMORY. If the size is more than 1 eightbyte (64 bits), the object is broken down into eightbytes, each initialised to NO_CLASS; the fields are then classified recursively and the classes are merged by the rules below. |
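The table above can be modelled as a small lookup. The following is a minimal Python sketch, not a full implementation of the specification: the names `SCALAR_CLASS`, `eightbytes_needed`, `classify_scalar` and `aggregate_goes_to_memory` are hypothetical, and only the scalar rows and the first aggregate rule are covered.

```python
# Scalar rows of the table above, keyed by C type spelling.
# "pointer" stands in for any pointer type (an assumption of this sketch).
SCALAR_CLASS = {
    "_Bool": "INTEGER", "char": "INTEGER", "short": "INTEGER",
    "int": "INTEGER", "long": "INTEGER", "long long": "INTEGER",
    "pointer": "INTEGER",
    "_Float16": "SSE", "float": "SSE", "double": "SSE",
    "_Decimal32": "SSE", "_Decimal64": "SSE", "__m64": "SSE",
}

EIGHTBYTE = 8  # size of one eightbyte, in bytes


def eightbytes_needed(size_in_bytes: int) -> int:
    """Round a size up to a whole number of eightbytes."""
    return (size_in_bytes + EIGHTBYTE - 1) // EIGHTBYTE


def classify_scalar(type_name: str) -> str:
    """Look up the class of a scalar type from the table."""
    return SCALAR_CLASS[type_name]


def aggregate_goes_to_memory(size_in_bytes: int,
                             has_unaligned_fields: bool = False) -> bool:
    """First aggregate rule: more than 8 eightbytes, or unaligned fields."""
    return eightbytes_needed(size_in_bytes) > 8 or has_unaligned_fields
```

For example, a 12-byte struct occupies two eightbytes and is not forced into MEMORY by this first rule; its final class still depends on how its fields merge.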
Merging Subclasses
The rules for merging subclasses are:
- If both classes are equal, the result is the same class.
- If one of the classes is NO_CLASS, the result is the other class.
- If one of the classes is MEMORY, the result is MEMORY.
- If one is INTEGER, the result is INTEGER.
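The merge rules can be sketched as a pairwise function folded over the classes of all fields that share an eightbyte. This is an illustrative Python sketch; `merge` and `classify_eightbyte` are hypothetical names. The final two branches are not in the list above: they follow the System V ABI text, which sends the x87 classes to MEMORY and falls back to SSE otherwise.

```python
from functools import reduce

X87_CLASSES = {"X87", "X87UP", "COMPLEX_X87"}


def merge(a: str, b: str) -> str:
    """Merge the classes of two fields that share one eightbyte."""
    if a == b:
        return a
    if a == "NO_CLASS":
        return b
    if b == "NO_CLASS":
        return a
    if "MEMORY" in (a, b):
        return "MEMORY"
    if "INTEGER" in (a, b):
        return "INTEGER"
    # Not listed above; taken from the System V ABI specification.
    if a in X87_CLASSES or b in X87_CLASSES:
        return "MEMORY"
    return "SSE"


def classify_eightbyte(field_classes) -> str:
    """Fold merge over all field classes, starting from NO_CLASS."""
    return reduce(merge, field_classes, "NO_CLASS")
```

Under these rules a struct holding an int and a float in the same eightbyte comes out as INTEGER, while a struct of two floats stays SSE.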
Registers Used and Order of Parameters
The following rules define where to push the argument:
- MEMORY → stack (with stack alignment rules, the alignment can be more than the alignment of the type).
- INTEGER → the next available general-purpose argument register, taken in left-to-right order.
- SSE → the next available vector argument register.
- X87, X87UP or COMPLEX_X87 → memory.
- %al is used to indicate the number of vector arguments passed to a function requiring a variable number of arguments.
- %r10 is used for passing a function’s static chain pointer.
Things to keep in mind:
- If there are more than 6 arguments (of type INTEGER) then the extra are pushed to the stack.
- The arguments are pushed right to left, so that the address of the first argument can be calculated statically using stack pointer arithmetic. This is especially useful for variadic functions, i.e. functions declared with an ellipsis (...).
  - For variadic arguments, the al part of the rax register contains an upper bound on the number of vector registers used.
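The placement rules above can be modelled with a short Python sketch. It is a simplification under stated assumptions: every argument is taken to occupy exactly one eightbyte, the function names `assign_locations` and `vector_register_count` are hypothetical, and the register names come from the System V AMD64 ABI specification rather than from the text above.

```python
# Argument registers in the order given by the System V AMD64 ABI.
GPRS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
VECTOR_REGS = [f"xmm{i}" for i in range(8)]


def assign_locations(arg_classes):
    """Map each argument class to a register name or to "stack".

    Assumes one eightbyte per argument; multi-eightbyte aggregates
    are out of scope for this sketch.
    """
    gprs = iter(GPRS)
    vecs = iter(VECTOR_REGS)
    locations = []
    for cls in arg_classes:
        if cls == "INTEGER":
            # Falls back to the stack once all six GPRs are taken.
            locations.append(next(gprs, "stack"))
        elif cls == "SSE":
            locations.append(next(vecs, "stack"))
        else:
            # MEMORY and the x87 classes are passed in memory.
            locations.append("stack")
    return locations


def vector_register_count(locations):
    """Count used vector registers: a valid value for al in a variadic call."""
    return sum(1 for loc in locations if loc.startswith("xmm"))
```

With seven INTEGER arguments, the first six land in registers and the seventh goes to the stack, matching the first point in the list above.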
Note: Golang does not follow the platform ABI; instead it has its own internal ABI and a stable ABI called ABI0. Initially, every call was stack based, with arguments and results passed on the stack only. Register-based calling was added later.
Reference: https://go.googlesource.com/proposal/+/master/design/40724-register-calling.md
Stack
- Memory region that holds local variables and arguments. It grows downwards from high addresses.

- The stack is always aligned to at least 16 bytes (128 bits). Even if the stack variables take up less than 16 bytes, the frame still occupies 16 bytes. This alignment must be ensured before executing the call instruction.
- The 128-byte area beyond the location pointed to by %rsp is considered reserved and must not be modified by signal or interrupt handlers. This area is called the red zone. Leaf functions can use it for their stack frame instead of changing the stack pointer, which saves the prologue and epilogue instructions.
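The two rules above boil down to simple arithmetic. The Python sketch below is a rough model only: `align_up` and `frame_adjustment` are hypothetical names, and the return address and saved registers, which a real compiler also accounts for, are ignored here.

```python
STACK_ALIGNMENT = 16   # bytes
RED_ZONE_SIZE = 128    # bytes below %rsp usable by leaf functions


def align_up(size: int, alignment: int = STACK_ALIGNMENT) -> int:
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


def frame_adjustment(locals_size: int, is_leaf: bool) -> int:
    """Bytes to subtract from %rsp for the local variables.

    A leaf function whose locals fit in the red zone does not need
    to move the stack pointer at all.
    """
    if is_leaf and locals_size <= RED_ZONE_SIZE:
        return 0
    return align_up(locals_size)
```

For instance, 20 bytes of locals in a non-leaf function would reserve 32 bytes, while the same locals in a leaf function would need no adjustment in this model.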
Stack Unwinding
Process Initialisation
When _start is called, this is the state of the stack:
