Data Accessing methods
Immediate Mode
The data to access is embedded in the instruction.
Example:
movl $0, %eax
<— Move the literal number zero into register eax
Register Addressing Mode
The instruction contains a register to access
Memory Access Modes
- Direct Addressing Mode
The instruction contains the memory address to access. Example:
movl 202, %eax
<— Loads the long (4 bytes) starting at address 202 into register eax
- Indexed Addressing Mode
Like Direct Addressing Mode, where an address is given, but an index register is also specified to offset the address. On x86 processors, you can also specify a multiplier for the index register.
- Indirect Addressing Mode
The instruction contains a register that holds a memory address to access. Example:
movl (%esp), %eax
<— Loads the data at the memory address held in esp
- Base Pointer Addressing Mode
Like Indirect Addressing Mode, but with an offset also given. Example:
movl 4(%esp), %eax
<— Adds 4 to the address in esp, then fetches the data at that memory address.
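The modes above can be seen side by side in one small program. This is a minimal sketch, assuming 32-bit Linux and the GNU assembler; the `values` array is a hypothetical label introduced only for illustration, and it stands in for the literal address 202 used above, which would not be mapped in a real process.

```asm
# Assumed build commands (GNU toolchain on Linux):
#   as --32 modes.s -o modes.o && ld -m elf_i386 modes.o -o modes
.section .data
values:                          # hypothetical array of four longs
    .long 10, 20, 30, 40

.section .text
.globl _start
_start:
    movl $2, %edi                # immediate: the literal 2 goes into edi
    movl %edi, %ecx              # register: copy edi into ecx
    movl values, %eax            # direct: load the long at address values (10)
    movl values(,%edi,4), %ebx   # indexed: load from values + edi*4 (30)
    movl $values, %edx           # immediate: the address of values itself
    movl (%edx), %eax            # indirect: load from the address held in edx (10)
    movl 4(%edx), %ebx           # base pointer: load from edx + 4 (20)

    movl $1, %eax                # sys_exit system call number
    int $0x80                    # ebx (20) is used as the exit status
```

If built as described, running the program and then `echo $?` should print 20, the value fetched by the last load.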
Calling Convention
The way variables are stored, parameters are passed, and return values are transferred. This varies from language to language. One can use any calling convention when programming in Assembly. The programmer can even come up with their own calling convention.
Linux uses the C calling convention.
Interoperability
Other languages can interoperate with assembly functions. For this to work, though, the assembly functions have to follow that language's calling convention.
C calling convention
In the C calling convention, the stack is used heavily to hold function information.
The stack holds:
- local variables
- parameters
- return address
The Stack
The stack lives at the very top addresses of memory and grows down, toward lower addresses.
Values are pushed onto the stack with pushl and popped off it with popl. The "l" suffix stands for "long", a 4-byte operand.
Stack registers
%esp: holds a pointer to the top of the stack, which starts at the highest address and moves to lower addresses as values are pushed.
Pushing and popping
When we push onto the stack, %esp is decreased by 4 (the stack grows down).
When we pop from the stack, %esp is increased by 4.
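To make the pointer arithmetic concrete, here is a small sketch that performs one push and one pop with the dedicated instructions and then repeats each by hand. It assumes 32-bit Linux and the GNU assembler; the value 7 is arbitrary.

```asm
# Assumed build commands (GNU toolchain on Linux):
#   as --32 stack.s -o stack.o && ld -m elf_i386 stack.o -o stack
.section .text
.globl _start
_start:
    movl $7, %eax

    pushl %eax                   # esp decreases by 4, then eax is stored at (%esp)

    # The same push written out by hand:
    subl $4, %esp
    movl %eax, (%esp)

    popl %ebx                    # ebx = 7, then esp increases by 4

    # The same pop written out by hand:
    movl (%esp), %ecx            # ecx = 7
    addl $4, %esp

    addl %ecx, %ebx              # ebx = 14
    movl $1, %eax                # sys_exit system call number
    int $0x80                    # ebx (14) is used as the exit status
```

After the two pushes and two pops, %esp is back where it started, which is why pushes and pops have to be balanced.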
Calling functions
Before executing a function, all the parameters are pushed onto the stack
in reverse order.
Afterwards, the call instruction is issued, naming the function to execute.
It pushes the address of the next instruction onto the stack (the return
address) and sets the instruction pointer (%eip) to the start of the function.
Here's the Stack so far:
Stack |
---|
Parameter #N |
Parameter 2 |
Parameter 1 |
Return Address <— (%esp) |
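The layout in this table can be checked with a function that reads its first parameter straight off the stack pointer, before doing any other setup. This is only a sketch, assuming 32-bit Linux and the GNU assembler; `first_param` is a hypothetical function name, and the parameters are left on the stack because the program exits immediately afterwards.

```asm
# Assumed build commands (GNU toolchain on Linux):
#   as --32 call.s -o call.o && ld -m elf_i386 call.o -o call
.section .text
.globl _start
_start:
    pushl $9                     # parameter 2, pushed first
    pushl $4                     # parameter 1, pushed last
    call first_param             # pushes the return address, then jumps

    movl %eax, %ebx              # use the function's result as the exit status
    movl $1, %eax                # sys_exit system call number
    int $0x80                    # exits with status 4

# Hypothetical function, entered with the stack exactly as in the table.
first_param:
    movl 4(%esp), %eax           # (%esp) is the return address, 4(%esp) is parameter 1
    ret                          # pops the return address into eip
```

Reading parameters relative to %esp only works here because nothing else has been pushed yet.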
The function still has some work to do:
- Save the current base pointer register (%ebp) onto the stack with pushl %ebp. The %ebp register is used to access parameters and local variables. We don't want to overwrite whatever the caller had in it, so it gets "backed up" on the stack.
- Copy the stack pointer to %ebp (movl %esp, %ebp). This lets you use the base pointer as a fixed reference point for accessing the parameters. Using the stack pointer for this isn't recommended, as it changes whenever things are pushed and popped (like pushing arguments to other functions). Copying the stack pointer at the start of every function means you always know where your function parameters are (as well as local variables).
The stack now looks like this:
Stack |
---|
Parameter #N <— N*4+4(%ebp) |
Parameter 2 <— 12(%ebp) |
Parameter 1 <— 8(%ebp) |
Return Address <— 4(%ebp) |
Old %ebp <— (%esp) and (%ebp) |
The function then reserves space on the stack for any local variables.
This is done by moving the stack pointer down a certain number of bytes.
Let's say we wanted to reserve 8 bytes for local variables.
We can do that with subl $8, %esp.
Reserving the space this way means later pushes for function calls won't
clobber the local variables.
All of this is being done on the function's stack frame, so when it returns, all the variables will cease to exist.
The stack now looks like this:
Stack |
---|
Parameter #N <— N*4+4(%ebp) |
Parameter 2 <— 12(%ebp) |
Parameter 1 <— 8(%ebp) |
Return Address <— 4(%ebp) |
Old %ebp <— (%ebp) |
Local Variable 1 <— -4(%ebp) |
Local Variable 2 <— -8(%ebp) and (%esp) |
All the data can be accessed using base pointer addressing with different offsets from %ebp.
%ebp exists for this exact purpose.
Other registers can be used for base pointer addressing, but the x86 architecture makes using %ebp really fast.
Returning from the function
When a function is done executing, it has to:
- Store the return value in %eax.
- Restore the stack to what it looked like when the function was called.
- Return control back to wherever it was called from. This is done using the ret instruction, which pops whatever is at the top of the stack and sets the %eip register to that value (the address of the instruction after call).
This all has to be done in that exact order for everything to work properly.
If the stack wasn't restored before ret, ret would pop the wrong value into %eip and jump to the wrong address.
So to return, the following must be done:
movl %ebp, %esp # Have the stack pointer (esp) point to where ebp is pointing
popl %ebp # Restore the caller's base pointer
ret # Pop the return address into eip
You should consider all local variables inaccessible, as future stack pushes will overwrite them. Let's look at this visually.
After movl %ebp, %esp:
Stack |
---|
Parameter #N <— N*4+4(%ebp) |
Parameter 2 <— 12(%ebp) |
Parameter 1 <— 8(%ebp) |
Return Address <— 4(%ebp) |
Old %ebp <— (%ebp) and (%esp) |
Local Variable 1 <— -4(%ebp) |
Local Variable 2 <— -8(%ebp) |
After popl %ebp:
Stack |
---|
Parameter #N |
Parameter 2 |
Parameter 1 |
Return Address <— (%esp) |
Local Variable 1 |
Local Variable 2 |
Here, the old %ebp value that was on the stack is popped and placed back into %ebp.
And since we ran popl, %esp now points to the return address. We can finally execute the ret instruction.
Once out of the function, the caller needs to remove the parameters from the stack, either by popping them or by adding 4 times the number of parameters to %esp.
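Putting the whole sequence together, the sketch below pushes two parameters, calls a function that sets up a frame and uses one local variable, and then cleans up the parameters in the caller. It assumes 32-bit Linux and the GNU assembler; `add_twice` and its arithmetic (parameter 1 plus twice parameter 2) are hypothetical, chosen only to exercise each stack slot.

```asm
# Assumed build commands (GNU toolchain on Linux):
#   as --32 frame.s -o frame.o && ld -m elf_i386 frame.o -o frame
.section .text
.globl _start
_start:
    pushl $5                     # parameter 2, pushed first
    pushl $3                     # parameter 1, pushed last
    call add_twice
    addl $8, %esp                # caller discards the two 4-byte parameters

    movl %eax, %ebx              # return value becomes the exit status
    movl $1, %eax                # sys_exit system call number
    int $0x80                    # exits with status 13

# Hypothetical function: returns parameter 1 + 2 * parameter 2
.type add_twice, @function
add_twice:
    pushl %ebp                   # save the caller's base pointer
    movl %esp, %ebp              # now 8(%ebp) = parameter 1, 12(%ebp) = parameter 2
    subl $4, %esp                # reserve one local variable at -4(%ebp)

    movl 12(%ebp), %eax          # eax = parameter 2
    addl %eax, %eax              # eax = 2 * parameter 2
    movl %eax, -4(%ebp)          # keep it in the local variable
    movl 8(%ebp), %eax           # eax = parameter 1
    addl -4(%ebp), %eax          # eax = parameter 1 + local variable

    movl %ebp, %esp              # discard the local variable
    popl %ebp                    # restore the caller's base pointer
    ret                          # pop the return address into eip
```

Using addl $8, %esp instead of two popl instructions avoids overwriting a register just to throw the values away.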