Functions and Calling Convention in MIPS
Introduction
A function is an algorithmic concept used to structure programs to make them easier to understand and manipulate. In fact, a function
- is a block of instructions that can be called as needed at different points in a program.
- can receive parameters and return results. Parameters and results act as an interface between the function and the rest of the program.
To execute a function, the program must perform the following steps:
- The program must put the parameters where the callee function can access them.
- Transfers control to the callee function.
- Executes the instructions of the callee function.
- The callee function must put the results where the caller function can access them.
- Returns control to the caller function.
Registers are the fastest place to pass parameters to a function and return results. The MIPS R3000 processor has 32 registers, 24 of which can be “freely” used for data storage by your program. You might think that this is a problem if we ever plan to implement complex programs that use dozens or even hundreds of functions. This is, however, not really a limitation because we have access to stack memory which we can also use for temporary data storage.
Indeed, data contained in the registers can be stored on the stack, so that these same registers can be reused by a callee function. When this function returns, and so that the program can continue its normal execution, these registers are restored to their initial state from the data saved on the stack.
There is an established convention that dictates who (between the caller and the callee functions) is responsible for saving registers on the stack. This is called the “MIPS calling convention.” In particular, the MIPS architecture respects the following software convention for transmitting parameters and returning results:
$a0
-$a3
: Four registers for passing arguments to the callee function.$v0
-$v1
: Two registers for returning results to the caller function.$ra
: This register contains the address to which the program will return in the caller function
In MIPS, the jal
instruction (i.e. jump-and-link) is used to call a function; and the jr
instruction (i.e. jump register) is used to return control to the caller function. So, to call a function, we use the jal
instruction as follows:
jal func # func is a label that marks the start of the function we would like to call.
... # next instruction
The above instruction stores the address of the instruction that succeeds jal
in register $ra
and jumps to the first instruction of the function marked with the label func
. To return from a function, we use the instruction:
jr $ra # $ra register contains the address of 'next instruction' above
This instruction assigns to the $pc
register (program counter) the value stored in $ra
. I.e. it allows one to jump to the address contained in register $ra
.
Consider the example shown below of a C function (left) that checks whether a character ch
is in lowercase or not. The equivalent version in MIPS is shown on the right. In accordance with the MIPS calling convention, function islower
assumes that the ch
argument is passed in register $a0
. The result of the function is returned in register $v0
.
int islower(char ch) {
if (ch>='a' && ch<='z')
return 1;
else
return 0;
}
islower:
blt $a0, 'a', sinon # branch if $a0 < 'a'
bgt $a0, 'z', sinon # branch if $a0 > 'z'
li $v0, 1 # $v0 = 1
jr $ra # return to caller
sinon:
li $v0, 0 # $v0 = 0
jr $ra # return to caller
To call islower
in MIPS, the caller must first copy the ch
character into the $a0
register, then execute the “function call” with the jal
instruction:
move $a0, ... # copy of character ch into register $a0
jal islower # calling the islower function
. . . # next instruction (returns here after executing islower)
In fact, the MIPS ISA provides a second instruction for calling functions: the jalr
instruction (i.e jump-and-link-register). Indeed, unlike the jal
instruction where the jump address is coded in the instruction itself via a label, the jalr
instruction allows one to call functions whose addresses are stored in registers.
The Stack Segment
When loaded into memory by the operating system, each program has three segments: There is the code segment which contains the machine code instructions, the data segment which includes the program data and constants, and the stack segment which provides a space that can be used by functions. The programmer has no control over the location of these segments in memory.
The stack segment can be used by functions to
- transmit numerous parameters,
- allocate local variables, and
- preserve the content of registers between function calls. Without this segment, it would be “mission impossible” to write nested and/or recursive functions.
In MIPS, when a program is loaded into memory, the operating system initialises the stack-pointer register $sp
(i.e. register $29
) with the address of the stack segement. In fact, the address in $sp
points to the top of the stack. For example, in the MARS simultor, the (initial) value of the stack-pointer $sp
is set to 0x7fffeffc
.
A function can reserve space on the stack to save registers and/or to allocate local variables. This space is called a stackframe To allocate a stackframe of n
bytes, the stack pointer is decremented by n
:
addiu $sp, $sp, -n # allocate n bytes on the stack (note: n is a constant)
To free a stackframe of n bytes, one increments the stack pointer by n
:
addiu $sp, $sp, n # free n bytes from the stack (note: n is a constant)
The figure below illustrates the state of the stack segment before the call, during execution, and after returning from a function call. Initially, the stack register $sp
points to the top of the caller function’s stackframe. Once the call is made, register $sp
is decremented to make space on the top of the stack segment to store the stackframe of the callee function. Finally, when the callee function returns, register $sp
is incremented so that it points again to the stackframe of the caller function, thus freeing the memory allocated on the stack. Note that, the stack segment increases (resp. decreaases) toward lower (resp. higher) memory addresses.
An example of a function f()
that allocates a stackframe is shown below. The function is not final in a sens that it in turn calls other functions read()
, reverse()
and print()
. Therefore, the return address of f()
(i.e. contained in $ra
) must be saved on the stack before calling read()
, reverse()
or print()
. Additionally, the stackframe of f()
must also have space for the local integer array
(10 integer elements = 40 bytes) as shown below.
void f()
{
int array[10];
read(array, 10);
reverse(array, 10);
print(array, 10);
}
The translation of function f()
into MIPS is shown below. The function allocates a stackframe of 44 bytes. The stack-pointer $sp
contains the address of the top of the stack. The stack segment is accessed using the same instructions we use to read or write data to the .data
segment (i.e. instructions lw/sw
and the like). Various offset values are used to access different elements on the stack.
f: addiu $sp, $sp, -44 # reserves a 44-byte stack frame
sw $ra, 40($sp) # save $ra on the stack
add $a0, $zero, $sp # $a0 is set to the stack pointer
addi $a1, $zero, 10 # $a1 = 10
jal read # call function 'read'
add $a0, $zero, $sp # $a0 is set to the stack pointer
addi $a1, $zero, 10 # $a1 = 10
jal reverse # call function 'reverse'
add $a0, $zero, $sp # $a0 is set to the stack pointer
addi $a1, $zero, 10 # $a1 = 10
jal print # call function 'print'
lw $ra, 40($sp) # restore the value of $ra from the stack
addiu $sp, $sp, 44 # free the 44-byte stack frame
jr $ra # return to the caller function
Convention between the Caller and Callee functions in MIPS
According to the calling convention, MIPS registers are divided into two groups: registers saved/restored by the caller function and registers saved/restored by the callee function. The following table show which register is managed by wihch function type.
Number | Name | Use |
---|---|---|
$0 | $0 | Always equal to 0 |
$1 | $at | is used by the assembler tool for pseudo instruction expansion |
$2-$3 | $v0-$v1 | hold the values returned by the called functions |
$4-$7 | $a0-$a3 | used to pass arguments to functions |
$8-$15 , $24-$25 | $t0-$t9 | for handling temporary data |
$16-$23 | $s0-$s7 | store local data to functions |
$26-$27 | $k0-$k1 | kernel registers – used by exception/interrupt routines |
$28 | $gp | pointer to global data of the program |
$29 | $sp | stack pointer |
$30 | $fp | frame pointer |
$31 | $ra | stores the return address (i.e. where to return to in the calling function) |
Saved/Restored by the callee function | |
Saved/Restored by the caller function |
Temporary registers ($t0
-$t9
), argument registers (a0
-$a3
) and value registers ($v0
-$v1
) are managed by the caller function. This means that it is the responsibility of the current function to save these registers (on the stack) before calling another function if the content of those registers are required after the call! The callee function has the freedom to modify any of these registers without worrying about their previous content. Once the called function completes, these registers may possibly have changed and it is the responsibility of the caller function to restore these registers (from the stack) to their initial state prior to the function call.
On the other hand, the registers highlighted in salmon colour in the table above are saved and restored by the callee function. I.e. it is the responsibility of the callee function to save these registers (on the stack) before possibly modifying their content. However, before the callee function returns, and if these registers have actually been modified, then the callee function must restore these registers (from the stack) to their initial state.
With this calling convention, one notes the following: 1) The caller function assumes that the values in the « salmon-coloured » registers will not change after a function call (because the callee function should save/restore them); 2) The callee function assumes that the values in the « sky blue » registers are free to be modified in any way (because the caller function must save them before the call if it needs their contents later on).
Example of saving/restoring registers between function calls
Let us now try to put this calling convention into practice. Consider the assembly code structure below for a function named func1
. Suppose that func1
is a function called by another function (say, the main
function). Looking at the code, one observes that func1
also calls another function named func2
. When func1
was called by the main
function, func1
was the callee. However, when func1
calls func2
, func1
becomes the caller. Therefore, func1
will have to assume the responsibilities of being a callee and a caller function!
func1: # modifies $a0, $t0, $v0 and $s0. $ra points to the `main` function.
# Checkpoint 1: What do you need to do before you start modifying registers?
# Some block of code using $a0, $t0, and $s0
# Checkpoint 2: What do you need to do before you call another function?
# input argument at $a0, return value at $v0
jal func2 # call func2
# Checkpoint 3: What do you need to do after a function call?
# Some code using $v0, $t0, and $s0
# Checkpoint 4: What do you need to do before this function returns?
jr $ra # function return
Storing values on the stack is done using the sw
instruction to the address supplied by the stack pointer $sp
adjusted with an appropriate offset. Conversely, retrieving values from the stack is done using the lw
instruction. So, whenever we want to store values on the stack, we need to adjust the stack pointer by decreasing it. On the other hand, whenever we want to retrieve values from the stack, the stack pointer is adjusted in the opposite direction (i.e. by increasing the stack pointer). Following these procedures will ensure that $sp
points to the same position in memory at the beginning and end of the function call. This is how functions use the stack memory segment as temporary storage.
Let’s try to complete the code at each indicated “checkpoint”.
For checkpoint 1, since func1
is a callee here (i.e. called by main
or any other function), it must save all the « salmon-coloured » registers that it is likely to modify. In particular, as func1
lists a jal
instruction (i.e. jal func2
), it must save $ra
on the stack (remember that the jal
instruction modifies the content of the $ra
register). Therefore, since we must save two registers (the second being $s0
), the $sp
pointer is adjusted (decremented) by two words (8 bytes):
# Checkpoint 1: What do you need to do before you start modifying registers?
addi $sp $sp -8 # Push the stack pointer down by 2 words (8 bytes)
sw $ra 0($sp) # Save the return address register $ra
sw $s0 4($sp) # Save the "stored" register $s0
For checkpoint 2, we notice that the $a0
, $t0
, and $s0
registers have potentially been modified, and we are now about to call the func2
function (func1
now becomes the caller). We also notice that later in the code (i.e. after checkpoint 3), we use the contents of $t0
and $s0
registers again.
So, from the perspective of func1
, these two registers should be unchanged after calling func2
. We know that if func2
follows the calling convention, it should save the register $s0
on the stack and retrieve its initial value before exiting and returning to func1
… this coincides with our goal regarding $s0
! However, according to this same convention, the content of $t0
will not be saved/restored by func2
. In fact, function func2
has complete freedom to use and modify the content of this register without caring about its initial state before the call. So, as the caller of func2
, function func1
is also responsible for saving the content of $t0
on the stack. As a result, we have the following code:
# Checkpoint 2: What do you need to do before you call another function?
addi $sp $sp -4 # Push the stack pointer down by 1 word (4 bytes)
sw $t0 0($sp) # Save the temporary register ($t0)
For checkpoint 3, function func2
has completed and $v0
contains the value returned by that function. func1
is now about to perform some operations using that value and the data in registers $t0
and $s0
. Again, if func2
had followed the calling convention, the value of register $s0
would be unchanged from func1
’s perspective. However, $t0
may have changed. Good thing we saved its contents on the stack before calling func2
! So all we have to do is retrieve and restore that value. Hence, the following code:
# Checkpoint 3: What do you need to do after a function call?
lw $t0 0($sp) # Retrieve the saved temporary register $t0 from the stack
addi $sp $sp 4 # Adjust the stack pointer up by 1 word (4 bytes)
Finally, for checkpoint 4, we are now at the point where func1
has completed its operations and is about to exit and return to main
. However, before executing the return instruction, func1
must ensure that it has also fulfilled its responsibilities as a callee function. Earlier in checkpoint 1, we saved the contents of $ra
and $s0
registers to the stack.. now it is time to retrieve them! Indeed, during the execution of func1
, the function modified the contents of $s0
and $ra
registers. As the callee function relative to main
, func1
has the responsibility of returning the values of these registers to their initial states before returning to main
. Concequently, we have the following code:
# Checkpoint 4: What do you need to do before this function returns?
lw $s0 4($sp) # Retrieve the original stored register $s0
lw $ra 0($sp) # Retrieve the original return address $ra. This points back to the main function.
addi $sp $sp 8 # Adjust the stack pointer up by 2 words (8 bytes)
This completes our implementation of the MIPS calling convention for func1
. Here is the complete code:
func1: # modifies $a0, $t0, $v0 and $s0. $ra points to the `main` function.
# Checkpoint 1: What do you need to do before you start modifying registers?
addi $sp $sp -8 # Push the stack pointer down by 2 words (8 bytes)
sw $ra 0($sp) # Save the return address register $ra
sw $s0 4($sp) # Save the "stored" register $s0
# Some block of code using $a0, $t0, and $s0
# Checkpoint 2: What do you need to do before you call another function?
addi $sp $sp -4 # Push the stack pointer down by 1 word (4 bytes)
sw $t0 0($sp) # Save the temporary register $t0
# input argument at $a0, return value at $v0
jal func2 # call func2
# Checkpoint 3: What do you need to do after a function call?
lw $t0 0($sp) # Retrieve the saved temporary register from the stack
addi $sp $sp 4 # Adjust the stack pointer up by 1 word (4 bytes)
# Some block of code using $a0, $t0, and $s0
# Checkpoint 4: What do you need to do before this function returns?
lw $s0 4($sp) # Retrieve the original saved register $s0
lw $ra 0($sp) # Retrieve the original return address $ra. This points back to the main function.
addi $sp $sp 8 # Adjust the stack pointer up by 2 words (8 bytes)
jr $ra # function return
Suppose the register $sp
points to memory address 0x7FFFFFF0
at the beginning of func1
.
From C to MIPS (functions calling)
As done previously , we will translate the C program below into MIPS. The added difficulty here is that we must follow the MIPS calling convention when writing MIPS code.
int source[] = {3, 1, 4, 1, 5, 9, 0};
int dest[10];
int fun(int x) {
return -x * (x + 1);
}
int main() {
int k;
int sum = 0;
for (k = 0; source[k] != 0; k++) {
dest[k] = fun(source[k]);
sum += dest[k];
}
printf("sum: %d\n", sum);
}
Let’s first rewrite the C source code into an equivalent yet “ unstructured ” form.
int source[] = {3, 1, 4, 1, 5, 9, 0};
int dest[10];
int fun(int x) {
return -x * (x + 1);
}
int main() {
int k = 0;
int sum = 0;
WHILE:
if (source[k] == 0) goto ELIHW;
dest[k] = fun(source[k]);
sum += dest[k];
k++;
goto WHILE;
ELIHW:
printf("sum: %d\n", sum);
}
Now let’s move on to the MIPS conversion and start by defining the source
and dest
arrays. We need to declare our arrays in the .data
section as shown below:
.data
source:
.word 3, 1, 4, 1, 5, 9, 0
dest:
.word 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 # alternatively, you could also put: .space 40 # ( 10 x 4 bytes )
Next, we translate the C function:
int fun(int x) {
return -x * (x + 1);
}
The MIPS calling convention states that
- The value of
x
can be retrieved from register$a0
. - The returned value must be put in the register
$v0
.
The MIPS version of fun()
is given below with exhaustive comments.
.text
fun:
addi $t0, $a0, 1 # $t0 = x + 1
sub $t1, $0, $a0 # $t1 = -x
mul $v0, $t0, $t1 # $v0 = (x + 1) * (-x)
jr $ra # return to the caller
Let’s now translate the main
function to MIPS (for the moment, we’ll ignore the MIPS calling convention relative to storing/restoring data from the stack) …
int main() {
int k = 0;
int sum = 0;
The C code above becomes in MIPS:
main:
addi $t0, $0, 0 # $t0 <-> k = 0
addi $s0, $0, 0 # $s0 <-> sum = 0
Next, let’s get the addresses of the two arrays source[]
and dest[]
.
la $s1, source
la $s2, dest
Remember that the la
instruction retrieves the address associated with the label given as a parameter. This is the only way to access the source[]
and dest[]
addresses. $s1
is now a pointer to the source[]
array and $s2
is a pointer to the dest[]
array.
Now to translating the loop:
WHILE:
if (source[k] == 0) goto ELIHW;
dest[k] = fun(source[k]);
sum += dest[k];
k++;
goto WHILE;
ELIHW:
First, we’ll translate the iteration instructions of the loop.
WHILE:
sll $s3, $t0, 2 #1
add $t1, $s1, $s3 #2
lw $t2, 0($t1) #3
beq $t2, $0, ELIHW #4
...
addi t0, t0, 1 #5
j WHILE #6
ELIHW:
- Lines 1 to 3 are needed to access element
k
(stored in$t0
) in arraysource[]
. We start by calculating the index in bytes of the element in the array. Assource[]
is an array of integers, so the size of each element is4 bytes
. This means we need to multiply$t0
by4
to calculate the index in bytes. To multiply a value by four, simply shift it to the left by two binary positions. - Next, we need to add this byte index to the pointer
$s1
to calculate the address ofsource[k]
. - Once the address is known, we can load in
$t2
the value of the elementsource[k]
. - Next, we check if the loaded element is equal to 0. If so, we jump to the exit of the loop.
- At the end of the loop, we increment
k
by 1 - Finally, we loop back to the beginning of the loop
Let’s now add instructions to implement the body of the loop while taking into account the MIPS calling convention in relation to passing (resp. returning) parameters (resp. results) between functions…
WHILE:
sll $s3, $t0, 2
add $t1, $s1, $s3
lw $t2, 0($t1)
beq $t2, $0, ELIHW
add $a0, $0, $t2 # 1
jal fun # 2
add $t3, $s2, $s3 # 3
sw $v0, 0($t3) # 4
add $s0, $s0, $v0 # 5
addi $t0, $t0, 1
j WHILE
ELIHW:
fun
takes the argumentx
. in MIPS, We need to pass this argument via register$a0
so thatfun
can retrieve it.- Calling the
fun
function. Thejal
instruction automatically saves the address of the instruction (#3) in$ra
. - Next, we want to store the value returned by
fun
in thedest[]
array. First, we need to calculate the address of where we want to store the value indest[]
. In this sense, and since it is the same positionk
in an array of integers, we can reuse the index in bytes calculated previously (i.e.$s3
) and add it to the array pointerdest[]
(i.e.$s2
). - Storing the value returned by
fun
indest[k]
. Recall thatfun
should put the returned value in$v0
. - Increment the
sum
variable bydest[k]
.
Last, to the MIPS implementation of the exit of the main
function (still excluding the MIPS calling convention relative to storing/restoring data from the stack) …
ELIHW:
addi $v0, $0, 1 # syscall service number, 1 = 'print an integer'
addi $a0, $s0, 0 # argument to syscall, $a0 = integer to print
syscall # system call (prints an integer)
addi $v0, $0, 10 # syscall service number, 10 = program exit
syscall # system call (exit the program)
The content of the sum
variable is stored in the $s0
register. Remember that in order to print this value, we need to put it in $a0
before the first system call (and set service number 1
in $v0
).
Now that most of the logic of our program is translated into MIPS, we need to finaliee it by introducing the necessary instructions in order to respect the calling convention for the main
and fun
functions (i.e. adding save/restore instructions on the stack).
First, let’s add the appropriate instructions around jal fun
. We need to save all registers of the caller function (i.e. the main
function) whose values we want to remain the same after calling fun
. In this case, we can see that we are using registers $t0
, $t1
, $t2
, $t3
, $a0
and $v0
in main
.
Let’s add the appropriate instructions around jal fun
.
addi $sp, $sp, -4
sw $t0, 0($sp)
jal fun
lw $t0, 0($sp)
addi $sp, $sp, 4
Now to the prologue and epilogue of the main
function…
It can be difficult to understand why we need to save $ra
. Recall that another function (i.e. the operating system) called main
. When this function called main
, it stored a return address in $ra
so that main
would know where to return when it finished executing. When main
calls fun
, it must store a return address in $ra
so that fun
knows where to return when it has finished executing. Therefore, main
must save $ra
before overwriting it with jal fun
.
Concequently, here is the prologue and epilogue of the main
function:
main:
# BEGIN PROLOGUE
addi $sp, $sp, -20
sw $s0, 0($sp)
sw $s1, 4($sp)
sw $s2, 8($sp)
sw $s3, 12($sp)
sw $ra, 16($sp)
# END PROLOGUE
...
...
ELIHW:
addi $v0, $0, 1 # syscall service number, 1 = 'print an integer'
addi $a0, $s0, 0 # argument to syscall, $a0 = integer to print
syscall # system call (prints an integer)
# BEGIN EPILOGUE
lw $s0, 0($sp)
lw $s1, 4($sp)
lw $s2, 8($sp)
lw $s3, 12($sp)
lw $ra, 16($sp)
addi $sp, $sp, 20
# END EPILOGUE
addi $v0, $0, 10 # syscall service number, 10 = program exit
syscall # system call (exit the program)
The whole MIPS program is available from here.
Recursive calls in MIPS
A recursive function is a function that calls itself. The translation of a recursive C function into its equivalent in the MIPS assembly language is done in the same way as before. The trick here is to assume the recursive call as if we were calling some different function from the function itself, then apply the MIPS calling convention for the caller and callee functions accordingly.
// C version
int fact(int n)
{
if (n<2)
return 1;
else
return (n * fact(n-1));
}
# Version MIPS (n <-> $a0, n! <-> $v0)
0 fact:
1 bge $a0, 2, else # branch to else if (n >= 2)
2 li $v0, 1 # $v0 = 1
3 jr $ra # return to the caller function
4 else:
5 addi $sp, $sp, -8 # allocate a stackframe of 8 bytes
6 sw $a0, 0($sp) # save n onto the stack
7 sw $ra, 4($sp) # save the returning address on the stack
8 addi $a0, $a0, -1 # argument $a0 = n - 1
9 jal fact # recursive call. I.e. fact(n-1)
10 lw $a0, 0($sp) # restore $a0 = n from the stack
11 lw $ra, 4($sp) # restore the returning address
12 mul $v0, $a0, $v0 # put in $v0 = n * fact(n-1)
13 addi $sp, $sp, 8 # free the stackframe
14 jr $ra # return to the caller function
The code snippet above shows an example of a recursive function (the factorial function) in C language (left) and its version in MIPS assembly (right). If (n < 2) then the function completes and returns to the caller function. There is no need to allocate a stack frame in this case. However, if (n >= 2) then the factorial function allocates an 8-byte stack frame to save the $a0
and $ra
registers before calling itself again.
The register $a0
(i.e. argument n
) is saved on the stack because it is necessary to recover the initial value of this register (used in line 12) after the recursive call returns (this register is modified in the recursive call on line 8). The register $ra
is saved on the stack because its value is modified by the recursive call (line 9).