Lab 4: RISC-V Calling Convention

Deadline: Friday, September 23, 11:59:59 PM PT

Setup

In your labs directory, pull the files for this lab with:

git pull starter main

If you get an error like the following:

fatal: 'starter' does not appear to be a git repository
fatal: Could not read from remote repository.

make sure to set the starter remote as follows:

git remote add starter https://github.com/61c-teach/fa22-lab-starter.git

and run the original command again.

RISC-V Simulator

Like last week, we will be using the Venus RISC-V simulator. Also, please refer to the Venus reference on our course website when you need a refresher on any of the Venus features.

Mount the lab 4 files as you did with Lab 3.

Calling Convention Review

We have provided a review of calling convention below for your reference. We recommend reading over it if you are not feeling comfortable with calling convention.

The RISC-V CPU that we are using only has 32 registers and not all of those registers can be used for data storage by every program. You might think this is a problem if ever we are planning to implement complex programs that need to hold several data variables. However, this is not really a limitation since we have access to the stack memory which we can also use for temporary storage. The data in the registers can be saved somewhere in the stack, so that the same registers can be reused by other functions that will be called (callee). Later on, the data saved in the stack can then be retrieved again back to the registers after the function returns back to the calling function (caller). There is a convention that was set that dictates who (between the caller and the callee) is responsible in saving the registers into the stack. This is called the RISC-V calling convention.

The 32 registers in RISC-V are divided into two groups: caller saved registers and callee saved registers. The RISC-V reference card highlights them. In general, the temporary registers (those named t0 to t6) and function arguments and return values (those named a0 to a7) are caller saved registers. This means that it is the responsibility of the calling function to save these registers into the stack first before calling another function. The function that will be called has the freedom to modify any of these registers. When the callee returns, these registers might have already changed their values. Thus, it is the responsibility of the caller to retrieve the original register values from the stack. The saved registers (those named s0 to s11), on the other hand, are called callee saved registers. This means that it is the responsibility of the callee (the function that was called) to save these registers into the stack before changing their values if ever. Afterwards, the callee is free to use these registers. However, before the callee returns, if ever the saved registers were indeed modified, it would have to retrieve the original values from the stack. With this calling convention, we observe the following: 1) The caller will assume that the values in saved registers will not change after a function call (because the callee should be saving them first); 2) The callee will assume that the values in the temporary registers and function argument registers are free to be modified in any way (because the caller should be saving them first).

So now, let's put this calling convention into practice. Let's say we have the following assembly code structure for a function named func1. Let's say func1 is a function being called by another function (let's say the main function). From the code, you will also see that func1 will also call another function named func2. When func1 was called by the main function, func1 was the callee. However, when func1 calls func2, func1 becomes the caller. Therefore, func1 will have to assume the responsibilities of being a callee and a caller when following the calling convention.

func1: # modifies a0, t0, and s0. ra points to the `main` function.
    # Checkpoint 1: What do you need to do before you start modifying registers?
    
    # Some block of code using a0, t0, and s0
    # Checkpoint 2: What do you need to do before you call another function?
    
    # input argument at a0, return value at a0  
    jal ra, func2  # call func2
    # Checkpoint 3: What do you need to do after a function call?
    
    # Some block of code using a0, t0, and s0
    # Checkpoint 4: What do you need to do before this function returns?
    
    jr ra   # function return

Saving values into the stack is done using the sw instruction to the appropriate stack pointer (sp). Retrieving values from the stack is done using the lw instruction on the corresponding stack pointer. Whenever you want to save values into the stack, you need to adjust the stack pointer (by decreasing the stack pointer). Whenever you want to retrieve values from the stack, you need to adjust the stack pointer back in reverse (by increasing the stack pointer). Doing these procedures will ensure that the stack pointer will end up being the same at the start and end of the function call. That is how functions utilize the stack memory as temporary storage.

Let's try to fill in the code for every checkpoint in the code.

For Checkpoint 1, since func1 is the callee (called my main), it has to save the callee saved registers that it will be modifying. The code says that we will be modifying s0 which is a saved register. Moreover, since this func1 will also be calling another function func2, it will also need to save the return address (ra). While the ra register is labelled as a caller saved register, saving ra into the stack is typically done at this step. Since we will be saving 2 registers, we will have to adjust the sp register by 2 words (8 bytes). Accordingly, we have the following code:

    # Checkpoint 1: What do you need to do before you start modifying registers?
    addi sp sp -8   # Push the stack pointer down by 2 words (8 bytes)
    sw ra 0(sp)     # Save the return address register (ra)
    sw s0 4(sp)     # Save the saved register (s0)

For Checkpoint 2, we see that a0, t0, s0 have been modified before, and we are now about to call func2 (func1 now becomes the caller). We see that func2 takes in the input argument at a0 and sets the return value also at the same register. Moreover, we see that later on in the code (after Checkpoint 3), we will be using t0 and s0 registers again. Thus, in the perspective of func1, both of these registers should be unchanged before and after the function call to func2. We know that if func2 is following the calling convention, it should be saving the s0 register into the stack and retrieving it back to the original value before it returns. That is something that we want. However, t0 is not saved by func2 and will have the freedom to change it within the function. Thus, as the caller to func2, func1 is now responsible in saving t0 into the stack as well. Accordingly, we have the following code:

    # Checkpoint 2: What do you need to do before you call another function?
    addi sp sp -4   # Push the stack pointer down by 1 word (4 bytes)
    sw t0 0(sp)     # Save the temporary register (t0)

For Checkpoint 3, func2 has now returned with the return value stored at a0. func1 is now about to do some operations using this return value (a0) and the t0 and s0 registers. Again, if func2 followed the calling convention, the value of s0 register should've been unchanged in the perspective of func1. However, t0 might have been modified. Good thing is that we have already saved it in the stack before calling func2. We just have to retrieve it now. Accordingly, we have the following code:

    # Checkpoint 3: What do you need to do after a function call?
    lw t0 0(sp)     # Retrieve the saved temporary register from the stack
    addi sp sp 4    # Return the stack pointer up by 1 word (4 bytes)

For Checkpoint 4, we are now at the point where func1 has finished doing its operations and will be returning back to main. However, before it returns, it has to make sure it has accomplished its callee responsibilities as well. Earlier in Checkpoint 1, we saved the ra and s0 registers into the stack. It is now time to retrieve them. During the operation of func1, it has modified the s0 register and the ra register (due to the function call to func2). As the callee, it has the responsibility to bring them back to their original values such that in the perspective of the main function, these registers were unchanged. Accordingly, we have the following code:

    # Checkpoint 4: What do you need to do before this function returns?
    lw s0 4(sp)     # Retrieve the original saved register (s0)
    lw ra 0(sp)     # Retrieve the original return address (ra). This points back to the main function.
    addi sp sp 8    # Return the stack pointer up by 2 words (8 bytes)

And with this, we have now satisfied the RISC-V calling convention for func1. Here is the complete code:

func1: # modifies a0, t0, and s0. ra points to the `main` function.
    # Checkpoint 1: What do you need to do before you start modifying registers?
    addi sp sp -8   # Push the stack pointer down by 2 words (8 bytes)
    sw ra 0(sp)     # Save the return address register (ra)
    sw s0 4(sp)     # Save the saved register (s0)
    
    # Some block of code using a0, t0, and s0
    # Checkpoint 2: What do you need to do before you call another function?
    addi sp sp -4   # Push the stack pointer down by 1 word (4 bytes)
    sw t0 0(sp)     # Save the temporary register (t0)
    
    # input argument at a0, return value at a0  
    jal ra, func2  # call func2
    # Checkpoint 3: What do you need to do after a function call?
    lw t0 0(sp)     # Retrieve the saved temporary register from the stack
    addi sp sp 4    # Return the stack pointer up by 1 word (4 bytes)
    
    # Some block of code using a0, t0, and s0
    # Checkpoint 4: What do you need to do before this function returns?
    lw s0 4(sp)     # Retrieve the original saved register (s0)
    lw ra 0(sp)     # Retrieve the original return address (ra). This points back to the main function.
    addi sp sp 8    # Return the stack pointer up by 2 words (8 bytes)
    
    jr ra   # function return

Let's say register sp starts at 0x7FFFFFF0 at the start of func1 call.

At what memory address is s0 saved at?

The stack pointer was adjusted -8, but s0 is stored at 4 + sp. Thus, 0x7FFFFFF0 - 8 + 4 = 0x7FFFFFEC

At what memory address is t0 saved at?

The stack pointer was adjusted -8 earlier, then before saving t0, we adjust it again by -4. Thus, 0x7FFFFFF0 - 8 - 4 = 0x7FFFFFE4

At the end of func1 (at the line jr ra), where is the stack pointer pointing to?

It should be back at the original value since we have retrieved the saved data and returned the stack pointer afterwards. Thus, 0x7FFFFFF0.

If we didn't implement the code in Checkpoint 2 and 3, what can go wrong?

After Checkpoint 3, it was described that func1 will be using t0 again. Since t0 is a caller saved register, the callee (which is func2) has the freedom to change t0 and does not need to return the original value back. If t0 was indeed modified by func2, then the calculations done by func1 will be wrong since the function call changed one of its temporary variables.

If we didn't save ra at all, what can go wrong?

ra points to the address of the calling function. At the start of func1, it is pointing to somewhere in the main function. Whenever you run the jal instruction, it modifies ra such that it points to the next line after that instruction. This is done so that when jr ra is run, it will continue execution to where the ra is pointing at. func1 has a function call to func2. This changes ra such that it will point to the instruction after that (lw t0 0(sp)). This is now pointing somewhere in func1 itself. If ra was not saved at all, then when func1 executes the jr ra instruction at the end, it will jump back to that instruction (lw t0 0(sp)), which is not the intended execution flow.

Exercise 1: Translating from C to RISC-V (with calling convention)

In this exercise, you will be guided through how to translate the below C program into RISC-V. If you are looking for an additional challenge, you can translate the code first before looking at the solution.

int source[] = {3, 1, 4, 1, 5, 9, 0};
int dest[10];

int fun(int x) {
	return -x * (x + 1);
}

int main() {
    int k;
    int sum = 0;
    for (k = 0; source[k] != 0; k++) {
        dest[k] = fun(source[k]);
        sum += dest[k];
    }
    printf("sum: %d\n", sum);
}

Let's start with initializing the source and dest arrays. Just like we did in Lab 3, we need to declare our arrays in the .data section as seen below:

.data
source:
    .word   3
    .word   1
    .word   4
    .word   1
    .word   5
    .word   9
    .word   0
dest:
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0
    .word   0

Next, let's write fun.

int fun(int x) {
	return -x * (x + 1);
}

Calling convention states that

We can find x in register a0.
We must put our return value in register a0

The rest of the code is explained in the comments below.

.text
fun:
    addi t0, a0, 1 # t0 = x + 1
    sub t1, x0, a0 # t1 = -x
    mul a0, t0, t1 # a0 = (x + 1) * (-x)
    jr ra # return

Ask yourself: why did we not save t1 before using it?

t1 is a caller saved register. Calling convention does not guarantee that caller saved registers will remain unchanged after a function call. Therefore, fun can change t1 without saving its old value. If the function that called fun had a value stored in t1 that it wants to use after fun returns, it would need to save t1 before calling fun.

Let's move onto main. (We are going to ignore calling convention for a minute).

int main() {
    int k;
    int sum = 0;

The above code becomes the following:

main:
    addi t0, x0, 0 # t0 = k = 0
    addi s0, x0, 0 # s0 = sum = 0

We have to initialize k to 0 because there is no way to declare a variable in RISC-V and not set it.

Next, let's load in the address of the two arrays.

    la s1, source
    la s2, dest

Remember that la loads the address of a label. This is the only way that we can access the address of source and dest. s1 is now a pointer to the source array and s2 is now a pointer to the dest array.

Let's move on to the loop.

for (k = 0; source[k] != 0; k++) {
    dest[k] = fun(source[k]);
    sum += dest[k];
}

First, we'll construct the outer body of the loop.

loop:
#1  slli s3, t0, 2 
#2  add t1, s1, s3 
#3  lw t2, 0(t1) 
#4  beq t2, x0, exit 
    ...
#5  addi t0, t0, 1
#6  jal x0, loop
exit:

Lines 1-3 are needed to access source[k]. First we want to compute the byte offset of the element. We are dealing with int arrays, so the size of each element is 4 bytes. This means that we need to multiply t0 (k) by 4 to compute the byte offset. To multiply a value by 4, we can just shift it left by 2.
Next, we need to add the offset to the array pointer to compute the address of source[k].
Now that we have the address, we can load the value in from memory.
Then, we check to see if source[k] is 0. If it is, we jump to the exit.
At the end of the loop, we increment k by 1
Finally, we loop back the to beginning

Now, Let's fill in the rest of the loop (ignoring calling convention at first)

loop:
    slli s3, t0, 2
    add t1, s1, s3
    lw t2, 0(t1)
    beq t2, x0, exit
#1  add a0, x0, t2 # 1
    ...
#2  jal fun # 2
    ...
#3  add t3, s2, s3 # 4
#4  sw a0, 0(t3) # 5
#5  add s0, s0, a0 # 6
    addi t0, t0, 1
    jal x0, loop
exit:

Fun takes in the argument x. We must pass this argument through a0 so that fun will know where to find it.
Call fun. jal automatically saves the return address in ra.
Next, we want so store this value in dest. First we need to compute the address of where we want to store the value in dest. Remember that we can reuse the offset that we computed earlier (this can be found in s3). s2 is a pointer to the beginning of dest.
Store value at dest[k]. Remember that fun placed the return value in a0.
Increment sum by dest[k]

Now, let's add in the proper calling convention around jal fun. Before scrolling down, ask yourself what code we need to add to meet calling convention.

To meet calling convention (and therefore have our code behave as expected), we need to save any caller saved registers whose values we want to remain the same after calling fun. In this case, we can see that we use registers t0, t1, t2, and t3 in main.

Do we need to save and restore all of these registers?

No, we only need to save and restore t0. We use t1 and t2 before fun, but we don't reuse their after. We write to t3 after fun, so we don't care what its old value was.

t0 is the only caller saved register we have whose value must stay the same before and after fun.

Let's add the proper calling convention code around jal fun.

addi sp, sp, -4
sw t0, 0(sp)
jal fun
lw t0, 0(sp)
addi sp, sp, 4

Next, let's move on to exit (excluding calling convention for the moment).

exit:
    addi a0, x0, 1 # argument to ecall, 1 = execute print integer
    addi a1, s0, 0 # argument to ecall, the value to be printed
    ecall # print integer ecall
    addi a0, x0, 10 # argument to ecall, 10 = terminate program
    ecall # terminate program

The final sum is stored in s0. To return this value, we need to store it in a0.

Now we have completed the logic of our program. Next we need to finish up calling convention for main.

Think to yourself, which piece of the calling convention is missing?

We are overwriting callee saved registers without saving them! Remember that it is the callee's job to ensure that callee saved registers have the same value at the start and end of the function.

Which callee saved registers do we need to save?

We need to save any callee saved registers that we used. We don't know which of these registers are being used by the function that called us, so we have to save all of the ones that we overwrite. In this case, that would be registers s0-s3 and ra.

It might be tricky understanding why we need to save ra. Remember that another function called main. When that function called main, it stored a return address in ra so that main would know where to return to when it finished executing. When main calls fun, it needs to store a return address in ra so that fun knows where to return to when it finishes executing. Therefore, main must save ra before it overwrites it.

Below, you can find the prologue and epilogue for main:

main:
    # BEGIN PROLOGUE
    addi sp, sp, -20
    sw s0, 0(sp)
    sw s1, 4(sp)
    sw s2, 8(sp)
    sw s3, 12(sp)
    sw ra, 16(sp)
    # END PROLOGUE
    ...
    ...
exit:
    addi a0, x0, 1 # argument to ecall, 1 = execute print integer
    addi a1, s0, 0 # argument to ecall, the value to be printed
    ecall # print integer ecall
    # BEGIN EPILOGUE
    lw s0, 0(sp)
    lw s1, 4(sp)
    lw s2, 8(sp)
    lw s3, 12(sp)
    lw ra, 16(sp)
    addi sp, sp, 20
    # END EPILOGUE
    addi a0, x0, 10 # argument to ecall, 10 = terminate program
    ecall # terminate program

You can find the entire program in ex1.s.

Exercise 2: Calling Convention Checker

Calling convention errors can cause bugs in your code that are difficult to find. The calling convention checker is used to detect calling convention violations in your code. To enable the calling convention checker, go to the Venus tab at the top of the page. In the settings box, click on "Calling Convention" and click "Enable"

Run cc_tests.s in the simulator tab.

Alternatively, you can run Venus locally with the following command:

java -jar tools/venus.jar -cc lab04/cc_test.s

The -cc flag enables the calling convention checker, and detects some basic violations.

In the terminal, you should see something similar to the following.

[CC Violation]: (PC=0x0000004C) Setting of a saved register (s0) which has not been saved! cc_test.s:56 li s0, 1
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000054) Setting of a saved register (s0) which has not been saved! cc_test.s:59 mul s0, s0, a0
[CC Violation]: (PC=0x00000064) Save register s0 not correctly restored before return! Expected 0x00000000, Actual 0x00000080. cc_test.s:66 ret
[CC Violation]: (PC=0x00000070) Setting of a saved register (s0) which has not been saved! cc_test.s:80 mv s0, a0 # Copy start of array to saved register
[CC Violation]: (PC=0x00000074) Setting of a saved register (s1) which has not been saved! cc_test.s:81 mv s1, a1 # Copy length of array to saved register
[CC Violation]: (PC=0x000000A4) Setting of a saved register (s0) which has not been saved! cc_test.s:115 addi s0, t1, 1
Found 12 warnings!
--------------------
[ERROR] An error has occurred!

Error:
`SimulatorError: Attempting to access uninitialized memory between the stack and heap. Attempting to access '4' bytes at address '0x63080013'.

More information about these errors can be found in the Venus reference.

Note: Venus's calling convention checker will not report all calling convention bugs; it is intended to be used primarily as a basic check. Most importantly, it will only look for bugs in functions that are exported with the .globl directive - the meaning of .globl is explained in more detail in the Venus reference.

Action Items

Resolve all the calling convention errors in cc_test.s.

The fixes for all of these errors (both the ones reported by the CC checker and the ones it can't find) should be added near the lines marked by the FIXME comments in the starter code.
After you finish the exercise, be sure that you can answer the following questions.

Is next_test a function?

No, it's just a label that is used to skip over the call to failure. If a label is a function, we will jump and link to it because we always want our functions to return back to us. In this case, we just jumped to next_test.

What caused the errors in pow, and inc_arr that were reported by the Venus CC checker?

pow: Missing epilogue and prologue.
inc_arr: Failure to save s0, s1 in prologue/epilogue, and failure to save t0 before calling helper_fn.

In RISC-V, we call functions by jumping to them and storing the return address in the ra register. Does calling convention apply to the jumps to the pow_loop or pow_end labels?

No, since they're not functions, we don't need to return to the location the function was called.

Why do we need to store ra in the prologue for inc_arr, but not in any other function?

inc_arr itself calls another function - ra holds the address of the instruction to continue executing after returning, which is overwritten when we call another function since we need to be able to return to the body of inc_arr.

Why wasn't the calling convention error in helper_fn reported by the CC checker? (Hint: it's mentioned above in the exercise instructions.)

It's not declared .globl. The calling convention checker will not test functions that are not declared .globl. Note: The bug in helper_fn will initially be reported by the checker, but when the bugs in the function that calls it are fixed, this one is no longer reported.

Testing

After fixing the errors in cc_test.s, rerun the program to make sure the behavior of the functions hasn't changed and that you've remedied all calling convention violations.

Once you have fixed everything, running the above Venus command should output the following:

Tests passed.
Found 0 warnings!

Exercise 3: List Manipulation

In this exercise, you will be writing a program that will square each element withing in a
list linked list of int arrays. The corresponding C function will look like this

struct node {
    int *arr; // pointer to the beginning of the array
    int size; // size of the array of the current node
    struct node *next; // pointer to the next node
};

void update_list(struct node *curr_node) {
    if (!curr_node) return; 
    for (int i = 0; i < curr_node->size; i++) {
      curr_node->arr[i] = square(curr_node->arr[i]);
    }
    update_list(curr_node->next);
}

The venus calling convention checker is not able to catch all calling convention bugs. When your code goes through the autograder, it performs some additional calling convention tests. In order to perform these tests, we need to modify the update_list function slightly so that it looks like the following:

void update_list(struct node *curr_node, int (*square)(int)) {
    if (!curr_node) { return; }
    for (int i = 0; i < curr_node->size; i++) {
      curr_node->arr[i] = square(curr_node->arr[i]);
    }
    update_list(curr_node->next, square);
}

Here, we passed in a pointer to the square function. You don't need to worry about how function pointers work, but if you would like to understand them, you can read about it here). We have already implemented the parts of the program that deal with handling the function pointer. In essence, all you need to understand is that instead of using jal square to call the square function, we will use jalr s1 to call the function (where s1 holds the pointer to the function).

Adel has provided a detailed example on implementing a recursive function call in RISC-V. We highly recommend that you check it out below to help you understand how the stack is used during recursive calls.

Adel's Recursive Function Notes

The importance of saving temporary variables into the stack is more evident when implementing recursive functions. Here's a function that computes for the factorial using recursion.

int factorial(int n){
    if(n==1) return n;
    else return factorial(n-1)*n;
}

Using RISC-V instructions, we have the following:

# a0 = n: input of the function
factorial: 
    addi sp sp -8      # Push the stack pointer down by 2 words (8 bytes)
    sw ra 0(sp)        # Save the return address register (ra)
    sw s0 4(sp)        # Save the saved register (s0)
    # if (n==1), return n;
    addi t0 zero 1     # Set t0 to 1
    beq a0 t0 done     # Compare the input against 1
    # s0 = current 'n'
    mv s0 a0           # Use s0 to save a0 before modifying it
    # factorial(n-1)
    addi a0 a0 -1      # Set a0 = a0-1 in preparation for recursion
    jal ra factorial   # Recursion
    # return factorial(n-1)*n
    mul a0 a0 s0       # Perform the multiplication
done:
    lw ra 0(sp)        # Retrieve the original return address (ra)
    lw s0 4(sp)        # Retrieve the original saved register (s0)
    addi sp sp 8       # Return the stack pointer up by 2 words (8 bytes)
    jr ra              # Function return

You will note that this function implementation already saves the relevant variables into the stack for proper operation. Just like in the Calling Convention Review section above, ra and s0 are saved into the stack at the start of the function call and are retrieved back to the original values before the function returns. However, one interesting thing about recursive functions is that the function calls itself before it even returns. What is the implication of this? Let's break up the code into the relevant parts:

# a0 = n: input of the function
factorial: 
    addi sp sp -8      # Push the stack pointer down by 2 words (8 bytes)
    sw ra 0(sp)        # Save the return address register (ra)
    sw s0 4(sp)        # Save the saved register (s0)
    ...
    jal ra factorial   # Recursion

This first half of the code starts from the actual function label and ends with the call to the same function label. For every call to factorial, the stack pointer is pushed down by 8 bytes, and a new set of ra and s0 values are saved into the stack. This process continues until we reach the terminating condition, which is the check for n==1.

    ...
    # if (n==1), return n;
    addi t0 zero 1     # Set t0 to 1
    beq a0 t0 done     # Compare the input against 1
    ...

Once this condition is satisfied by the last factorial function call, it proceeds to the done label then returns the stack pointer back up by 8 bytes, as shown below:

    ...
done:
    lw ra 0(sp)        # Retrieve the original return address (ra)
    lw s0 4(sp)        # Retrieve the original saved register (s0)
    addi sp sp 8       # Return the stack pointer up by 2 words (8 bytes)
    jr ra              # Function return

It will then return (through jr ra) to a previous instance of a factorial function call. The return point is right after the jal ra factorial line:

    # return factorial(n-1)*n
    mul a0 a0 s0       # Perform the multiplication
done:
    ...

This return point performs the multiplication needed for the factorial, and then performs the same set of instructions in the done label. This will allow this instance of the factorial function to return back to its caller, which is either another instance of the factorial function (in which case it performs the multiplication again and return the cumulative product through 'a0') or the original caller of the factorial function (let's say the main function).

In summary, the stack continually grows downward for every factorial recursive call. The final call to factorial will happen when the input equals 1, which then returns a0 = 1. Once that last instance of the factorial returns, the multiplication process done by the previous instances of the factorial function can proceed and will, one after the other, return the cumulative updated values on the a0 register. Finally, when we go back to the very first factorial instance, the saved ra will be retrieved from the stack. This saved ra will point back to the main function, which then ends the recursion.

Let's say register sp starts at 0x7FFFFFF0 at the start of factorial call, and that a0 = n = 5.

What was the lowest address that the stack pointer reached during the entire function execution?

If we start at a0 = 5, this means there will be 5 calls to the factorial function. Each call will push the stack pointer by -8 every time. Thus, 0x7FFFFFF0 - 8(5) = 0x7FFFFFC8.

For every 8 bytes, we push the stack pointer, we store 2 words. s0 contains a copy of a0 within every factorial instance.

At what address is s0 = 3 stored at?

From the code, we see that s0 is stored at sp -8 +4. The very first factorial call stores s0 = 5 at address 0x7FFFFFF0 - 8 + 4 = 0x7FFFFFEC. sp now points at 0x7FFFFFE8. The next call to factorial stores s0 = 4 at address 0x7FFFFFE8 -8 +4 = 0x7FFFFFE4. sp now points to 0x7FFFFFE0. The next call to factorial stores s0 = 3 at address 0x7FFFFFE0 -8 +4 = 0x7FFFFFDC.

How many duplicate ra values are stored in the stack?

The very first factorial call will store ra, which points somewhere in the main function, to the stack. The successive calls to factorial due to recursion will always point at the instruction mul a0 a0 s0 since that is the instruction right after the jal ra factorial. That is the next instruction to execute once the factorial call returns within the recursion. All successive calls to factorial will always save the same ra, since they are the same function in the end. There are 5 calls to the factorial function, only the first call saved a different ra than the rest. Thus, we have 4 duplicates.

Action Item

For this exercise, we are requiring that you don't use any extra save registers in your implementation. While you normally can use the save registers to store values that you want to use after returning from a function (in this case, when we're calling square in update_list), we want you to use temporary registers instead and follow their caller/callee conventions. The provided update_list implementation only uses the s0 and s1 registers, so we'll require that you don't use s2-s11.

Note: The CC checker won't check if you are using registers besides s0 and s1, but the Gradescope autograder will.

Your task is to fill in all of the lines marked TODO.

Testing

Save your code in the list_manipulation.s file. Use the -cc flag to run a basic calling convention check on your code locally:

java -jar tools/venus.jar -cc lab04/list_manipulation.s

The CC checker should report 0 warnings.

For reference, running list_manipulation on the web interface should give the following output:

Lists before:
5 2 7 8 1
1 6 3 8 4
5 2 7 4 3
1 2 3 4 7
5 6 7 8 9

Lists after:
25 4 49 64 1 
1 36 9 64 16 
25 4 49 16 9 
1 4 9 16 49 
25 36 49 64 81

Survey

Please fill out this short survey about your experience with the lab. Your responses will be used to improve the lab in the future. The survey will be collecting your email to verify that you have submitted it, but your responses will be anonymized when the data is analyzed. Thank you!

Submission

Save, commit, and push your work, then submit to the Lab 4 assignment on Gradescope.