Lab 2

Deadline: End of lab session Tuesday, July 9th

Objectives

  • The student will be able to perform specific bit manipulations through compositions of bit operations.
  • The student will become comfortable with using pointers while implementing algorithms in C.
  • The student will identify potential issues with dynamic memory management.

Setup

Pull the lab files from the lab starter repository with

git pull staff master

Exercise 1: Pointers and Structures in C

Here’s one to help you in your interviews. In ll_cycle.c, complete the function ll_has_cycle() to implement the following algorithm for checking if a singly-linked list has a cycle.

  1. Start with two pointers at the head of the list. We’ll call the first one tortoise and the second one hare.
  2. Advance hare by two nodes. If this is not possible because of a null pointer, we have found the end of the list, and therefore the list is acyclic.
  3. Advance tortoise by one node. (A null pointer check is unnecessary. Why?)
  4. If tortoise and hare point to the same node, the list is cyclic. Otherwise, go back to step 2.
  5. After you have correctly implemented ll_has_cycle(), the program you get when you compile ll_cycle.c will tell you that ll_has_cycle() agrees with what the program expected it to output.

ACTION ITEM: Implement ll_has_cycle() and execute the following commands to make sure that the provided tests pass.

$ make ll_cycle
$ ./ll_cycle

Hint: There are two common ways that students usually write this function. They differ in how they choose to encode the stopping criteria. If you do it one way, you’ll have to account for a special case in the beginning. If you do it another way, you’ll have some extra NULL checks, which is OK. The previous 2 sentences are meant to urge you to not stress over cleanliness. If they don’t help you, just ignore them. The point of this exercise is to make sure you know how to use pointers.

Here’s a Wikipedia article on the algorithm and why it works. Don’t worry about it if you don’t completely understand it. We won’t test you on this.

As a closing note, the story of the tortoise and the hare is always relevant, especially in CS61C. Writing your C programs slowly and steadily, using debugging programs like CGDB, is what will win you the race.

Exercise 2: Memory Management

This exercise uses vector.h, vector-test.c, and vector.c, where we provide you with a framework for implementing a variable-length array. This exercise is designed to help familiarize you with C structs and memory management in C.

ACTION ITEM: Explain why bad_vector_new() and also_bad_vector_new() are bad and fill in the functions vector_new(), vector_get(), vector_delete(), and vector_set() in vector.c so that our test code vector-test.c runs without any memory management errors.

Comments in the code describe how the functions should work. Look at the functions we’ve filled in to see how the data structures should be used. For consistency, it is assumed that all entries in the vector are 0 unless set by the user. Keep this in mind as malloc() does not zero out the memory it allocates.

For explaining why the two bad functions are incorrect, keep in mind that one of these functions will actually run correctly (assuming correctly modified vector_new, vector_set, etc.) but there may be other problems. Hint: think about memory usage.

ACTION ITEM: Test your implementation of vector_new(), vector_get(), vector_delete(), and vector_set() for both correctness and memory management (details below).

# 1) to check correctness:
$ make vector-test
$ ./vector-test

# 2) to check memory management using Valgrind:
$ make vector-memcheck

All the vector-memcheck rule does is run the following valgrind command on our executable. For a review, read through the valgrind exercise from Lab 1. Explain to yourself what each of the flags mean.

$ valgrind --tool=memcheck --leak-check=full --track-origins=yes [OS SPECIFIC ARGS] ./<executable>

The last line in the valgrind output is the line that will indicate at a glance if things have gone wrong. Here’s a sample output from a buggy program:

==47132== ERROR SUMMARY: 1200039 errors from 24 contexts (suppressed: 18 from 18)

If your program has errors, you can scroll up in the command line output to view details for each one. For our purposes, you can safely ignore all output that refers to suppressed errors. In a leak-free program, your output will look like this:

==44144== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 18 from 18)

Again, any number of suppressed errors is fine; they do not affect us.

Feel free to also use CGDB or add printf statements to vector.c and vector-test.c to debug your code.

Exercise 3: Intro to Assembly with RISC-V Simulator

So far, we have been dealing with C program files (.c file extension), and have been using the gcc compiler to execute these higher-level language programs. Now, we are learning about the RISC-V assembly language, which is a lower-level language much closer to machine code. For context, gcc takes the C code we write, first compiles this down to assembly code (gcc uses a more complex assembly language than RISC-V), and then assembles this down to machine code/binary.

In this lab, we will deal with several RISC-V assembly program files, each of which have a .s file extension. To run these, we will need to use a RISC-V simulator. The simulator we will use was developed by Keyhan Vakil (now a CS161 TA) and improved by Stephan Kaminsky (a former CS61C TAs). The simulator is called Venus and can be found online here.

Assembly/Venus Basics

  • Enter your RISC-V code in the “Editor” tab
  • Programs start at the first line regardless of the label. That means that the main function must be put first.
  • Programs end with an ecall with argument value 10. This signals for the program to exit. The ecall instructions are analogous to System Calls and allow us to do things such as print to the console or request chunks of memory from the heap.
  • Labels end with a colon (:).
  • Comments start with a pound sign (#).
  • You CANNOT put more than one instruction per line.
  • When you are done editing, click the “Simulator” tab to prepare for execution.

For the following exercises, please save your completed code in a file on your local machine. This is crucial for the checkoff portion to work.

Familiarizing yourself with Venus

Getting started:

  1. Paste the contents of lab2_ex3.s into the editor.
  2. Click the “Simulator” tab. This will prepare the code you wrote for execution.
  3. In the simulator, click “Assemble & Simulate from Editor”
  4. In the simulator, to execute the next instruction, click the “step” button.
  5. To undo an instruction, click the “prev” button.
  6. To run the program to completion, click the “run” button.
  7. To reset the program from the start, click the “reset” button.
  8. The contents of all 32 registers are on the right-hand side, and the console output is at the bottom
  9. To view the contents of memory, click the “Memory” tab on the right. You can navigate to different portions of your memory using the dropdown menu at the bottom.

Action Item

Record your answers to the following questions in a text file. Some of the questions will require you to run the RISC-V code using Venus’ simulator tab.

  1. What do the .data, .word, .text directives mean (i.e. what do you use them for)? Hint: think about the 4 sections of memory.
  2. Run the program to completion. What number did the program output? What does this number represent?
  3. At what address is n stored in memory? Hint: Look at the contents of the registers.
  4. Without using the “Edit” tab, have the program calculate the 13th fib number (0-indexed) by manually modifying the value of a register. You may find it helpful to first step through the code. If you prefer to look at decimal values, change the “Display Settings” option at the bottom.

Checkpoint

At this point, make sure that you are comfortable with the following. Note that these will not be part of the lab checkoff, but are meant to benchmark how comfortable you are with the material in the exercise.

  • You should know how to debug in Venus, including stepping through code and inspecting the contents of registers.
  • You should understand how RISC-V interfaces with memory. Note that this is different from how you might think about C since C doesn’t have registers.

Exercise 4: Translating from C to RISC-V

Open the files lab2_ex4_c.c and lab2_ex4_assembly.s The assembly code provided (.s file) is a translation of the given C program into RISC-V. Paste the contents of lab2_ex4_assembly.s into the editor.

Action Item

Find/explain the following components of the assembly file and put your answers in a text file.

  • The register representing the variable k.
  • The registers acting as pointers to the source and dest arrays.
  • The assembly code for the loop found in the C code.
  • How the pointers are manipulated in the assembly code.

After you’ve answered explained the above components, edit lab2_ex4_assembly.s so that it dest satisfies the following conditions.

  • dest[i] = 2 * source[i] for even i
  • dest[i] = 1 for odd i

Hint: This can be done by adding one line of code and modifying another (in other words, you only need to make 2 changes). Look at the initial values of dest; how does this help you implement this modification?

Verify that your changes work for the given source and dest arrays by running your code in a new Venus tab and check that the output looks like:

3 1 4 1 5 9
6 1 8 1 10 1

Checkpoint

Make sure you are comfortable with the following.

  • Broadly speaking, you should know how loops are constructed in RISC-V.
  • You should know how to create and access/modify global arrays in RISC-V.

Checkoff

Exercise 1:

  • Show your TA/AI the output of running ll_cycle.

Exercise 2:

  • Explain to your TA/AI why bad_vector_new() and also_bad_vector_new() are bad. Also, show your TA/AI the output of running the program as well as the output of make vector-memcheck.

Exercise 3:

  • Show action item answers to your TA/AI.

Exercise 4:

  • Use make lab2_ex4_assembly to run the assembly code through Venus locally and show the output to your TA/AI.

Bonus Exercise: Some More Practical Valgrind

This exercise isn’t for a checkoff. Instead, we included it because we think it will be helpful for students working on Project 1. If you have time now, try doing this exercise.

When you used Valgrind before, it may not have looked very useful or it may have been somewhat hard to understand what was going on. For this exercise, we’re going to look at some more practical and interpretable examples of code where we can use Valgrind. We have provided 4 programs for you to run Valgrind on: program_a.c, program_b.c, program_c.c, and program_d.c.

You can run a memory check on a program by running the following command, shown for program_a.c:

$ make a-memcheck
gcc -ggdb -Wall -std=c99 -o program_a program_a.c; \
valgrind --tool=memcheck --leak-check=full --track-origins=yes ./program_a; \
rm program_a
==29708== Memcheck, a memory error detector
==29708== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==29708== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==29708== Command: ./program_a
==29708==
*pi is 3
==29708==
==29708== HEAP SUMMARY:
==29708==     in use at exit: 4 bytes in 1 blocks
==29708==   total heap usage: 3 allocs, 2 frees, 1,032 bytes allocated
==29708==
==29708== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==29708==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29708==    by 0x400611: main (program_a.c:14)
==29708==
==29708== LEAK SUMMARY:
==29708==    definitely lost: 4 bytes in 1 blocks
==29708==    indirectly lost: 0 bytes in 0 blocks
==29708==      possibly lost: 0 bytes in 0 blocks
==29708==    still reachable: 0 bytes in 0 blocks
==29708==         suppressed: 0 bytes in 0 blocks
==29708==
==29708== For counts of detected and suppressed errors, rerun with: -v
==29708== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)</pre>

Alternatively, you can compile each C file yourself, and then run the Valgrind command with whatever specific flags you would like (for example the flag -v):

// compile program_a
$ gcc -o program_a program_a.c

// now run a memory check using Valgrind
$ valgrind --tool=memcheck --leak-check=full --track-origins=yes [OS SPECIFIC ARGS] ./program_a

Read each C file as well as run make X-memcheck (replace X with a, b, c, or d) on every program and try to determine what is happening that is causing memory problems in each program (e.g. memory out of bounds, segmentation fault, etc.).

Bonus Exercise Solutions

program_a: memory leak (modified and lost pointer to malloc'd memory)

program_b: memory out of bounds (outstep bound of array)

program_c: illegal use of memory (trying to use freed memory)

program_d: segmentation fault (accessing invalid memory)