Lab 2: Advanced C and Valgrind

Deadline: Thursday, July 1, 11:59:59 PM PT

Setup

To get the starter files for this lab, run the following command in your labs directory.

git pull starter main

If you get an error like the following:

fatal: 'starter' does not appear to be a git repository
fatal: Could not read from remote repository.

make sure to set the starter remote as follows:

git remote add starter https://github.com/61c-teach/su21-lab-starter.git

and run the original command again.

Exercise 1: Make

As you saw in Lab 1, compiling C programs in the terminal is a tedious and time-consuming operation. While this is doable for simple C programs, for larger and more complex programs with dozens of files and dependencies, this gets rather unwieldy quickly. Additionally, if we are editing code in one file in a large code base, we would like to rebuild the minimal amount of files rather than the entire project whenever we want to run our code. To solve these issues, we can use a program called make.

What is make?

Here is an excerpt from the gnu manual for make to explain how this works:

The make utility automatically determines which pieces of a large program need to be recompiled, and issues commands to recompile them.

...

To prepare to use make, you must write a file called the makefile that describes the relationships among files in your program and provides commands for updating each file. In a program, typically, the executable file is updated from object files, which are in turn made by compiling source files.

Once a suitable makefile exists, each time you change some source files, this simple shell command:
make
suffices to perform all necessary recompilations. The make program uses the makefile data base and the last-modification times of the files to decide which of the files need to be updated. For each of those files, it issues the recipes recorded in the data base.

Structure of a Makefile

A simple makefile consists of “rules” with the following shape:
target … : prerequisites …
        recipe
        …
        …
A target is usually the name of a file that is generated by a program; examples of targets are executable or object files. A target can also be the name of an action to carry out, such as ‘clean’ (see Phony Targets).

A prerequisite is a file that is used as input to create the target. A target often depends on several files.

A recipe is an action that make carries out. A recipe may have more than one command, either on the same line or each on its own line. Please note: you need to put a tab character at the beginning of every recipe line! This is an obscurity that catches the unwary.

A rule, then, explains how and when to remake certain files which are the targets of the particular rule. make carries out the recipe on the prerequisites to create or update the target. A rule can also explain how and when to carry out an action.

Example

We have created a Makefile that compiles the code from Lab1/exercise3. Read over the code in the Makefile.

There are two variables defined at the top of the file. To access these variables, we use this syntax: $(var_name).
The first target all specifies the default goal meaning which targets make will consider when no target is specified on the command line. all currently specifies linked_list. If you want to add more targets to default goal, you should add them to this line.

No target specified:
```
$ make
```
Target specified:
```
$ make linked_list
```
We tried to make this file as simple as possible to make it easier for you to understand. Make offers automatic variables to shorten recipies and prerequisit lists. If you want to learn more about this, you can find more info here (specifically look at $@ and $^).

Run the following command to compile the code and generate an executable called linked_list.
```
$ make
```
When you run make, you can see that it echoes the list of commands that are executed. Your output should look something like this:
```
gcc -c linked_list.c
gcc -c test_linked_list.c
gcc -o linked_list linked_list.o test_linked_list.o
```
Run make again, it should output somthing like this:
```
make: Nothing to be done for 'all'
```
None of the files that linked_list depends on were updated since the last time you invoked make, so make did not do anything.
The Makefile also contains a target for deleting the generated executable and object files. To execute this rule, use the following command
```
$ make clean
```
Again, make echoed the commands that it ran. It should look something like this:
```
rm linked_list linked_list.o test_linked_list.o
```
Run make now, it will recompile everything.
Make an edit to linked_list.c and then run make. You should see that linked_list.c is recompiled, but test_linked_list.c is not recompiled.

If you would like to see another example, you can find one here

Action Item

Exercises 3 and 4 of this lab require running the following commands to compile their code:

Exercise 3

$ gcc -o bit_ops bit_ops.c test_bit_ops.c

Exercise 4

gcc -o vector vector.c test_vector.c

Update Makefile so that it compiles the code in Exercises 3 and 4.

Remember to update the all and clean targets

Exercise 2: Valgrind

Even with a debugger, we might not be able to catch all bugs. Some bugs are what we refer to as “bohrbugs”, meaning they manifest reliably under a well-defined, but possibly unknown, set of conditions. Other bugs are what we call “heisenbugs”, and instead of being determinant, they’re known to disappear or alter their behavior when one attempts to study them. We can detect the first kind with debuggers, but the second kind may slip under our radar because they’re (at least in C) often due to mis-managed memory. Remember that unlike other programming languages, C requires you (the programmer) to manually manage your memory.

We can use a tool called Valgrind to help catch to help catch “heisenbugs” and "bohrbugs". Valgrind is a program which emulates your CPU and tracks your memory accesses. This slows down the process you’re running (which is why we don’t, for example, always run all executables inside Valgrind) but also can expose bugs that may only display visible incorrect behavior under a unique set of circumstances.

Using Valgrind to find segfaults

In Lab 1, you leaned how to find segfaults using cgdb. You can also use Valgring to find segfaults.

Edit the Makefile to include the -g flag in CFLAGS to provide debugging information to Valgrind.
Compile linked_list.c and test_linked_list.c by executing make.
Run valgrid on the executable using the following command:
```
$ valgrind ./linked_list
```

Memcheck is the default tool when you invoke Valgrind. The documentation on Valgrind's memcheck is very useful, as it provides examples of the most common error messages, what they mean, and some optional arguments you can use to help debug them.

Your output should look something like this. There is a lot of information here, so let's parse through it together.

Box 1. This shows us the command that we are running through Valgrind.

Box 2. This is a print statement from our program.

Box 3. We are reading 8 bytes from an invalid memory address on linked_list.c line 62.

Box 4. Our program received a segfault by accessing invalid memory on linked_list.c line 62

Box 5. There were no memory leaks at the time that the program exited.

Box 6. We encountered 1 error.

Copy over your solution from Lab01/exercise3 to fix this error. Run your code through Valgrind again, and you should see that there are no errors reported by valgrind.

Using Valgrind to detect memory leaks

Let's cause a memory leak in test_linked_list.c. Comment out the two lines that call free_list.
Run make to compile your code.
Run valgrind
```
valgrind ./linked_list
```
We can see that our program is still producing the correct result based on the printed messages "Congrats..."; however, we are now experiencing memory leaks. Valgrind tells us to "Rerun with --leak-check=full to see details of leaked memory", so let's do that
```
valgrind --leak-check=full ./linked_list
```

Your output should look something like this. There is a lot of information here, so let's parse through it together.

Box 1. Summary of heap usage. There were 128 bytes allocated in 8 different blocks the heap at the time of exit.

Box 2 and 3. Stack traces showing were the unfreed blocks were allocated. Direct blocks are those which are root nodes (blocks of memory that the programmer has direct access to, ex stack/global pointer to the heap). Indirect blocks are those which are not root nodes (ex a pointer inside of a struct).

Box 4. Summary of leak. You can find more info about this section here

You can use the stack trace to see where the unfreed blocks were allocated. Hopefully this example will help you understand Valgrind messages when you are completing your projects!

Exercise 3: Bit Operations

For this exercise, you will complete bit_ops.c by implementing the bit manipulation functions get_bit, set_bit, and flip_bit (shown below). You may ONLY use bitwise operations such as and (&), or (|), xor (^), not (~), left shifts (<<), and right shifts (>>). You may not use any for/while loops or conditional statements. You also may not use modulo (%), division, addition, subtraction, or multiplication for this question.

/* Returns the Nth bit of X. Assumes 0 <= N <= 31. */
unsigned get_bit(unsigned x, unsigned n) {
    /* YOUR CODE HERE */
    return -1; /* UPDATE WITH THE CORRECT RETURN VALUE*/
}

/* Set the nth bit of the value of x to v. Assumes 0 <= N <= 31, and V is 0 or 1 */
void set_bit(unsigned *x, unsigned n, unsigned v) {
    /* YOUR CODE HERE */
}

/* Flips the Nth bit in X. Assumes 0 <= N <= 31.*/
void flip_bit(unsigned *x, unsigned n) {
    /* YOUR CODE HERE */
}

ACTION ITEM: Finish implementing get_bit, set_bit, and flip_bit.

Once you complete these functions, you can compile and run your code using the following commands:

$ make bit_ops
$ ./bit_ops

This will print out the result of a few limited tests.

Exercise 4: Memory Management

This exercise uses vector.h, test_vector.c, and vector.c, where we provide you with a framework for implementing a variable-length array. This exercise is designed to help familiarize you with C structs and memory management in C.

ACTION ITEM:

Try to explain why bad_vector_new() and also_bad_vector_new() are bad. Hint: One of these functions will actually run correctly (assuming correctly modified vector_new, vector_set, etc.) but there may be other problems. Due to the lack of checkoffs, we have provided the reasons here, so you can verify your understanding

bad_vector_new()
THE VECTOR IS CREATED ON THE STACK, INSTEAD OF THE HEAP. ALL MEMORY STORED ON THE STACK GETS FREED AS SOON AS THAT FUNCTION FINISHES RUNNING, SO WHEN THE FUNCTION RETURNS, WE LOSE THE VECTOR WE CONSTRUCTED.

also_bad_vector_new()
WHILE THIS DOES WORK, IT IS RATHER INEFFICIENT. THE VECTOR OBJECT IS A "BIG" THING, SO WHEN WE RETURN IT, WE NEED TO COPY A LOT OF DATA FROM THE RETURN VALUE TO THE CALLING FRAME. THIS ALSO TENDS TO CASCADE, AS YOU END UP RETURNING IT TO A STACK VARIABLE, SO IF YOU END UP RETURNING THAT, YOU EITHER GET THE ERROR IN THE PREVIOUS PART, OR NEED TO RETURN THE WHOLE VECTOR AGAIN. INSTEAD, IT IS GENERALLY BETTER PRACTICE TO STORE STRUCTS ON THE HEAP, AND PASS AROUND POINTERS, AS POINTERS ARE A FIXED, "SMALL" SIZE.
Fill in the functions vector_new(), vector_get(), vector_delete(), and vector_set() in vector.c so that our test code test_vector.c runs without any memory management errors.

Comments in the code describe how the functions should work. Look at the functions we've filled in to see how the data structures should be used. For consistency, it is assumed that all entries in the vector are 0 unless set by the user. Keep this in mind as malloc() does not zero out the memory it allocates.

ACTION ITEM: Test your implementation of vector_new(), vector_get(), vector_delete(), and vector_set() for both correctness and memory management (details below).

# 1) to check correctness
$ make vector
$ ./vector

# 2) to check memory management using Valgrind:
$ valgrind ./vector

Any number of suppressed errors is fine; they do not affect us.

Feel free to also use CGDB to debug your code.

Exercise 5

Please submit this short survey about your experience with the lab. Your responses will be used to improve the lab in the future. The survey will be collecting your email to verify that you have submitted it, but your responses will be anonymized when the data is analyzed. Thank you!

Submitting

Please submit to the Lab 2 assignment on Gradescope.