Lab 2: C Memory Management, Valgrind
Deadline: Friday, September 9, 11:59:59 PM PT
Setup
In your labs
directory, pull the files for this lab with:
If you get an error like the following:
fatal: 'starter' does not appear to be a git repository
fatal: Could not read from remote repository.
make sure to set the starter remote as follows:
and run the original command again.
Exercise 1: Bit Operations
For this exercise, you will complete bit_ops.c
by implementing the bit manipulation functions get_bit
, set_bit
, and flip_bit
(shown below). You may ONLY use bitwise operations such as and (&
), or (|
), xor (^
), not (~
), left shifts (<<
), and right shifts (>>
). You may not use any for/while loops or conditional statements. You also may not use modulo (%
), division, addition, subtraction, or multiplication for this question.
/* Returns the Nth bit of X. Assumes 0 <= N <= 31. */
unsigned
/* Set the nth bit of the value of x to v. Assumes 0 <= N <= 31, and V is 0 or 1 */
void
/* Flips the Nth bit in X. Assumes 0 <= N <= 31.*/
void
Once you complete these functions, you can compile and run your code using the following commands:
This will print out the results of the tests.
Exercise 2: Using GDB to find segfaults
Recall that a segmenation fault (or segfault) occurs when our program tries to access an area of memory that it does not have permissions to access. This error can be a result of trying to access memory that does not belong to the program or by trying to dereference a NULL pointer. Let's take a look at how we can use GDB to find the location of a segfault. We've provided you with a buggy implementation of ll_has_cycle
from the last lab.
-
Compile the code by executing
-
Execute the program. We can see that it segfaults, but we are not given any information about where the segfault occurred.
-
We can use GDB to find out more information. Run GDB on the executable using the following command:
-
Run the program until it segfaults using the
run
command. -
We can see that our program segfaulted on line 9 of ll_cycle.c. This corresponds to the line
hare = hare->next->next;
. -
We can examine the current state of the program to understand why the it segfaulted. Since we are trying to access
hare
on this line, let's print out the structRemember, this is just going to print out the pointer, so this information is not useful to us. If we want to examine the contents of the struct, we need to dereference it.
We can see that the
next
pointer is NULL. This shows us that when the program executeshare = hare->next->next;
, it is trying to access thenext
field of this NULL pointer which is what is causing the segmentation fault. -
Fix the error in the program and rerun it to make sure that it's not failing anymore.
Exercise 3: Valgrind
Even with a debugger, we might not be able to catch all bugs. Some bugs are what we refer to as "bohrbugs", meaning they manifest reliably under a well-defined, but possibly unknown, set of conditions. Other bugs are what we call "heisenbugs", and instead of being determinant, they're known to disappear or alter their behavior when one attempts to study them. We can detect the first kind with debuggers, but the second kind may slip under our radar because they're (at least in C) often due to mis-managed memory. Remember that unlike other programming languages, C requires you (the programmer) to manually manage your memory.
We can use a tool called Valgrind to help catch to help catch "heisenbugs" and "bohrbugs". Valgrind is a program which emulates your CPU and tracks your memory accesses. This slows down the process you're running (which is why we don't, for example, always run all executables inside Valgrind) but also can expose bugs that may only display visible incorrect behavior under a unique set of circumstances.
Using Valgrind to find segfaults
Let's see how we can use valgrind to find the same bug that we just found with gdb. Add back the bug to ll_has_cycle
so that we can see how to detect it with valgrind.
- Run valgrind on the executable using the following command:
Let's break down the output. Below we can see the same print statements that are generated when we run our code normally.
==4662== Memcheck, a memory error detector
==4662== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4662== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4662== Command: ./ll_cycle
==4662==
Running tests...
Checking first list for cycles. There should be none, ll_has_cycle says it has no cycle
Checking second list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
Checking third list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
Checking fourth list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
In the next chunk, we can see that we are performing an invalid read of size 8. This means that we are trying to read 8 bytes from a memory location that we do not have access to. Valgrind provides us with the call stack (the list of function calls) that led to the segfault. We can see that line 50 in main (located in test_ll_cycle.c
) makes a call to ll_has_cycle
(located in ll_cycle.c
). We can then see that line 9 in ll_cycle.c
is where the segfault occurs.
==4662== Invalid read of size 8
==4662== at 0x108C4B: ll_has_cycle (ll_cycle.c:9)
==4662== by 0x108B36: main (test_ll_cycle.c:50)
==4662== Address 0x8 is not stack'd, malloc'd or (recently) free'd
The next chunk tells us that we encountered a segfault and repeats the information above.
==4662== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==4662== Access not within mapped region at address 0x8
==4662== at 0x108C4B: ll_has_cycle (ll_cycle.c:9)
==4662== by 0x108B36: main (test_ll_cycle.c:50)
We can ignore this next part because we don't believe that we encountered a stack overflow (we have not used enough memory for that).
==4662== If you believe this happened as a result of a stack
==4662== overflow in your program's main thread (unlikely but
==4662== possible), you can try to increase the size of the
==4662== main thread stack using the --main-stacksize= flag.
==4662== The main thread stack size used in this run was 8388608.
The next chunk tells us if there was any heap memory allocated when the program terminated. "All heap blocks were freed -- no leaks are possible" tells us that we have no leaks.
==4662== HEAP SUMMARY:
==4662== in use at exit: 0 bytes in 0 blocks
==4662== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==4662==
==4662== All heap blocks were freed -- no leaks are possible
The next line tells us how to display surpressed errors (we don't care about that). And finally, we can see in the summary that we have 1 error.
==4662== For counts of detected and suppressed errors, rerun with: -v
==4662== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Update the code so that this error no longer occurs. Once you update it, your valgrind output should look like this. We can see in the error summary at the bottom that there are no errors.
==12790== Memcheck, a memory error detector
==12790== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12790== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12790== Command: ./ll_cycle
==12790==
Running tests...
Checking first list for cycles. There should be none, ll_has_cycle says it has no cycle
Checking second list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
Checking third list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
Checking fourth list for cycles. There should be a cycle, ll_has_cycle says it has a cycle
Checking fifth list for cycles. There should be none, ll_has_cycle says it has no cycle
Checking length-zero list for cycles. There should be none, ll_has_cycle says it has no cycle
Congrats, you passed all the test cases!
==12790==
==12790== HEAP SUMMARY:
==12790== in use at exit: 0 bytes in 0 blocks
==12790== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==12790==
==12790== All heap blocks were freed -- no leaks are possible
==12790==
==12790== For counts of detected and suppressed errors, rerun with: -v
==12790== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Invalid reads and memory leaks
Let's take a look at the Bork
translation program! Bork
is an ancient language that is very similar to English. To translate a word to Bork, you take the English word and add an 'f' after every vowel in the word.
Let's see if we can understand some Bork
. Compile and run bork
using the following commands.
An example output is provided below. Note that your output will probably look different.
Input string: "hello"
Length of translated string: 21
Translate to Bork: "hefl2?^?Ul2?^?Uof?^?U"
Hmm, Bork
is an old language, but there shouldn't be all of these strange characters. It seems that perhaps the ancients left some bugs in their program! Shall we embark on a journey to squash bugs and uncover the true beauty of Bork
?
If we take a brief glance at main
, we can see that we are taking an input string (src_str
) and translating it to Bork
(dest_str
). If we scroll to the top, we can see that we have a function (alloc_str
) to allocate space for a string in the heap, a Str
struct which contains a string and it's length, a make_str
function which will create a Str
struct and initialize its data
and len
field, and a function to free our struct's data. There is also a function to concate two strings together and another function to translate a letter to Bork
. Now this is quite a long program to debug.
Wouldn't it be nice if there were a tool that gave us a good first place to look?
Well as it turns out, there are a couple and valgrind
is one of them!
Let's run valgrind on our program using the following command.
==10170==
==10170== )
==10170== )
==10170== )
==10170== )
==10170== )
==10170==
==10170==
==10170== )
==10170== )
==10170== )
==10170== )
==10170==
==10170== )
==10170== )
==10170== )
==10170== )
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170==
==10170== )
(Interesting side note: when we look at the normal program output in this valgrind log, we see normal behavior (i.e. it prints "hefllof"). That's because the way valgrind runs our program is different than how our program runs "naturally" (aka "bare metal"). We're not going to get into that for now.)
But back on debugging: A good general rule of thumb to follow when parsing big error logs is to only consider the first error message (and ignore the rest), so let's do that:
==10170== Invalid read of size 1
==10170== at 0x4C34D04: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10170== by 0x10879F: make_Str (bork.c:22)
==10170== by 0x108978: translate_to_bork (bork.c:56)
==10170== by 0x1089F2: main (bork.c:68)
The error message states that we are doing an invalid read of size 1. What does this mean? An invalid read means that your program is reading memory at a place that it shouldn't be (this can cause a segfault, but not always). Size 1 means that we were attempting to read 1 byte.
Because we're unfamiliar with this ancient codebase and we don't want to read all of it to find the bug, a good process to follow is to start at high-level details and work our way down (so basically work our way through the call stack that valgrind provides).
Let's look at bork.c line 68 in main
(the botton of the stack):
Str bork_substr = ;
Is something funky going on here? Looks like we are just passing a character to translate_to_bork
. Seems ok so far.
Let's go farther down the call stack and look at bork.c line 56 in translate_to_bork
:
return ;
We're just calling make_Str
here. We should go deeper. Let's look at bork.c line 22.
return ;
Here we are making a new Str
struct and setting its data
and len
parameters. That seems normal too!
But valgrind
says that strlen
is doing an invalid read?
Well, we're passing a string to it right? What does strlen
do again? It determines the length of a string by iterating over each character until it gets to a null terminator. Maybe there is no null terminator so strlen
keeps going past the end of the string (which would mean that it's going past the area that we allocated for the string).
Let's make sure our string has a null terminator by checking where we created it.
Earlier, we saw this on line 56 in translate_to_bork
.
return ;
If we look two lines up (line 54), we can see that we are allocating space for the string by calling alloc_str
. Let's take a look at this function.
char *
Hmmm. It looks like alloc_str
is giving us some memory that's only
len
big, which means when we write to the string in translate_to_bork
, we don't
have enough space for a null terminator!
Let's make the following change to fix the problem:
< return malloc(len*sizeof(char));
> char *data = malloc((len+1)*sizeof(char));
> data[len] = '\0';
> return data;
Let's run our program to see if we fixed the problem
Input string: "hello"
Length of translated string: 7
Translate to Bork: "hefllof"
Everything looks like it's working properly. However, there could be hidden errors that we cannot see, so let's run our code through valgrind to make sure that there are no underlying issues.
==29797== Memcheck, a memory error detector
==29797== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==29797== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==29797== Command: ./bork hello
==29797==
Input string: "hello"
Length of translated string: 7
Translate to Bork: "hefllof"
==29797==
==29797== HEAP SUMMARY:
==29797== in use at exit: 8 bytes in 1 blocks
==29797== total heap usage: 11 allocs, 10 frees, 1,061 bytes allocated
==29797==
==29797== LEAK SUMMARY:
==29797== definitely lost: 8 bytes in 1 blocks
==29797== indirectly lost: 0 bytes in 0 blocks
==29797== possibly lost: 0 bytes in 0 blocks
==29797== still reachable: 0 bytes in 0 blocks
==29797== suppressed: 0 bytes in 0 blocks
==29797== Rerun with --leak-check=full to see details of leaked memory
==29797==
==29797== For counts of detected and suppressed errors, rerun with: -v
==29797== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Let's take a look at the heap summary below. It tells us that we had 8 bytes in 1 block allocated at the time of exit. This means that the memory in the heap that was not free'd stems from one allocation call and that it is 8 bytes large.
Next, we can see the heap summary which shows that we made 11 allocation calls and 10 frees over the lifetime of the program.
==29797== HEAP SUMMARY:
==29797== in use at exit: 8 bytes in 1 blocks
==29797== total heap usage: 11 allocs, 10 frees, 1,061 bytes allocated
Now let's take a look at the leak summary below. This just states that we lost 8 bytes in 1 block.
==29797== LEAK SUMMARY:
==29797== definitely lost: 8 bytes in 1 blocks
==29797== indirectly lost: 0 bytes in 0 blocks
==29797== possibly lost: 0 bytes in 0 blocks
==29797== still reachable: 0 bytes in 0 blocks
==29797== suppressed: 0 bytes in 0 blocks
==29797== Rerun with --leak-check=full to see details of leaked memory
It tells us to "Rerun with --leak-check=full to see details of leaked memory", so let's do that.
==32334== Memcheck, a memory error detector
==32334== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==32334== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==32334== Command: ./bork hello
==32334==
Input string: "hello"
Length of translated string: 7
Translate to Bork: "hefllof"
==32334==
==32334== HEAP SUMMARY:
==32334== in use at exit: 8 bytes in 1 blocks
==32334== total heap usage: 11 allocs, 10 frees, 1,061 bytes allocated
==32334==
==32334== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1
==32334== at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32334== by 0x108784: alloc_str (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
==32334== by 0x10884E: concat (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
==32334== by 0x108A30: main (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
==32334==
==32334== LEAK SUMMARY:
==32334== definitely lost: 8 bytes in 1 blocks
==32334== indirectly lost: 0 bytes in 0 blocks
==32334== possibly lost: 0 bytes in 0 blocks
==32334== still reachable: 0 bytes in 0 blocks
==32334== suppressed: 0 bytes in 0 blocks
==32334==
==32334== For counts of detected and suppressed errors, rerun with: -v
==32334== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Now Valgrind is telling us the location where the unfree'd block was initially allocated. Let's take a look at this below. If we follow the call stack, we can see that malloc
was called by alloc_str
which was called by concat
in main
.
==32334== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1
==32334== at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32334== by 0x108784: alloc_str (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
==32334== by 0x10884E: concat (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
==32334== by 0x108A30: main (in /home/cc/cs61c/fa22/staff/cs61c-tac/bork)
If we look in main
, we can see that we allocate the space for dest_str
by calling concat
, but we never free it. We need dest_str
until the end of the program, so let's free it right before we return from main
. This struct was allocated on the stack in main (Str dest_str={};
), so we do not need to free the struct itself. However, the data that the struct points to was allocated in the heap. Therefore, we only need to free this portion of the struct. If you take a look near the top of the program, we have already provided a function free_Str
to free the allocated portion of the struct. Let's call this function at the end of our program.
> free_Str(dest_str);
You might be wondering why we are not freeing src_str
. If we take a look at where we constructed src_str
(Str src_str = make_Str(argv[1]);
), we can see that it was created using make_str
which does not make any calls to allocate space on the heap. The string that we are using to make src_str
comes from argv[1]
. The program that calls main is responsible for setting up argv[1]
, so we don't have to worry about it.
Once we fix our error, the valgrind output should look like this. The heap summary shows that there are no blocks allocated at the time we exit. The error summary at the bottom shows us that there are no errors to report.
==10835== Memcheck, a memory error detector
==10835== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10835== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==10835== Command: ./bork hello
==10835==
Input string: "hello"
Length of translated string: 7
Translate to Bork: "hefllof"
==10835==
==10835== HEAP SUMMARY:
==10835== in use at exit: 0 bytes in 0 blocks
==10835== total heap usage: 11 allocs, 11 frees, 1,061 bytes allocated
==10835==
==10835== All heap blocks were freed -- no leaks are possible
==10835==
==10835== For counts of detected and suppressed errors, rerun with: -v
==10835== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Exercise 4: Memory Management
This exercise uses vector.h
, test_vector.c
, and vector.c
, where we provide you with a framework for implementing a variable-length array. This exercise is designed to help familiarize you with C structs and memory management in C.
-
Try to explain why
bad_vector_new()
is bad. We have provided the reason here, so you can verify your understandingbad_vector_new()
The vector is created on the stack, instead of the heap. All memory stored on the stack gets freed as soon as that function finishes running, so when the function returns, we lose the vector we constructed. -
Fill in the functions
vector_new()
,vector_get()
,vector_delete()
, andvector_set()
invector.c
so that our test codetest_vector.c
runs without any memory management errors.Comments in the code describe how the functions should work. Look at the functions we've filled in to see how the data structures should be used. For consistency, it is assumed that all entries in the vector are 0 unless set by the user. Keep this in mind as
malloc()
does not zero out the memory it allocates.
Test your implementation of vector_new()
, vector_get()
, vector_delete()
, and vector_set()
for both correctness and memory management (details below).
# 1) to check correctness
# 2) to check memory management using Valgrind:
Any number of suppressed errors is fine; they do not affect us.
Feel free to also use CGDB to debug your code.
Exercise 5
Please fill out this short survey about your experience with the lab. Your responses will be used to improve the lab in the future. The survey will be collecting your email to verify that you have submitted it, but your responses will be anonymized when the data is analyzed. Thank you!
Submission
Save, commit, and push your work, then submit to the Lab 2 assignment on Gradescope.