Debugging Valgrind Errors

If you are using C++ in this course, Marmoset will run your submissions with Valgrind. Valgrind is a program that detects memory-related errors.

A common misconception is that Valgrind only detects memory leaks, and that if you don't use new in your program you shouldn't get Valgrind errors. Valgrind actually detects a number of memory errors other than leaks, such as uses or accesses of uninitialized memory. Additionally, there are ways you can leak memory even if you don't use new, such as if your program terminates improperly and is unable to clean up stack-allocated objects. Improper termination can be caused by uncaught exceptions or by using the exit function (which should not be used in C++).

Valgrind error messages can be quite long and intimidating. This guide is intended to give you an idea of how to handle these errors.

General tips

Solve the first error

When confronted with a massive Valgrind report consisting of many errors, a good idea is to start by just solving the first error. Often memory errors compound, and one error will cause many other errors throughout the program. Solving the first error that Valgrind shows you will sometimes fix many other errors, possibly even all the errors. Pretend the error message consists only of the first error, and ignore everything else.

Look for function names and line numbers

If you compile your program with the -g flag, Valgrind will show you the function names and line numbers where errors occur. Sometimes the actual bug occurs on a different line (particularly for uninitialized value errors) but the line number Valgrind tells you is a good starting point.

For example, in this message, there was a use of an uninitialized value in the main function on line 6 of the program. The bug is probably not on line 6 itself but rather earlier in the program where the value was left uninitialized. However, looking at line 6 can give you an idea of which value might have been left uninitialized.

==98641== Conditional jump or move depends on uninitialised value(s)
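==98641==    at 0x1091F3: std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::resize(unsigned long) (stl_vector.h:691)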
==98641==    by 0x109016: main (program.cc:6)

Look for the last point in the stack trace where your program appears

Consider the following error:

==51205== Invalid read of size 8
==51205==    at 0x4F7B905: assign (basic_string.h:1439)
==51205==    by 0x4F7B905: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(char const*) (basic_string.h:705)
==51205==    by 0x108A3D: h() (program.cc:6)
==51205==    by 0x108A8A: g() (program.cc:9)
==51205==    by 0x108A96: f() (program.cc:11)
==51205==    by 0x108AA2: main (program.cc:14)
==51205==  Address 0x8 is not stack'd, malloc'd or (recently) free'd

Did the error happen on line 6, line 9, line 11, or line 14 of program.cc? Or did it happen somewhere in the C++ standard library?

Unless there's a bug in the C++ standard library, the error probably happened in your program. Furthermore, the error probably happened at the last point in the stack trace where your program appears.

In this case, we see that main called function f, then function f called function g, then function g called function h, and then function h did something with an assignment operator, which led to an invalid read error. The problem is likely in function h, although you might still want to look elsewhere if you can't find any problems in h.

Common types of Valgrind errors

Invalid reads and invalid writes

Invalid read errors and invalid write errors occur when you try to read from or write to a part of memory that you shouldn't be accessing. A very common reason for this is if you try to access an element of a vector or other data structure that doesn't exist. For example, if you access an index that is past the end of a vector, you will likely get one of these errors.

Valgrind will tell you the line where the invalid read or write occurred, and usually there will be some code that accesses a vector or other data structure on that line. Think about whether this access is always valid. Could there be a case where you reach this line without adding the required elements to the data structure? The invalid access might only occur in certain cases, such as when the input contains a blank line.
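As a minimal sketch of the kind of check that prevents these errors, consider the blank-line case mentioned above. The helper first_word is hypothetical (it is not part of any library, and its behaviour is an assumption made for illustration). It splits a line into words and only indexes the vector after confirming that it has at least one element.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the first whitespace-separated word of a line,
// or an empty string if the line contains no words (for example, a blank line).
std::string first_word(const std::string& line) {
    std::istringstream in{line};
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    // A blank line leaves the vector empty. Without this check, words[0]
    // would access an element that does not exist, which is the kind of
    // access Valgrind typically reports as an invalid read.
    if (words.empty()) {
        return "";
    }
    return words[0];
}
```

Calling first_word on a blank line takes the early return instead of touching words[0]. Whether an empty string is the right fallback depends on your program; the point is only that the access is guarded.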

Uninitialized value errors

The error message "Conditional jump or move depends on uninitialised value(s)" (Valgrind uses the British spelling) essentially means Valgrind has determined that the result of your program depends on uninitialized memory. Sometimes you will also see the message "Use of uninitialised value of size N".

Valgrind will report the line at which the program depends on the uninitialized value. It will allow uninitialized values to be moved and copied around in memory without reporting an error, as long as the program doesn't depend on these values. This can make the error hard to find because the mistake could be far away from the line Valgrind reports. For example, consider the following program:

 1 #include <iostream>
 2 #include <vector>
 3 int main() {
 4   int i;
 5   i += 1;
 6   int j = i+2;
 7   std::vector<int> v {i,j};
 8   v.push_back(i+j);
 9   for(int i : v) {
10     std::cout << i << std::endl;
11   }
12 }

Valgrind will report errors on line 10. The actual problem is on line 4, where we forgot to assign a value to i. But Valgrind allows us to increment i, assign i+2 to a new variable j, create a vector containing i and j, and add i+j to the vector, all without complaints, because the visible behaviour of the program isn't actually affected until we try to output the values of the vector.

The line number that Valgrind tells you is still helpful, because you know that somewhere on that line you're using an uninitialized value. But you might have to do some detective work to figure out which value is uninitialized and why it was not initialized. It's not always as simple as forgetting to give a default value to a variable. Maybe your program reads data from standard input, and there is a bug in the input reading function that causes some of the data variables to be uninitialized in certain cases.
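To illustrate the input-reading case, here is a small sketch. The function read_count and its fallback parameter are hypothetical names chosen for this example. It assumes the bug being avoided is a pattern like "int count; in >> count;", where a failed read on empty input can leave count without a value.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, convenient for trying this out

// Hypothetical helper: reads one integer from the stream and returns it,
// or returns the fallback if the read fails (empty or non-numeric input).
int read_count(std::istream& in, int fallback) {
    // Initialize at the point of declaration so the variable always holds
    // a known value, even on paths where the read does not succeed.
    int count = fallback;
    if (!(in >> count)) {
        // The read failed, so do not trust whatever is in count.
        return fallback;
    }
    return count;
}
```

The two habits shown here are initializing the variable when it is declared and checking whether the read succeeded before using the result. Either one alone would likely prevent the uninitialized value in this case.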

Memory leaks

Sometimes Valgrind will report that your program leaked memory. There are a few reasons this can happen.

Forgetting to deallocate things you allocated

This is the most obvious and easily avoidable reason for memory leaks, but sometimes these mistakes happen. If you are using new, did you call delete on everything you allocated? Did you use the correct type of delete? (If you allocate an array you need to use delete [] instead of delete.)

You can avoid these problems by using smart pointers instead of new and delete, although smart pointers come with their own difficulties.
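The sketch below contrasts the two approaches for a dynamically allocated array. The function names sum_manual and sum_unique are hypothetical, and the second version assumes a compiler with C++14 support for std::make_unique.

```cpp
#include <cstddef>
#include <memory>
#include <numeric>

// Manual version: every new[] needs exactly one matching delete[].
int sum_manual(std::size_t n) {
    int* data = new int[n];
    std::iota(data, data + n, 1);  // fill with 1, 2, ..., n
    int total = std::accumulate(data, data + n, 0);
    delete[] data;  // array form; plain delete here would be a mismatch
    return total;
}

// Smart pointer version: std::unique_ptr<int[]> calls delete[] for us
// when data goes out of scope, including on early returns and exceptions.
int sum_unique(std::size_t n) {
    auto data = std::make_unique<int[]>(n);  // requires C++14
    std::iota(data.get(), data.get() + n, 1);
    return std::accumulate(data.get(), data.get() + n, 0);
}
```

In the manual version, any return or exception between new[] and delete[] would leak the array. In many cases a std::vector<int> would be simpler still than either version.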

Improper termination

If you aren't using new, you might be confused as to how you can possibly be leaking memory. STL classes like vector and map use new internally, but they are designed to clean up their allocated memory correctly when their destructors are called. If you terminate your program improperly though, their destructors might not be called, and then memory will be leaked.

A common reason for this is using the exit function. This function is seemingly useful for terminating the program at an arbitrary point, but it has a catch: it doesn't call the destructors of stack-allocated objects before exiting. For this reason, you should avoid this function in C++. Instead, throw an exception from the point where you want to exit, catch the exception in your main function, and then return from main normally. If your whole program is in main, you can also just use return statements instead of exceptions; returning from main will do the proper cleanup. However, using exceptions and a single return point in main is arguably cleaner than having multiple return points in main.
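One possible shape for the exception-based approach is sketched below. Everything here is hypothetical: the QuitRequest type, the "quit" command, the status value 2, and the function names are assumptions for illustration. The function run stands in for the body of main, so that a real program would just contain "int main() { return run(std::cin); }".

```cpp
#include <istream>
#include <sstream>  // std::istringstream, convenient for trying this out
#include <string>
#include <vector>

// Hypothetical exception type used only to request an early, clean shutdown.
struct QuitRequest {
    int status;
};

// Hypothetical command loop, possibly many calls deep in a real program.
// It throws where it might otherwise have been tempted to call exit(2).
void process(std::istream& in, std::vector<std::string>& seen) {
    std::string command;
    while (in >> command) {
        if (command == "quit") {
            throw QuitRequest{2};
        }
        seen.push_back(command);
    }
}

// Stand-in for main: catches the request and returns normally.
int run(std::istream& in) {
    std::vector<std::string> seen;  // owns heap memory internally
    try {
        process(in, seen);
    } catch (const QuitRequest& q) {
        // Returning here destroys seen as usual, so its memory is freed.
        return q.status;
    }
    return 0;
}
```

Because the exception is caught and the function returns normally, the vector's destructor runs on both paths, which is what calling exit from inside process would have skipped.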

Uncaught exceptions

Another type of improper termination is throwing an exception and not catching it. This is generally a lot more obvious when it happens, because the Valgrind error message will look something like this:

terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)
==119257==
==119257== Process terminating with default action of signal 6 (SIGABRT)
==119257==    at 0x5472E97: raise (raise.c:51)
==119257==    by 0x5474800: abort (abort.c:79)
==119257==    by 0x4ED587D: __gnu_cxx::__verbose_terminate_handler() [clone .cold] (vterminate.cc:95)
==119257==    by 0x4EE1485: __cxxabiv1::__terminate(void (*)()) (eh_terminate.cc:48)
==119257==    by 0x4EE14F0: std::terminate() (eh_terminate.cc:58)
==119257==    by 0x4EE1744: __cxa_throw (eh_throw.cc:95)
==119257==    by 0x4ED8036: std::__throw_out_of_range_fmt(char const*, ...) [clone .cold] (functexcept.cc:96)
==119257==    by 0x108BEB: std::vector >::_M_range_check(unsigned long) const (stl_vector.h:825)
==119257==    by 0x108AC8: std::vector >::at(unsigned long) (stl_vector.h:846)
==119257==    by 0x10899E: main (exception.cc:5)

Notice that "throw" appears several times in the error message, indicating the error is related to throwing an exception. The message "Process terminating with default action of signal 6 (SIGABRT)" is also a telltale sign of an uncaught exception, because uncaught exceptions will cause the program to "abort".

Usually when you get this error, it's not because of an exception you threw yourself - although it could be if you wrote your catch clause incorrectly. More likely, the exception you failed to catch comes from the C++ standard library.

In this case, you can tell by looking at the stack trace that the vector "at" function threw an "out_of_range" exception. The solution to this problem most likely isn't to catch this exception, but rather to make sure you avoid doing out-of-range accesses with "at". Depending on how your program is designed though, you might want to catch some C++ standard library exceptions.
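Both options can be sketched briefly. The helpers get_or and get_or_catching are hypothetical names for this example, and returning a fallback value is just one assumed way of handling a bad index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Usually preferable: check the index first, so at() cannot throw here.
int get_or(const std::vector<int>& v, std::size_t index, int fallback) {
    if (index >= v.size()) {
        return fallback;
    }
    return v.at(index);
}

// Alternative: let at() throw and catch the specific exception type.
int get_or_catching(const std::vector<int>& v, std::size_t index, int fallback) {
    try {
        return v.at(index);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

With an empty vector and index 0, which is the situation in the trace above, both functions return the fallback instead of letting the exception escape from main.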

Exceeding Marmoset limits

Marmoset places several limits on your program to make sure it doesn't destroy the testing servers. If your program exceeds one of Marmoset's limits, Marmoset will instantly terminate your program without letting it clean anything up, generally causing a memory leak. This can be a little confusing because your program doesn't necessarily have an actual memory leak; the actual problem is that you are exceeding one of Marmoset's limits.

Time limits

Marmoset places a limit on the amount of time your program takes. If you see the following message in your Valgrind output, you are probably exceeding the time limit:

Process terminating with default action of signal 24 (SIGXCPU)

SIGXCPU is a signal that indicates the process exceeded its CPU time limit. Look for efficiency issues in your program. Some common ones are:

Output limits

Marmoset places a limit on the amount of output your program produces. Usually you will not exceed this limit unless one of the following situations happens:

As long as you remember to disable or remove debug printing before submitting your program, you shouldn't have to deal with this issue.

If you do exceed the output limit, Marmoset should print an informative error message.

Memory limits

Marmoset places a limit on the amount of memory your program uses. The message Valgrind gives if you exceed this limit is very strange and long, but it should say something like this near the bottom:

==63134==     Valgrind's memory management: out of memory:
==63134==        newSuperblock's request for 4194304 bytes failed.
==63134==          264,896,512 bytes have already been mmap-ed ANONYMOUS.
==63134==     Valgrind cannot continue.  Sorry.

These errors seem to be fairly rare, but if you encounter one you will have to find a way to reduce the amount of memory your program uses.
