Friday, November 13, 2015

Preventing dynamic memory problems in C

When writing C programs that use structures kept in dynamic memory, there are a few techniques that are useful to prevent problems. The ones I mention here are not new, and well written software (e.g, much of Plan 9 from Bell Labs) use them a lot. This post is mostly for new programmers or not-so-old programmers in the hope that it might help them.

For example, consider a server that allocates/releases some structures to serve clients:
typedef struct MBuf MBuf;
struct MBuf {
uint64_t pc;
MBuf *next; // next in free or used list
MBuf *anext; // next in global allocation list
... the actual data goes here ...

It does not matter what this structure is used for. But the code will do many (de)allocations for these. Instead of using malloc/free directly, we keep all allocated structures in a single list linked through the anext field, and all unused structures in a free list, linked through the next field.

For allocation, if there are structures in the free list, we are done. Otherwise we allocate a new one and link it in the global list.

Now, in the allocation, we take the program counter of the caller and store it in the pc field. In deallocation, we set the pc to zero and set most of the structure (but for the pc and the link fields) to a silly value. That is, we poison the data when it is free.

Doing so, makes it quite easy to detect references to free structures. Also, it is easy to write a leak function that, at the end of the program or at any other point, scans the entire global allocation list and prints those program counters that are not zero, so we know the places in the program where we are allocating the structures without actually releasing them.

It's an easy thing to do, it makes the program faster because it avoids extra allocations, and it makes the program safer because it aids in testing and debugging. Hope you enjoy it.

The same thing can be done to structures pre-allocated in arrays, thus, the leaks detected are those that are actually leaks (because the leak tool provided by the OS or the language does not know if something that we allocated is actually a leak or not; it might be a correct pre-allocation that was never used).

For example, this might be the code:

MBuf *buf;

buf = freebuf;
if (buf != NULL) {
freebuf = buf->next;
} else {
buf = xmalloc(sizeof(MBuf));
buf->anext = allbufs;
allbufs = buf;
CALLERPC(buf->pc); // set buf->pc with the caller pc
return buf;

buffree(MBuf *buf)
MBuf *n;

if (buf != NULL) {
n = buf->anext;
if (buf->pc == 0) {
xfatal("buffree: double free");
memset(buf, 3, sizeof(*buf));// poison
buf->anext = n;
buf->pc = 0;
buf->next = freebuf;
freebuf = buf;

MBuf *buf;

for (buf = allbufs; buf != NULL; buf = buf->anext) {
if (buf->pc != 0) {
xwarn("bufalloc: leak pc %llx", buf->pc);
xwarn("\tlldb %s", argv0);
xwarn("\tdi -m -s %llx", buf->pc);