Motivation

The lack of a garbage collector in C++ makes heap management a complex issue that must be explicitly handled by the programmer. To this end, Microsoft's Visual C++ compiler offers a debug version of the standard heap management primitives (malloc(), free(), new, delete, etc.) via the header file crtdbg.h. This header also declares functions for checking the status of the heap, such as _CrtMemCheckpoint() and _CrtMemDumpStatistics().
Putting aside the terrible interface offered by these functions, the primary flaw of crtdbg is that it only works in debug mode (i.e., the preprocessor symbol _DEBUG must be defined), which renders these functions completely useless in release builds.

It turns out that there is an elegant way to construct a snapshot of the heap, which can then be used to debug heap-related problems, at any point during the execution of the program. This solution, which is based on the _heapwalk() function, is outlined in the following section.

A simple yet versatile heap scanner

The _heapwalk() function (declared in the Visual C++ header file malloc.h) allows client code to iterate over all heap blocks, both used and free. As a bonus, it also detects some inconsistencies in the heap structure, which helps the programmer track down those hard-to-find block-overrun bugs. Most importantly, this function works in both debug and release mode.
The following code snippet shows the HeapStatus class, which provides the scaffolding required for _heapwalk()-based iteration. This class is parameterized by the BlockVisitPolicy type, which defines the action to be taken on each iteration pass.

//
// heapstatus.h - Heap Iteration class
// Must be included before crtdbg.h
//

#include <malloc.h>
#include <stdexcept>

#ifndef _HEAPSTATUS_H_INCLUDED_
#define _HEAPSTATUS_H_INCLUDED_



struct CountSizePolicy
{
   // Note that in _DEBUG mode, the size of each allocated block is larger 
   // than the size that is specified in the allocation request (either 
   // malloc() or new).
   //
   // This is due to additional trailing/leading bytes which are wrapped around 
   // each allocated block. 
   // These bytes hold some magic numbers, that are used for testing the 
   // block's integrity.
   //
   // See crtdbg.h for more details
   //
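   // size_t matches the type of _HEAPINFO::_size and avoids truncation
   // on 64-bit builds.
   size_t usedBytes_; 
   size_t freeBytes_;
   
   CountSizePolicy() { reset(); }
   
   void reset() 
   {
      usedBytes_ = 0;
      freeBytes_ = 0;
   }

   // Called once for every block that is currently allocated.
   void onUsedBlock(void * /*p*/, size_t size)
   {
      usedBytes_ += size;
   }
   
   // Called once for every block that is currently free.
   void onFreeBlock(void * /*p*/, size_t size)
   {
      freeBytes_ += size;
   }   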
};

template<typename BlockVisitPolicy = CountSizePolicy>
struct HeapStatus : BlockVisitPolicy
{
   HeapStatus() { refresh(); }
   
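   void refresh()
   {
      // reset() and the on...Block() callbacks are inherited from the
      // dependent base class, so they must be reached through this->
      this->reset();
      
      _HEAPINFO hinfo;
      hinfo._pentry = NULL;   // NULL makes _heapwalk() start at the first block
            
      // Keep walking as long as _heapwalk() reports a valid entry.
      // (The original while(true) loop never left on an error status,
      // so the error handling below could not be reached.)
      int heapstatus;
      while((heapstatus = _heapwalk(&hinfo)) == _HEAPOK)
      {
         if(hinfo._useflag == _USEDENTRY)
            this->onUsedBlock(hinfo._pentry, hinfo._size);
         else
            this->onFreeBlock(hinfo._pentry, hinfo._size);         
      }
      
      // _HEAPEND and _HEAPEMPTY mark a normal end of the iteration.
      // Any other status means there is a problem.
      switch(heapstatus)
      {
         case _HEAPBADPTR:
            throw std::runtime_error("Heap Status error: bad pointer to heap");
         case _HEAPBADBEGIN:
            throw std::runtime_error("Heap Status error: bad start of heap");
         case _HEAPBADNODE:
            throw std::runtime_error("Heap Status error: bad node in heap");
      }      
   }   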
};


#endif // _HEAPSTATUS_H_INCLUDED_
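
Since HeapStatus only expects its policy to provide reset(), onUsedBlock() and onFreeBlock(), other kinds of heap snapshots can be obtained by plugging in a different policy. The following is a minimal sketch of one such policy. The names BlockStatsPolicy and simulateWalk() are hypothetical and are not part of the header above. To keep the sketch self-contained, the policy is fed a few made-up block sizes instead of the output of a real _heapwalk() iteration.

```cpp
#include <cstddef>

// Hypothetical policy (an assumption, not part of heapstatus.h): counts the
// used and free blocks and remembers the size of the largest free block.
struct BlockStatsPolicy
{
   std::size_t usedBlocks_;
   std::size_t freeBlocks_;
   std::size_t largestFreeBlock_;

   BlockStatsPolicy() { reset(); }

   void reset()
   {
      usedBlocks_ = 0;
      freeBlocks_ = 0;
      largestFreeBlock_ = 0;
   }

   // The size of a used block is not needed here, only its existence.
   void onUsedBlock(void * /*p*/, std::size_t /*size*/)
   {
      ++usedBlocks_;
   }

   void onFreeBlock(void * /*p*/, std::size_t size)
   {
      ++freeBlocks_;
      if(size > largestFreeBlock_)
         largestFreeBlock_ = size;
   }
};

// Stand-in for a heap walk: issues the same kind of calls that
// HeapStatus::refresh() makes, but with made-up block sizes.
inline BlockStatsPolicy simulateWalk()
{
   BlockStatsPolicy stats;
   stats.onUsedBlock(NULL, 64);
   stats.onFreeBlock(NULL, 128);
   stats.onUsedBlock(NULL, 32);
   stats.onFreeBlock(NULL, 256);
   return stats;
}
```

Under Visual C++, with heapstatus.h included, the same counters would be filled from the real heap by declaring a HeapStatus<BlockStatsPolicy> object and reading its members. A large freeBlocks_ count combined with a small largestFreeBlock_ value is one possible hint of heap fragmentation.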

Why do programmers overlook heap management issues?

It turns out that at the early stages of development, programmers consciously decide to ignore heap management issues. In some cases, heap management is properly addressed only after the program has reached some level of stability. In other cases, correct heap management code is never written, since the relevant product (or sub-system) is thrown into the garbage can.
Although questionable, this decision has a strong practical basis driven by the programmer's tendency to avoid writing code which is likely to undergo severe changes. This rationale is based on the following arguments:
  • As long as the core of the program (i.e., all the essential parts, which are usually related to functional requirements) is still under development, the code is exposed to many radical changes. At these early stages, the program can be seen as a "vibrating hull" which repeatedly changes its shape and behavior. This hull becomes stable only when the developer feels that the code written so far is a sound basis for the complete program.
  • In a program where the hull is small, the net effect of the vibrations is mitigated, since changes in the code cannot propagate too far. In other words, it is easier to cope with the vibrations if the hull is small.
  • Therefore, there is a clear advantage in keeping the hull as small as possible, at least until it has matured and stabilized.
The immediate implication of this reasoning is that programmers prefer to postpone the full development of implementation concerns, such as heap management, for as long as possible, in a continuous attempt to keep the hull compact during its vibrating period.