[Nobug] memory checking (braindump)

Sat Aug 28 00:41:13 CEST 2010

hi,

I just want to drop a braindump about memory validation/watching this time.
Some time ago I already talked with ichthyo about this and we decided we
currently don't need this feature for Lumiera. But eventually it would be a
nice thing to complete NoBug's feature set, so I will implement it someday.
Documenting this here may help to work it out further.

First, a summary of the kinds of errors such a feature should address:
 * Writing data out of bounds
 * Unintentional writes to valid data
 * Memory leaks

What's the rationale for doing this in NoBug instead of Valgrind? Valgrind has
an insane overhead; one cannot easily run performance-demanding applications
under Valgrind. It's up to 20 times slower than running something natively and
also needs about that much more memory. More importantly, Valgrind has by
design some blind spots: some stack corruption and other tricks with the stack
go unnoticed by Valgrind. This is just to point this out. Valgrind is a
valuable tool I don't want to miss; I am just after filling the gaps and adding
memory debugging to NoBug to gain speed and catch the cases where Valgrind is
blind. The NoBug way will be explicit, intrusive instrumentation; NoBug will
not hook into the system like Valgrind or other tools do. In this regard, NoBug
will have its own blind spots too: for example, use of uninitialized memory is
rather hard to track down and certainly a strength of Valgrind.

Doing memory checks in an intrusive way has some benefits (apart from the usual
drawback that you have to set up these intrusive macros wherever required).
You can add memory checks at higher levels (your object factories) as well as
at lower levels, for example if you implement custom allocators
(memory pools, garbage collectors, temporary buffers, ...).

So let's look at an idea:

Each allocation should have some guard area around it to detect
unintentional writes before and after the object. I'd recommend making
this guard area at least one object size before and after the object (to
catch off-by-one indirection errors). For bulk (array, memory pool, ...)
allocations it would suffice to have guard areas of half the object size
(rounded up) around each object, except that you want to add some more
margin to the complete memory block in which the objects live.

Unintentional writes can be detected by calculating a checksum
over the watched memory block and later asserting its validity.

Memory is often allocated per domain/subsystem or hierarchically.
For leak checking you want to be sure that all child allocations are
freed before you free the parent.
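
A minimal sketch of such a parent/child check. The names `struct track`,
`track_init` and `track_release` are hypothetical and only illustrate the
idea; a counter per parent is an assumption, not a decided design:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping record; one per tracked allocation. */
struct track {
    struct track *parent;   /* NULL for a top-level allocation */
    size_t live_children;   /* children not yet released */
};

/* Record a new allocation under an optional parent. */
static void track_init(struct track *t, struct track *parent)
{
    t->parent = parent;
    t->live_children = 0;
    if (parent)
        parent->live_children++;
}

/* Returns 0 on success, -1 if children are still alive (a leak candidate). */
static int track_release(struct track *t)
{
    if (t->live_children != 0)
        return -1;
    if (t->parent) {
        assert(t->parent->live_children > 0);
        t->parent->live_children--;
    }
    return 0;
}
```

Releasing a pool before its objects would return -1 here; a real
implementation would rather log the allocation context of the leftovers.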

This leads to the following API considerations:

Every allocation will be instrumented to include the guard areas; for
this we need a SIZEOF() which accounts for them. (I am using short names
here; names in a real API will differ.)

Then every pointer returned by an allocation has to be adjusted by the offset
of the user data (unless the allocation indicated failure by returning
NULL). So a first, basic instrumentation would look like:

  Object* myobject = DATAOF( malloc( SIZEOF(*myobject)));

and freeing it by:

 free(BASEOF(myobject));
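
One way these three macros could be defined, as a rough sketch only. It
assumes a fixed guard of 64 bytes on each side instead of a guard derived
from the object size; `GUARD_BYTES`, `GUARD_FILL`, `dataof` and `baseof`
are made-up names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a fixed 64 byte guard on each side, which keeps the user
   pointer suitably aligned for any type on common platforms. */
#define GUARD_BYTES 64
#define GUARD_FILL  0xA5

#define SIZEOF(obj)  (sizeof(obj) + 2 * GUARD_BYTES)
#define DATAOF(base) dataof(base)
#define BASEOF(data) baseof(data)

/* Shift the raw allocation to the user data; pass NULL through so a
   failed allocation is still reported as NULL. */
static void *dataof(void *base)
{
    if (base == NULL)
        return NULL;
    memset(base, GUARD_FILL, GUARD_BYTES);   /* front guard only */
    return (char *)base + GUARD_BYTES;
}

/* Recover the pointer that the allocator originally returned. */
static void *baseof(void *data)
{
    return data ? (char *)data - GUARD_BYTES : NULL;
}
```

The back guard is not initialized in this sketch because `dataof` does not
know the object size; storing the size in the front guard would fix that.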

To implement a registry of all allocations we need a doubly linked list
chaining up all allocations in arbitrary order and a central node being the
entry to all these nodes. Doubly linked because we need fast removal of any
node. We need to store the size of the object and the size of the guard areas,
and we have to store checksums over the data and the guard areas. Further, a
nobug_context should be stored to track down where the object was allocated.
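
The registry part can be sketched as follows. `struct node` is only a
stand-in for a cyclic doubly linked list node and is not the real llist
API; the `registry_*` names are hypothetical as well:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a cyclic doubly linked list node (not the real llist API). */
struct node {
    struct node *next, *prev;
};

/* Central entry point; an empty registry points to itself. */
static struct node registry = { &registry, &registry };

/* Link a new allocation's node right after the central node. */
static void registry_add(struct node *n)
{
    n->next = registry.next;
    n->prev = &registry;
    registry.next->prev = n;
    registry.next = n;
}

/* Unlink in O(1); this is why the list is doubly linked. */
static void registry_remove(struct node *n)
{
    assert(n->next != NULL && n->prev != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = n;
}

/* Count live allocations by walking the ring once. */
static size_t registry_count(void)
{
    size_t count = 0;
    for (struct node *n = registry.next; n != &registry; n = n->next)
        ++count;
    return count;
}
```

A multithreaded program would additionally need a lock around these
functions; that is left out here.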

With some care this metadata can be stored within the guard areas before
and after the object to preserve locality and optimize memory usage. Some
things can be optimized further:

 * It makes no sense to store a full size_t for the object size; gigantic
   allocations with (possibly equally large) guard areas in front and back are
   impractical and rare. I claim that 3 bytes (objects up to 16MB) should be
   enough. Bigger allocations should be watched by other means, or, when we
   do have allocations that big, bigger management structures which store
   the size at another place won't hurt (we can indicate this by storing 0 in
   the ordinary size location).

 * The guard size could be just a factor, then signed char will suffice. 1 to
   127 gives a multiplier, -1 to -128 a (abs) divisor.
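
Decoding such a factor could look like this. A small sketch under two
assumptions that are not settled above: a factor of 0 means "no guard", and
the divisor case rounds up. `guard_bytes` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Decode the one-byte guard factor into a guard size in bytes.
   Assumptions: factor 0 means "no guard", and division rounds up. */
static size_t guard_bytes(size_t object_size, signed char factor)
{
    assert(object_size <= 0xFFFFFF);   /* must fit the 3 byte size field */
    if (factor > 0)
        return object_size * (size_t)factor;       /* multiplier */
    if (factor < 0) {
        size_t divisor = (size_t)(-(int)factor);   /* -128 becomes 128 */
        return (object_size + divisor - 1) / divisor;
    }
    return 0;
}
```

For a 24 byte object, a factor of 1 gives a 24 byte guard and a factor of
-2 gives a 12 byte guard.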

Thus follows:

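struct before
{
        llist node;                     /* links into the allocation registry */
        int32_t guard_size : 8;         /* factor: >0 multiplier, <0 divisor */
        uint32_t allocation_size : 24;  /* object size; 0 = stored elsewhere */
        uint32_t guard_checksum;        /* checksum over the guard areas */
};

struct after
{
        uint32_t data_checksum;         /* 0 = unused (data not frozen) */
        nobug_context context;          /* where the object was allocated */
};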

and a wrapped user allocation will have a layout like:
{
	char front_guard[guard_size - sizeof(struct before)];
	struct before front_meta;

	user_data here; /* this is the address of the user data */

	char pad_guard[calculate_alignment_somehow()];
	struct after back_meta;
	char back_guard[guard_size - sizeof(struct after)];
}
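
For the `calculate_alignment_somehow()` placeholder, one possible
calculation is sketched here. It assumes the alignment is a power of two
and that the user data itself starts on such a boundary; `pad_after` is a
hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of pad_guard needed after `user_size` bytes of user data so that
   the following metadata starts on an `align`-byte boundary.
   Assumptions: `align` is a power of two and the user data itself starts
   on such a boundary. */
static size_t pad_after(size_t user_size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (user_size & (align - 1))) & (align - 1);
}
```

A 13 byte object followed by 8-byte-aligned metadata would need 3 bytes of
padding, which can serve as extra guard bytes.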

Now we can assert that the guard areas did not get corrupted by calculating
and comparing the checksum (we need to do this on each llist update touching a
node). When the data checksum is set (we define '0' as being unused) then we
can assert that the data did not get modified either. This should be done with
an explicit API: MEMCHECK(object) for checking and FREEZE(object) for
recalculating the checksum after a mutation. Freeing an object should do the
MEMCHECK too. To release the FREEZE we need an UNFREEZE() too.
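
The three operations can be sketched like this. The checksum algorithm is
not decided above; 32-bit FNV-1a is used here purely as an assumption, and
`struct meta` with the `mem_*` functions are simplified, hypothetical
stand-ins for the metadata kept in the guard areas:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified metadata; the real layout would live in the guard areas. */
struct meta {
    size_t   size;            /* size of the user data in bytes */
    uint32_t data_checksum;   /* 0 means "not frozen" */
};

/* Assumed checksum: 32-bit FNV-1a, remapped so it never yields 0. */
static uint32_t checksum(const void *mem, size_t len)
{
    const unsigned char *p = mem;
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h ? h : 1u;
}

/* FREEZE: remember the current content. */
static void mem_freeze(struct meta *m, const void *data)
{
    assert(m != NULL && data != NULL);
    m->data_checksum = checksum(data, m->size);
}

/* UNFREEZE: allow mutation again. */
static void mem_unfreeze(struct meta *m)
{
    m->data_checksum = 0;
}

/* MEMCHECK: returns 1 if unfrozen or unchanged since FREEZE, else 0. */
static int mem_check(const struct meta *m, const void *data)
{
    if (m->data_checksum == 0)
        return 1;
    return checksum(data, m->size) == m->data_checksum;
}
```

The guard checksum would be verified the same way, just over the guard
bytes instead of the user data.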

Next, we have all allocated memory available in the list; this means we can
check for potential leaks at the end of the application lifetime (or
in between) by doing a garbage-collector-like conservative scan. Since we have
no roots (as in a GC) and not all memory/references are necessarily covered
by the intrusive mechanism, this should be taken with a grain of salt, as
there is no guarantee for it to be exact.
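
A very rough sketch of such a scan, assuming pointers are stored at
word-aligned offsets inside the registered blocks. `struct block`,
`refers_to` and `count_unreferenced` are hypothetical names, and the flat
array stands in for a walk over the registry list:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one registered allocation. */
struct block {
    void  *data;
    size_t size;
};

/* Does any pointer-sized word in `from` hold an address inside `to`? */
static int refers_to(const struct block *from, const struct block *to)
{
    uintptr_t lo = (uintptr_t)to->data;
    uintptr_t hi = lo + to->size;
    const unsigned char *p = from->data;

    for (size_t off = 0; off + sizeof(uintptr_t) <= from->size;
         off += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, p + off, sizeof word);   /* avoids aliasing issues */
        if (word >= lo && word < hi)
            return 1;
    }
    return 0;
}

/* Count blocks that no other registered block refers to. */
static size_t count_unreferenced(const struct block *blocks, size_t n)
{
    size_t suspects = 0;
    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        int referenced = 0;
        for (size_t j = 0; j < n && !referenced; ++j)
            if (j != i)
                referenced = refers_to(&blocks[j], &blocks[i]);
        suspects += !referenced;
    }
    return suspects;
}
```

Blocks that are only referenced from the stack or from globals show up as
suspects too, which is exactly the inexactness mentioned above. The scan
is also quadratic in the number of blocks, so it is only meant for
occasional checks.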

Implications?

There is a gotcha: when the program runs under Valgrind, initializing
memory guards will prevent Valgrind from detecting illegal reads of
them. Valgrind has some hooks to mark memory areas uninitialized, but we
possibly just want to disable all this memory checking when running
under Valgrind, otherwise this might just decrease Valgrind's
performance even more.

That's it for now; an implementation with more details (log flags and all) will
follow some day, but don't hold your breath.

       Christian