Garbage collectors should be invisible: if developers (or even users) notice them, something is wrong. Usually these issues are performance issues, but there can also be repair issues. This makes their implementation somewhat daunting.
In my spare time, I work on virtual instruction set architecture, primarily to better understand the nature of computing and, potentially, the trade-offs in programming language design.
The project was stalled for several months because I was reluctant to start working on the garbage collector, which I felt was necessary to maintain memory safety.
I was a little confused about that, but now I think it’s because garbage collectors are terrible to use. Not only are they expected to be mostly invisible, but they also have to compute a very non-local property: whether an object is alive and not yet freed depends on the state of the pointers in all the other objects on the heap. Industrial garbage collectors are also very complex. So it all seemed very discouraging.
This article attempts to document what I did to cope. Most of the ideas below are intended to make the garbage collection operation more obvious.
What can be done?
The choice of garbage collector is important. Copying a semi-spatial garbage collector like Cheney’s algorithm seems like an excellent starting point. (See CJ Cheney, A non-recursive algorithm for collapsing a listor his Wikipedia page, Cheney’s algorithm). Such a garbage collector can be very small and very simple: without counting code to identify pointers to root objects and parsing objects and heaps, a functional implementation of the garbage collector is only a few tens of lines (or even less if there is already code for duplicating objects).
The garbage collector copies all objects from the fromspace part to the tospace part of the heap (hence the name semispace collector). This has the important side effect of ensuring that all object addresses are changed during the garbage collection cycle. Therefore, any use of an object pointer that is forgotten and not processed during collection results in very obvious problems (usually a crash, but can also result in incorrect data).
With a less memory-intensive garbage collector that tries to work in place, perhaps one that never moves any live objects, raw pointers can go unnoticed until a more complex test case comes along. With a garbage collector that copies, it’s practically necessary to have the root pointer, stack, and code to scan the object from the start.
With such a semispace collector, the address range of the previous heap (what was previously fromspace) can be reserved at the kernel level and mapped with the PROT_NONE protection flag, so that accidental dereferencing by a stale pointer that was not updated in the previous garbage collection cycle is very likely to fail. This works even if the pointers are stored outside (probably by accident), in places that the garbage collector should not scan.
In the case of interfaces with separately developed C code, the problem of untracked pointers can be mitigated by careful interface design. The C interface contained in the reference implementation of the Lua programming language uses a virtual stack on which pointers to objects are stored (Section 4.1, Stack, in Roberto Ierusalimschy et al: The Lua 5.4 Reference Manual (2023)). Object pointers are never exposed. C programmers use stack slot indices to manipulate them. Contrast this with CPython’s approach, where the actual object pointers are exposed to C extension modules. With a copy garbage collector, limiting access to object pointers to a small portion of the virtual machine also seems reasonable.
My virtual ISA contains non-heap-allocated fixed-size integers and floating-point values, both as local variables in procedures and as fields in heap-allocated objects. The assembler performs type checking, so it already knows whether a register contains a pointer to an object at the beginning of each instruction. (An instruction can use a register as input only if all execution paths leading to it have a sufficiently compatible type, but the register itself does not have a single fixed type during procedure execution.)
At runtime, the virtual machine interpreter does not need to perform type checking because, in addition to shifting registers, separate instructions are used for scalar and pointer values. I already had a working heap unwrapper (for exception handling), but I wasn’t sure if encoding information from the heap frame into relevant instructions (which can trigger garbage collection) in the form of a map battery worked. Although there’s no debug interface yet, I’ve added a very basic heap print implementation, which displays the contents of the heap and, for pointers to objects identified by the map heap, the object header from the heap. The test suite can then use simple regular expressions to check that the output contains pointers in the expected locations for a set of representative test cases.
As for the pointer maps for heap-allocated objects, it is possible to do better: compare the pointer values coming from explicit access to the object with the results of an object parsing procedure that uses the same pointer identification method as the garbage collector. (I’m specifically not targeting general runtime reflection support, so this is just an interface for internal testing).
This data-redundant approach (by comparing garbage collection results to pointers known to be present) allows easy end-to-end object scanning code testing. Similar introspective testing is possible for stack maps if there is a way to convert object pointers to integers, losing the pointer property. (However, this is only internal testing, as exposing the address of objects in this way encourages direct manipulation of objects, perhaps from separately written C code).
The garbage collector no longer updates these integers, but the address of the object changes. The test case knows which registers contain the pointers and ensures that the garbage collector has updated those registers (assuming we have a garbage collector copying).
Continuing the theme of using redundant data for consistency checking, it is possible to keep the heap analyzed at all times, or at least at safe points: starting at the low address of the heap or segment, it is possible to find all previously allocated objects (whether alive or not). For each object, the address of the next object can be determined based on the header information found in the current object. This means that object headers must exist, but are also useful in other contexts (e.g. to implement dynamic uploading or for verified descents from a base class pointer to a pointer driver class).
Heap traversal can be used to check at a specific time (for example, immediately after garbage collection) that the heap is consistent, without depending on a new traversal from the root. The first stage can construct a bit array containing the set bits for all the starting addresses of objects in the heap, and the second stage of the verification process can then verify that each identified pointer is either zero or points to the beginning of a previously identified object from the heap.
Currently, this checking procedure is very similar to that of a Cheney-style copy garbage collector (without the object copy parts), but can be used without modification with any other garbage collector that maintains a parseable heap, even if only artificially (via inserting dummy items to fill gaps in the stack).
Another thing that has allowed me to get better coverage with the limited tests I’ve had so far is to run garbage collection on every allocation, or at least much more often than usual. (I think I first saw this in Hotspot, which has several GCALot tags in debug builds). With this execution, the tests exercise the garbage collection code much more.
Was it worth it?
So far, the garbage collector seems to be working well, which isn’t surprising considering it’s so simple. The garbage collection-specific problems I’ve run into are related to the heap size heuristics and determining the heap boundary at which the next garbage collection starts. The address calculations for this (and the memory regions to unmap, since they won’t be needed) turned out to be quite tricky. There were some early integration issues with the stack maps because the assembler and the VM implementation didn’t fully agree on their encoding, but this was already visible in the stack dump debuggers I added for testing.
source: “Garbage collectors are terrible” (Florian Weimer)
and you
What do you think ?
See also: