Tuesday, March 01, 2005

Destructors in GCed languages II

In response to "Destructors in GCed languages," Walter Bright added this point:

6) The destructors can automatically call the base destructors, and they do that. But they cannot automatically call destructors on the members. The reason is that class objects are by reference only, so the members are by reference, so the destructor cannot tell if someone else is holding that reference as well. So it can't call the destructor on them. (It could if it used reference counting memory management, but D uses mark/sweep.)

In C++, it's possible to embed member objects in their enclosing class, and so objects of that class clearly "own" the member objects and cleanup can occur deterministically. UML even has a way to graphically distinguish between embedded objects and shared objects.

As soon as you start sharing objects, however, you lose the determinism and so it would seem that there isn't a way to automatically call destructors for member objects.

But how useful is a destructor that calls the base-class destructors yet skips the member objects? It only solves part of the problem, and leaves the rest to the programmer. I think this is why the Java designers decided to punt on the whole destructor issue. Their reason is a more general version of the one Walter gave: "our garbage collector is not deterministic enough to know when to clean up objects."

Is this actually the issue, though? Suppose we separate the ideas of memory reclamation and object cleanup, and say that the nondeterministic garbage collector handles memory reclamation and some other mechanism handles object cleanup. This is what Java does: the collector reclaims memory, and you are given the finally clause to perform non-memory object cleanup yourself. D tries to go a step further and create a destructor mechanism, but stops before calling member object destructors and thus might do more harm (by implying complete destruction) than good.
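To make the separation concrete, here is a minimal sketch of the finally approach, written in Python rather than Java. The Channel class and the do_work function are hypothetical names invented for this illustration. The collector is left to reclaim the memory whenever it likes, while the programmer performs the non-memory cleanup by hand.

```python
class Channel:
    """Hypothetical resource that holds something other than memory."""

    def __init__(self, log):
        self.log = log
        self.is_open = True
        log.append("opened")

    def close(self):
        # Explicit, deterministic cleanup, independent of memory reclamation.
        if self.is_open:
            self.is_open = False
            self.log.append("closed")


def do_work(log, fail=False):
    channel = Channel(log)
    try:
        if fail:
            raise ValueError("work failed")
        log.append("worked")
    finally:
        # Runs whether or not the body raised an exception.
        channel.close()
```

The cleanup is reliable, but only because the programmer remembered to write the finally clause at every point of use.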

Let's look at the implication that reference counting and garbage collection are the same thing. Reference counting can certainly be used as a garbage collection mechanism, as we see in Python. But reference counting is what its name implies: a way to keep track of the number of references there are to a particular object. The problem with calling destructors for shared objects is exactly this: you need to know whether there are any other references to an object before calling its destructor.

It's fairly easy to write a reference-counting implementation to keep track of the references to a shared object so that you can know when to call the destructor. And this runs within a system that has a separate garbage collector.

The downside is that the programmer is responsible for calling any "addRef" method, and it would be nice if it could all be automated.
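A hand-written version might look something like the following Python sketch. Everything here (RefCounted, add_ref, release, destroy, LogFile, Writer) is a hypothetical name made up for the example, not part of any library. The count exists only to decide when cleanup runs; memory is still left to the garbage collector.

```python
class RefCounted:
    """Hypothetical base class: counts references for cleanup only."""

    def __init__(self):
        self._refs = 1  # the creator holds the first reference

    def add_ref(self):
        # The programmer must remember to call this for every new holder.
        self._refs += 1
        return self

    def release(self):
        if self._refs <= 0:
            raise RuntimeError("release() called more often than add_ref()")
        self._refs -= 1
        if self._refs == 0:
            self.destroy()  # the last holder is gone: clean up now

    def destroy(self):
        """Override to perform the actual cleanup."""


class LogFile(RefCounted):
    def __init__(self, name, events):
        super().__init__()
        self.name = name
        self.events = events

    def destroy(self):
        self.events.append(f"closed {self.name}")


class Writer:
    """Shares a LogFile with other writers."""

    def __init__(self, log_file):
        self.log_file = log_file.add_ref()  # manual bookkeeping

    def close(self):
        self.log_file.release()
```

If two Writer objects share one LogFile, the file is closed only when the creator and both writers have released it. Forgetting a single add_ref or release call breaks the scheme, which is the downside described above.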

If you distinguish between the garbage collector for memory reclamation, and reference counting for destruction of shared objects, I think it is possible to solve the automatic destructor problem for garbage-collected languages. Here's how it could work:


  1. If a class has a destructor, the compiler adds reference counting to that class.
  2. Any time a new reference is attached to an object of that class, the reference count is incremented automatically.
  3. Any time a reference to such an object disappears, the reference count is decremented automatically.
  4. When an object's reference count goes to zero, the destructor is called.
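The four steps above can be simulated in Python, with the caveat that no compiler is doing the work here. In this sketch a hypothetical Ref handle stands in for a compiler-managed reference, and the explicit drop() calls stand in for the points where the compiler would notice a reference disappearing. Connection, Session, destructor and demo are likewise invented names.

```python
class Ref:
    """Hypothetical stand-in for a compiler-managed reference."""

    def __init__(self, target):
        self.target = target
        # Step 2: attaching a reference increments the count.
        target.refcount = getattr(target, "refcount", 0) + 1

    def drop(self):
        target, self.target = self.target, None
        if target is None:
            return  # already dropped
        # Step 3: a disappearing reference decrements the count.
        target.refcount -= 1
        # Step 4: at zero, the destructor runs immediately.
        if target.refcount == 0:
            target.destructor()


class Connection:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def destructor(self):
        self.log.append(f"Connection {self.name} closed")


class Session:
    def __init__(self, name, conn, log):
        self.name = name
        self.log = log
        self.conn = Ref(conn)  # the member reference is counted too

    def destructor(self):
        self.log.append(f"Session {self.name} ended")
        # What a compiler could generate for every member reference.
        self.conn.drop()


def demo():
    log = []
    conn = Connection("db", log)
    a = Ref(Session("a", conn, log))
    b = Ref(Session("b", conn, log))
    a.drop()  # the connection is still shared by session b
    b.drop()  # the last holder is gone, so the connection is cleaned up
    return log
```

Calling demo() returns the two session messages followed by the connection message: the shared member is destroyed exactly once, after its last holder goes away, without either session needing to know about the other.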


Basically, the reference-counting mechanism is running in parallel with the regular nondeterministic garbage collector, and reference counting is only attached to objects that have destructors. By separating the two (reference counting for destructors vs. garbage collection for memory release), I believe this would not only allow a completely automatic destructor to be created -- one that would properly clean up member objects -- but I think it would also eliminate the need to explicitly call destructors in finally clauses. I don't know of any situation where you must explicitly call destructors in Python; as soon as the reference count goes to zero the destructor is called.
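The Python behavior is easy to observe. The sketch below assumes CPython, whose reference counting runs __del__ as soon as the last reference disappears; other Python implementations, and objects caught in reference cycles, may behave differently. Resource, Owner and demo are hypothetical names for the illustration.

```python
class Resource:
    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        self.log.append(f"released {self.name}")


class Owner:
    def __init__(self, log):
        self.log = log
        self.member = Resource("member", log)

    def __del__(self):
        self.log.append("owner destroyed")


def demo():
    log = []
    owner = Owner(log)
    shared = owner.member  # a second reference to the member object
    del owner              # the owner's count reaches zero
    after_owner = list(log)
    del shared             # now the member's count reaches zero
    return after_owner, log
```

Under CPython, the member is not released when its owner is destroyed, because another reference still exists. It is released the moment that last reference goes away, with no explicit destructor call and no finally clause.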

(Later) Walter commented:


D goes further than just offering finally blocks, it also offers scoped destruction when the 'auto' storage class is used. The scoped destruction can be used for resource management just as in C++, there are other nifty uses for it such as timing code, see www.digitalmars.com/techtips/timing_code.html.

It doesn't automatically resolve the issue of running destructors on members deterministically, but if you write your class putting such members as 'private' members, and don't create other references to them, you can use the 'delete' operator on them to deterministically clean them up.

