C++ FAQ Celebrating Twenty-One Years of the C++ FAQ!!!
Section 36:
[36.10] How do I serialize objects that contain pointers to other objects, but those pointers form a tree with no cycles and only "trivial" joins?

As before, the word "tree" does not mean that the objects are stored in some sort of tree-like data structure like std::set. It simply means your objects have pointers to each other, and the "no cycles" part means you can follow the pointers from one object to the next and never return to an earlier object. The objects are not "inside" a tree; they are a tree. If that doesn't make sense, you really should read the lingo FAQ before continuing with this one.

Use this solution if the graph contains joins at the leaf nodes, but those joins can be easily reconstructed via a simple look-up table. For example, the parse-tree of an arithmetic expression like (3*(a+b) - 1/a) might have joins since a variable-name (like a) can show up more than once. If you want the graph to use exactly the same node object to represent both occurrences of that variable, then you could use this solution.

Although the above constraints don't fit with those of the solution without any joins, it's so close that you can squeeze things into that solution. Here are the differences:

  • During serialization, ignore the join completely.
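  • During unserialization, create a look-up table, like std::map<std::string,Node*>, that maps from the variable name to the associated node. Each time a variable name is read, look it up in the table: if it is already there, reuse that node; otherwise create a new node and add it to the table.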

Caveat: this assumes that all occurrences of variable a should map to the same node object; if it's more complicated than this, that is, if some occurrences of a should map to one object and some to another, you might need to use a more sophisticated solution.
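The two differences listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the Node layout, the Loader class, the one-letter tags ('o', 'v', 'n') and the space-separated prefix format are all made up for this example, and variable names are assumed to contain no whitespace. Serialization writes the variable name at every occurrence (ignoring the join), and unserialization uses a std::map<std::string,Node*> to re-create the join.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical node type: an operator, a variable leaf, or a numeric leaf.
struct Node {
    char op = 0;           // '+', '-', '*', '/' for interior nodes; 0 for leaves
    std::string name;      // non-empty only for variable leaves
    double value = 0;      // used only by numeric leaves
    Node* left = nullptr;
    Node* right = nullptr;
};

// Serialization ignores joins: a shared variable leaf is simply written
// again (by name) each time it is reached.
void serialize(const Node* n, std::ostream& out) {
    assert(n != nullptr);
    if (n->op) {
        out << "o " << n->op << ' ';
        serialize(n->left, out);
        serialize(n->right, out);
    } else if (!n->name.empty()) {
        out << "v " << n->name << ' ';
    } else {
        out << "n " << n->value << ' ';
    }
}

// Unserialization re-creates the joins with a look-up table.
struct Loader {
    std::vector<std::unique_ptr<Node>> arena;  // owns every node exactly once
    std::map<std::string, Node*> vars;         // variable name -> its one node

    Node* make() {
        arena.push_back(std::make_unique<Node>());
        return arena.back().get();
    }

    Node* read(std::istream& in) {
        char tag = 0;
        if (!(in >> tag)) throw std::runtime_error("truncated input");
        if (tag == 'v') {
            std::string name;
            in >> name;
            auto it = vars.find(name);
            if (it != vars.end()) return it->second;  // seen before: reuse node
            Node* leaf = make();                      // first occurrence
            leaf->name = name;
            vars[name] = leaf;
            return leaf;
        }
        Node* n = make();
        if (tag == 'n') {
            in >> n->value;
        } else if (tag == 'o') {
            in >> n->op;
            n->left = read(in);
            n->right = read(in);
        } else {
            throw std::runtime_error("unknown tag");
        }
        return n;
    }
};
```

Because a joined leaf has more than one parent, this sketch lets the Loader own all the nodes rather than letting each parent delete its children; other ownership schemes would work as well.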

Caveat: you need to take special care if your objects contain pointers that point to a member of some object rather than to the object itself. For example, if pointer 'p' points to 'x.y' and pointer 'q' points to 'x.z', that is, they point to data members within object 'x' rather than to object 'x' itself, then when you serialize the pointers you need to recognize that both point into the same object. You will need to serialize object 'x', not just 'x.y' and 'x.z', and you need to make sure that upon unserialization 'p' and 'q' again point into that same object. The serialization process will probably require a two-pass approach, and each serialized pointer will probably need to store an offset within the target object as well as the target object's ID.
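As a rough illustration of the "object ID plus offset" idea, here is a minimal sketch. The names X, PtrRecord, encode and decode are hypothetical, the object's index in a std::vector stands in for its ID, and the byte offset is only meaningful if the object is a standard-layout type with the same layout on the writing and reading sides (storing a member index instead of a byte offset would be one way to avoid that assumption).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

struct X { int y; int z; };   // hypothetical object with two data members

// Serialized form of a pointer into an object: which object, and where in it.
struct PtrRecord {
    std::size_t id;       // here: index of the target object in the vector
    std::size_t offset;   // byte offset of the pointee within that object
};

// Find the object that contains *p and express p as (ID, offset).
PtrRecord encode(const std::vector<X>& objects, const int* p) {
    const char* target = reinterpret_cast<const char*>(p);
    std::less<const char*> before;   // well-defined order for any two pointers
    for (std::size_t i = 0; i < objects.size(); ++i) {
        const char* base = reinterpret_cast<const char*>(&objects[i]);
        if (!before(target, base) && before(target, base + sizeof(X))) {
            return PtrRecord{i, static_cast<std::size_t>(target - base)};
        }
    }
    throw std::runtime_error("pointer does not point into any known object");
}

// Second pass of unserialization: the objects already exist again, so the
// pointer can be rebuilt from the object's new address plus the offset.
int* decode(std::vector<X>& objects, PtrRecord r) {
    char* base = reinterpret_cast<char*>(&objects.at(r.id));
    return reinterpret_cast<int*>(base + r.offset);
}
```

With this scheme, 'p' and 'q' from the example above would be written as two records that share the same ID and differ only in their offsets, so after the objects themselves have been restored, both pointers can be fixed up to point into the same restored object.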