I'm probably just trying to do something crazy here, so let me explain my use case first:
I've got an object graph in Ruby, consisting only of the basic JSON types (strings, numbers, arrays, hashes, trues, falses, nils). I'd like ultimately to serialize this graph to JSON.
The problem is that I don't have control over the origin of all objects in the graph. This means that some of the strings contained in the object graph might be tagged with the wrong encodings (for example, a string that's actually just a bunch of random garbage bytes ends up tagged with a UTF-8 encoding). This will cause the JSON serialization to fail (since JSON only supports UTF-8 encoded strings).
I have a strategy for handling these problematic strings, which basically consists of replacing them with a transformed version of each string (the exact transformation isn't really relevant).
In order to apply this transformation to strings, I need to walk the entire object graph and find all of them. This is trivial to implement recursively using standard depth-first search. One wrinkle is that I'd like to avoid mutating the original object graph or any strings therein, so I'm basically building a copy of the object graph as I traverse it (with only the non-problematic leaf nodes being referenced directly from the new graph, and all other nodes being duped).
This all works, and is reasonably efficient, save the duplication of non-leaf nodes in the transformed object graph. The problem is that it sometimes gets fed very deeply-nested object graphs, so the recursion will on occasion produce a SystemStackError
.
I've implemented a non-recursive solution using DFS with a stack of Enumerator
objects, but it seems to be dramatically slower than the recursive solution (presumably on account of the extra object allocations for the Enumerator
s and the silly StopIteration
exceptions that get raised at the end of each Enumerator
.
Breadth-first search seems inappropriate, because I don't think there's a way to determine the path back up to the root when visiting a given node, which I think I need in order to build a copy of the tree.
Am I wrong about BFS here? Are there other techniques that I could be using to accomplish this traversal without recursion? Is this all just loony?