0

I want to model a very large (number of nodes) graph consisting of many very large (memory-wise) nodes. Since the nodes are so large, I would like to only store them once and pass around borrows to them, so conceptually something like this:

struct Graph<'a> {
    nodes: Vec<Node<'a>>,
}

struct Node<'a> {
    edges: HashMap<String, &'a Node>,
    // ...lots of other data...
}

Of course there is no way to construct a Graph like this because (a) Vec does not allow me to add new nodes when there are borrows to the elements, and (b) I can't tell rustc that the nodes vector will live for lifetime 'a. I can also not use something like Rc because the graph has cycles.

What I would like to be able to express is an arena of sorts, which lets me allocate a lot of Nodes, make immutable borrows to them as long as the arena lives, and use lifetime checks to ensure that I have no remaining Node references out when the arena is de-allocated. Something like:

struct Graph<'a> {
    nodes: Arena<'a, Node<'a>>,
}

struct Node<'a> {
    edges: HashMap<String, &'a Node>,
}

impl<'a, A> Arena<'a, A> {
    fn own(&self, a: A) -> &'a A {
        // magic
    }
}

impl<'a, A> Drop for Arena<'a, A> {
    fn drop(&'a mut self) {
        // magic
    }
}

Is this semantically possible to express in Rust?

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
dflemstr
  • 25,947
  • 5
  • 70
  • 105
  • 1
    Searching on [crates.io](https://crates.io/) for "arena" provides many results. One such crate is [typed-arena](https://crates.io/crates/typed-arena). – Shepmaster Mar 01 '16 at 22:04
  • 1
    *I can also not use something like `Rc` because the graph has cycles.* — not completely true; you can introduce [`Weak`](http://doc.rust-lang.org/std/rc/struct.Weak.html) references when you have cyclical data. The [module-level documentation](http://doc.rust-lang.org/std/rc/index.html) has an example. – Shepmaster Mar 01 '16 at 22:06
  • @Shepmaster The `typed-arena` implementation lets me get borrows, but those borrows are bound to a `&mut` lifetime that I can't connect to the lifetime of the arena. I'll try using weak references, though. – dflemstr Mar 03 '16 at 18:37

1 Answers1

3

A simple go-to solution is to use the typed-arena crate. It contains an Arena type with a fn alloc(&self, T) -> &mut T method.

Another simple solution would be to use indices instead of references (and never remove from the Vec as this would invalidate indices). On 64 bits platforms, using 32 bits indices could shave off some bytes.

Both solutions, however, suffer from the inability to remove nodes. You may stop referencing them, but they will still live in memory, and as such using them for dynamic graphs (where nodes come and go) requires a bit more work. My advice in this case is to periodically create a new clone of the graph from a fresh arena (not duplicating the unused nodes), which is akin to using a Moving Garbage Collector, if less automatic.

Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • *On 64 bits platforms, using 32 bits indices could shave off some bytes.* — would this entail many casts between a `u32` and `usize`? – Shepmaster Mar 02 '16 at 13:23
  • @Shepmaster: well, at hardware level, it will entail casts of course; at software level, however, providing the vector is properly encapsulated with access functions taking a `u32`, then you should only get one or two `as usize`. – Matthieu M. Mar 02 '16 at 13:29
  • @MatthieuM. Could you demonstrate how `typed-arena` would let me build a graph in a way that would let me encapsulate that graph in a `struct`? – dflemstr Mar 03 '16 at 18:38
  • @MatthieuM. (I have used `typed-arena` before, but I have only gotten it to work in a single stack frame) – dflemstr Mar 03 '16 at 20:59
  • @dflemstr: I suggested `typed-arena` because you were speaking about an arena, however the items created by an arena borrow said arena (because you cannot refer to the item after the arena is dead) and sibling borrowing relationships are not allowed in Rust so indeed the arena itself would not be part of the graph. I do not think this is much of an issue (as long as you have an upper-bound on the lifetime). If being part of the graph is necessary, then the `Vec` + indices solution is easier. – Matthieu M. Mar 04 '16 at 07:13