1

Basically what I tried to achieve is the following graph-kinda structure in C.

#include <stdlib.h>
struct node{
    struct node *next;
};
int main() {
    struct node* n = malloc(sizeof(struct node));
    n->next = n;
    return 0;
}

what I have tried in Rust:

struct Node {
    next: Vec<Rc<RefCell<Node>>>,
}
fn main() {
    let n = Rc::new(RefCell::new(Box::new(Node { next: vec![] })));
    n.borrow_mut().next = vec![n.clone()];
}

and it works but it uses a Vec. What I really want:

struct Node {
    next: Rc<RefCell<Node>>,
}
fn main() {
    let n = Rc::new(RefCell::new(Box::new(Node { next: ? })));
    n.borrow_mut().next = n.clone();
}

I cannot even construct a Node in this circumstance. One solution is to use Option, then I can initialize with None, but it adds a burden (Imagine we want a cyclic chain for example).

Link I have read: https://github.com/nrc/r4cppp/blob/master/graphs/src/rc_graph.rs https://rust-leipzig.github.io/architecture/2016/12/20/idiomatic-trees-in-rust/

Is the solution they called arena the 'rustic' way of solving the problem? I believe RefCell has a runtime cost, is this statement true?

E_net4
  • 27,810
  • 13
  • 101
  • 139
Joe Yan
  • 23
  • 5
  • If you want a cyclic chain then a circular buffer is probably the sensible choice. – kaya3 Jan 03 '23 at 09:02
  • @kaya3 Yep, and it is true that a linked-list-like struct is a terrible choice most of the time . I am curious about the language functionality itself mostly here since rust calls itself a safe system language, I suppose it should at least do as much as C... – Joe Yan Jan 03 '23 at 09:33
  • 2
    The best way is to use indices. But you can also use `Option`s, `Weak` pointers, or `unsafe`. – Chayim Friedman Jan 03 '23 at 10:21
  • 1
    _"rust calls itself a safe system language"_ As surprising as this may come, most developers do not need to make their own low level data structures. `std` has you covered ~99% of the time. Such data structures might be taught in elementary programming courses, but in those they are either implemented in a programming language with a garbage collector, or one without memory safety nets altogether. See also: https://stackoverflow.com/questions/34747464/implement-graph-like-data-structure-in-rust https://stackoverflow.com/questions/28608823/how-to-model-complex-recursive-data-structures-graphs – E_net4 Jan 03 '23 at 10:44
  • @E_net4thecommentflagger Thx, I am new to rust and believe low-level data structures somehow have correspondence to the design taste and hence always try to write something useless like this when learning a new language. For example, the 'arena' option seems to guarantee a better locality than the 'linked' option and surely the trade-off can be amortized performance or foreknowledge of the number of nodes. 'arena' seems to be the right way to go. – Joe Yan Jan 03 '23 at 11:10

2 Answers2

1

You could use pointers and unsafe to get the same behavior just like in C (note this still checks if the allocation succeeded unlike your C example, but I think this is enough of an approximation):

pub struct Node {
    next: *mut Node,
}
fn main() {
    let n = Box::leak(Box::new(Node { next: std::ptr::null_mut() }));
    n.next = n as _;
}

Comparision of the produced instructions

You just won't get any safety benefits that way. I recommend to use an Option<Rc<RefCell<Node>>> the Option is a zero cost abstraction which forces you to think about the fact that the pointer might be null, you can even skip the runtime checks with unwrap_unchecked if you really wanted to.

struct Node {
    next: Option<Rc<RefCell<Node>>>,
}
fn main() {
    let n = Rc::new(RefCell::new(Node { next: None }));
    n.borrow_mut().next = Some(n.clone());
}

And yes RefCell does have a runtime cost, but it's needed because of the guarantees Rust makes about mutable references, namely that they are unique.

Again you can opt out of the safety it provides and resort to UnsafeCell with no runtime overhead, but it puts the burden of checking everything is in order onto you, the programmer, the performance benefit is seldomly worth the price you pay in my experience.

cafce25
  • 15,907
  • 4
  • 25
  • 31
1

You should use Option<Rc<RefCell<Node>>>. There is a runtime cost to RefCell and Option but there is a reason for that.

The runtime cost from Option is the same as in C but it is harder to mess up. In your original C struct

struct node{
    struct node *next;
};

You should check node->next != null before every access. Option<Rc<..>> has the same size as a pointer since the compiler optimizes it. 0 means None non-zero means Some(rc). So this is the same as your C struct, just harder to mess up. Later, if you are confident in parts of the program you can use the unsafe function Option::unwrap_unchecked to skip the check.

Also RefCell has a runtime cost but it is minimal. If you are confident in the correctness of your program, you can replace it with UnsafeCell, which has no runtime cost.

The reason for needing RefCell is to prevent bugs and allow the compiler better optimizations. In C writing the following code is different from the rust code with &mut Node arguments

node* f(node* n1, node* n2) {
   n2->next = 0;
   n1->next = n2;
   return n2->next;
}

Here you cannot know the result of the return value. It could be either n2 or 0 depending on n1 == n2. In rust with mutable references, this cannot happen. It guarantees inequality and the compiler can optimize a lot more and just return 0 every time without reloading n2->next from the memory.

susitsm
  • 465
  • 2
  • 6
  • Love your example, If I remember correctly, C 'restrict' quantifier has similar functionality in optimisation (that is one memory location can only be modified by one left value otherwise undefined behaviour if I remember correctly?). – Joe Yan Jan 03 '23 at 10:56
  • My understanding of using null in C pointer as None in algebraic data type is that we use a 'held out' value for None and a null(0) value is never valid as an address in C. In that way, we can say the pointer in C is kind of isomorphic to Option in algebraic data type. – Joe Yan Jan 03 '23 at 11:02