I'm interested in finding or implementing a Rust data structure that provides a zero-cost way to memoize a single computation with an arbitrary output type T
. Specifically I would like a generic type Cache<T>
whose internal data takes up no more space than an Option<T>
, with the following basic API:
impl<T> Cache<T> {
/// Return a new Cache with no value stored in it yet.
pub fn new() -> Self {
// ...
}
/// If the cache has a value stored in it, return a reference to the
/// stored value. Otherwise, compute `f()`, store its output
/// in the cache, and then return a reference to the stored value.
pub fn call<F: FnOnce() -> T>(&self, f: F) -> &T {
// ...
}
}
The goal here is to be able to share multiple immutable references to a Cache
within a single thread, with any holder of such a reference being able to access the value (triggering the computation if it is the first time). Since we're only requiring to be able to share a Cache
within a single thread, it is not necessary for it to be Sync
.
Here is a way to implement the API safely (or at least I think it's safe) using unsafe
under the hood:
use std::cell::UnsafeCell;
pub struct Cache<T> {
value: UnsafeCell<Option<T>>
}
impl<T> Cache<T> {
pub fn new() -> Self {
Cache { value: UnsafeCell::new(None) }
}
pub fn call<F: FnOnce() -> T>(&self, f: F) -> &T {
let ptr = self.value.get();
unsafe {
if (*ptr).is_none() {
let t = f();
// Since `f` potentially could have invoked `call` on this
// same cache, to be safe we must check again that *ptr
// is still None, before setting the value.
if (*ptr).is_none() {
*ptr = Some(t);
}
}
(*ptr).as_ref().unwrap()
}
}
}
Is it possible to implement such a type in safe Rust (i.e., not writing our own unsafe
code, only indirectly relying on unsafe
code in the standard library)?
Clearly, the call
method requires mutating self
, which means that Cache
must use some form of interior mutability. However, it does not seem possible to do with a Cell
, because Cell
provides no way to retrieve a reference to the enclosed value, as is required by the desired API of call
above. And this is for a good reason, as it would be unsound for Cell
to provide such a reference, because it would have no way to ensure that the referenced value is not mutated over the lifetime of the reference. On the other hand, for the Cache
type, after call
is called once, the API above does not provide any way for the stored value to be mutated again, so it is safe for it to hand out a reference with a lifetime that can be a long as that of the Cache
itself.
If Cell
can't work, I'm curious if the Rust standard library might provide some other safe building blocks for interior mutability which could be used to implement this Cache
.
Neither RefCell
nor Mutex
fulfill the goals here:
- they are not zero-cost: they involve storing more data than that of an
Option<T>
and add unnecessary runtime checks. - they do not seem to provide any way to return a real reference with the lifetime that we want -- instead we could only return a
Ref
orMutexGuard
which is not the same thing.
Using only an Option
would not provide the same functionality: if we share immutable references to the Cache
, any holder of such a reference can invoke call
and obtain the desired value (and mutating the Cache
in the process so that future calls will retrieve the same value); whereas, sharing immutable references to an Option
, it would not be possible to mutate the Option
, so it could not work.