There may be several owners of a `Vec<u8>` via cloned `Arc<Vec<u8>>` pointers. I want to find the other `Arc` pointers so that I can drop them and release the `Vec<u8>`.

In a multi-threaded Rust program, I would like to drop all references to a shared `Vec<u8>` so that the underlying heap memory is released. Currently, `Arc::strong_count` returns 2, which means two `Arc<Vec<u8>>` instances are alive at that point. For the `Vec<u8>` to be dropped, every `Arc<Vec<u8>>` instance must be dropped first.

But I don't know where the other `Arc<Vec<u8>>` instances reside.

Is there a tool or method to reveal where an `Arc` pointer was created (cloned) at runtime?

JamesThomasMoon
  • The point of reference counting ("Atomically Reference Counted") is that it's a count, not back-pointers. It gets incremented when a reference is created; it gets decremented when a reference is destroyed. If you want to communicate between reference holders, you'll need a different tool. The most likely tool you'd want is `Weak`. Then if the one-and-only strong reference is dropped, the data is dropped and every weak reference simply fails to upgrade. (It's important to note that any time you've built something with "I don't know where the instances reside," you're probably going to have headaches in Rust.) – Rob Napier Jun 16 '22 at 03:01

1 Answer

You can force an `Arc` to free its data by calling the unsafe `Arc::decrement_strong_count()` in a loop until no owners remain, but this is a very bad idea: the other owners may still access the data afterwards, which is undefined behavior.

Instead, you should make it voluntary. The best way to do that is to give the other owners `Weak`s instead of `Arc`s. That way you are the sole owner, and they can still reach the data, as long as it has not been dropped, by calling `upgrade()` on the `Weak`. Note that the `Arc`'s own allocation is not freed until all `Weak`s are dropped (this is necessary for soundness), but the contained data is dropped as soon as the last strong reference goes away.
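As a minimal sketch of that arrangement, using only the standard library: one `Arc` is the sole strong owner, and everything else holds a `Weak` that it upgrades for each access. The helper `try_len` and the buffer size are made up for illustration and are not from the question's program.

```rust
use std::sync::{Arc, Weak};
use std::thread;

/// Hypothetical helper: a non-owning user upgrades its `Weak` for each access.
fn try_len(handle: &Weak<Vec<u8>>) -> Option<usize> {
    // `upgrade` returns `None` once the last strong reference is gone.
    // The temporary `Arc` it returns is dropped at the end of this call.
    handle.upgrade().map(|data| data.len())
}

fn main() {
    // The one and only strong owner.
    let owner: Arc<Vec<u8>> = Arc::new(vec![0u8; 1024]);
    let observer: Weak<Vec<u8>> = Arc::downgrade(&owner);

    assert_eq!(Arc::strong_count(&owner), 1);
    assert_eq!(Arc::weak_count(&owner), 1);

    // Another thread gets only a `Weak`, so it cannot keep the data alive
    // beyond the duration of a single upgrade.
    let worker_handle = observer.clone();
    let worker = thread::spawn(move || try_len(&worker_handle));
    assert_eq!(worker.join().unwrap(), Some(1024));

    // Dropping the sole strong owner drops the `Vec<u8>` and frees its buffer.
    drop(owner);
    assert_eq!(try_len(&observer), None);
    assert_eq!(observer.strong_count(), 0);
}
```

The trade-off is that every access has to handle the `None` case, which is exactly what makes giving up the data voluntary.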

If you want to know what code is holding the `Arc`, that is going to be much harder. `Arc` doesn't track its owners, so you have to do it yourself; I don't know of a tool that will do it for you. For example, you can capture a backtrace every time you clone the `Arc` and store all the backtraces next to the data, something like:

use std::ops::Deref;
use std::sync::{Arc, Mutex, MutexGuard};

use backtrace::Backtrace;
use slotmap::basic::IterMut as SlotMapIterMut;
use slotmap::{DefaultKey, SlotMap};

struct CloningAwareArcData<T> {
    backtraces: Mutex<SlotMap<DefaultKey, Backtrace>>,
    data: T,
}

// This struct doesn't support unsized pointees. It can, although I think it
// will require nightly.
pub struct CloningAwareArc<T> {
    data: Arc<CloningAwareArcData<T>>,
    backtrace_key: DefaultKey,
}

impl<T> CloningAwareArc<T> {
    pub fn new(v: T) -> Self {
        let mut backtraces = SlotMap::new();
        let backtrace_key = backtraces.insert(Backtrace::new_unresolved());
        Self {
            data: Arc::new(CloningAwareArcData {
                backtraces: Mutex::new(backtraces),
                data: v,
            }),
            backtrace_key,
        }
    }

    /// This function will block cloning & calling it again until the iterator is dropped.
    ///
    /// This is an associated function and not a method in order to not interfere with a method of `T` with the same name.
    pub fn active_backtraces(this: &Self) -> ActiveBacktracesIter<'_> {
        ActiveBacktracesIter::new(this.backtrace_key, this.data.backtraces.lock().unwrap())
    }
}

impl<T> Deref for CloningAwareArc<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.data.data
    }
}

impl<T> Clone for CloningAwareArc<T> {
    fn clone(&self) -> Self {
        let backtrace_key = self
            .data
            .backtraces
            .lock()
            .unwrap()
            .insert(Backtrace::new_unresolved());
        Self {
            data: Arc::clone(&self.data),
            backtrace_key,
        }
    }
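}

// Without this, backtraces of handles that were already dropped would stay
// in the map and still be reported as active. Note that dropping a handle
// blocks while an `ActiveBacktracesIter` is alive, just like cloning does.
impl<T> Drop for CloningAwareArc<T> {
    fn drop(&mut self) {
        if let Ok(mut backtraces) = self.data.backtraces.lock() {
            backtraces.remove(self.backtrace_key);
        }
    }
}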

// This struct is self-referential: we need to store the guard and the
// iterator that refers to the map inside the guard.
// Using parking_lot's `MappedMutexGuard` will not help us here since
// it is only usable if we can map the guard to a reference, but we
// map it to an iterator.
// See https://stackoverflow.com/q/40095383/7884305.
// I tried to write it using ouroboros, but gave up :) Doesn't mean this
// is impossible, but it is pretty hard.
pub struct ActiveBacktracesIter<'a> {
    self_key: DefaultKey,
    _guard: MutexGuard<'a, SlotMap<DefaultKey, Backtrace>>,
    inner: SlotMapIterMut<'static, DefaultKey, Backtrace>,
}

impl<'a> ActiveBacktracesIter<'a> {
    fn new(
        self_key: DefaultKey,
        mut guard: MutexGuard<'a, SlotMap<DefaultKey, Backtrace>>,
    ) -> Self {
        let inner: SlotMapIterMut<'_, DefaultKey, Backtrace> = guard.iter_mut();
        // SAFETY: Lifetimes cannot affect layout, and we're holding it only until we drop
        // the guard.
        let inner: SlotMapIterMut<'static, DefaultKey, Backtrace> =
            unsafe { std::mem::transmute(inner) };
        Self {
            self_key,
            _guard: guard,
            inner,
        }
    }
}

impl<'a> Iterator for ActiveBacktracesIter<'a> {
    type Item = &'a Backtrace;

    fn next(&mut self) -> Option<Self::Item> {
        let (key, backtrace) = self.inner.next()?;
        if self.self_key != key {
            backtrace.resolve();
            Some(backtrace)
        } else {
            // Skip owning `Arc`
            self.next()
        }
    }

    // You can implement additional iterator methods and traits, for optimization.
}

However, note that `Arc` cloning is supposed to be cheap. Capturing a backtrace is very expensive, and even if you attach some other data instead, cloning will cost much more than the atomic addition plus compare-and-jump it normally is.
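If backtraces are too costly, one cheaper form of "some other data" is a static label supplied at each clone site. The following is only a sketch under that assumption, using the standard library alone. `LabeledArc`, `clone_as`, and `owners` are hypothetical names, and each clone still pays for a mutex lock and a map insertion.

```rust
use std::collections::BTreeMap;
use std::ops::Deref;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

struct Inner<T> {
    next_id: AtomicU64,
    // Maps each live handle's id to the label given when it was created.
    owners: Mutex<BTreeMap<u64, &'static str>>,
    data: T,
}

/// Hypothetical wrapper: an `Arc` whose handles each carry a label.
pub struct LabeledArc<T> {
    inner: Arc<Inner<T>>,
    id: u64,
}

impl<T> LabeledArc<T> {
    pub fn new(data: T, label: &'static str) -> Self {
        let mut owners = BTreeMap::new();
        owners.insert(0, label);
        let inner = Inner {
            next_id: AtomicU64::new(1),
            owners: Mutex::new(owners),
            data,
        };
        Self { inner: Arc::new(inner), id: 0 }
    }

    /// Used instead of `Clone::clone`, so every clone site must name itself.
    pub fn clone_as(this: &Self, label: &'static str) -> Self {
        let id = this.inner.next_id.fetch_add(1, Ordering::Relaxed);
        this.inner.owners.lock().unwrap().insert(id, label);
        Self { inner: Arc::clone(&this.inner), id }
    }

    /// Labels of all handles that are still alive, in creation order.
    pub fn owners(this: &Self) -> Vec<&'static str> {
        this.inner.owners.lock().unwrap().values().copied().collect()
    }
}

impl<T> Deref for LabeledArc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner.data
    }
}

impl<T> Drop for LabeledArc<T> {
    fn drop(&mut self) {
        // Unregister this handle; the inner `Arc` field is dropped afterwards.
        if let Ok(mut owners) = self.inner.owners.lock() {
            owners.remove(&self.id);
        }
    }
}

fn main() {
    let cache = LabeledArc::new(vec![1u8, 2, 3], "main");
    let for_worker = LabeledArc::clone_as(&cache, "worker");
    assert_eq!(LabeledArc::owners(&cache), vec!["main", "worker"]);

    drop(for_worker);
    assert_eq!(LabeledArc::owners(&cache), vec!["main"]);
    assert_eq!(cache.len(), 3);
}
```

Deliberately not implementing `Clone` means the compiler flags every existing clone site, which by itself produces a list of the places that can create owners.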

Chayim Friedman
  • Chayim Friedman thanks for the recommendation. I was considering the `Weak` `Arc` pointer approach. But before I introduce that change (and its risks), I would like to try to fix my use of the `Strong` `Arc` pointers. – JamesThomasMoon Jun 16 '22 at 03:05
  • @JamesThomasMoon What do you mean by "fixing your use"? – Chayim Friedman Jun 16 '22 at 03:05
  • > _What do you mean by "fixing your use"_ I mean, something in my program still has an `Arc` pointer, and I'm certain it doesn't need it anymore. But I don't know what that thing is. So my "fix" is re-examining where `Arc` are cloned. Then refactoring the code to avoid unneeded `Arc` clones (if that's possible). But before that, and the point of this question: _how do I find what other thing(s) are holding an `Arc` clone?_ The program I'm working on is fairly large, so it's difficult to do by simply reading the code. (maybe a tool could assist me?) – JamesThomasMoon Jun 16 '22 at 03:08
  • "I'm certain it doesn't need it anymore. But I don't know what that thing is." That's impossible. If you don't know what it is, there is no way you know (provably at compile time) that it isn't needed. If you destroy the object while there are still Arc references to it, the program will crash (in the best case; worst case it will "do something undefined") when the references are used. Rust won't allow that. You will find immediately what parts of your program use the Arc reference by changing the code to not use Arc, and see where the compiler fails. That's the power of strong types. – Rob Napier Jun 16 '22 at 03:46
  • @JamesThomasMoon Edited, but I don't really think you have something. – Chayim Friedman Jun 16 '22 at 05:25
  • @ChayimFriedman great code sample. Thanks! – JamesThomasMoon Jun 18 '22 at 23:09