The problem: I want an iterator over items that live behind a guard (an RwLockReadGuard in this case). The items are yielded as clones, not as references. It seems a lot like those questions:

But maybe there is an additional flavour in this question I can't get my head around.

Below there is a StructWithRwLock that provides an iterator (StructWithRwLockIter). There is also a struct ManyStructWithRwLock which (surprisingly) contains a Vec of StructWithRwLock structs and provides an iterator as well. The iterator of ManyStructWithRwLock drives the iterators of its children and always yields the smallest remaining element, using a min-heap (BinaryHeap) for that. The iterator is not allowed to allocate, so it has to use an externally allocated BinaryHeap.

There is a test at the end. The test creates two iterators, one after the other, and each one is dropped before the next step. But the compiler for some reason thinks that the mutable reference to the binary heap may still be used when the binary heap itself is dropped. How come? I get this error:

error[E0597]: `test_struct` does not live long enough
   --> src/test_example.rs:132:24
    |
132 |         let mut iter = test_struct.iter(&mut buffer_heap);
    |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ borrowed value does not live long enough
...
142 |     }
    |     -
    |     |
    |     `test_struct` dropped here while still borrowed
    |     borrow might be used here, when `buffer_heap` is dropped and runs the destructor for type `BinaryHeap<StructWithRwLockIter<'_>>`
    |
    = note: values in a scope are dropped in the opposite order they are defined

Why is this happening? Do I need unsafe for this? How can it be corrected?

The minimal running example:

use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::sync::{Arc, RwLock, RwLockReadGuard};

type BufferHeap<'b> = BinaryHeap<StructWithRwLockIter<'b>>;

#[derive(Debug)]
pub struct StructWithRwLock {
    inner: Arc<RwLock<Vec<usize>>>
}
impl StructWithRwLock {
    pub fn iter(&self) -> StructWithRwLockIter {
        let inner = self.inner.read().unwrap();
        StructWithRwLockIter {
            inner,
            current_index: 0,
        }
    }

}
pub struct StructWithRwLockIter<'a> {
    inner: RwLockReadGuard<'a, Vec<usize>>,
    current_index: usize,
}
impl<'a> StructWithRwLockIter<'a> {
    fn peek(&self) -> Option<usize> {
        let entries = &self.inner;

        if self.current_index >= entries.len() {
            return None;
        }

        let item = entries.get(self.current_index).unwrap();
        Some(*item)
    }
}

impl<'a> Eq for StructWithRwLockIter<'a> {}

impl<'a> PartialEq<Self> for StructWithRwLockIter<'a> {
    fn eq(&self, other: &Self) -> bool {
        self.peek().eq(&other.peek())
    }
}

impl<'a> PartialOrd<Self> for StructWithRwLockIter<'a> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.peek().partial_cmp(&other.peek())
    }
}

impl<'a> Ord for StructWithRwLockIter<'a> {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        self.peek().cmp(&other.peek())
    }
}

impl<'a> Iterator for StructWithRwLockIter<'a> {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        let entries = &self.inner;

        if self.current_index >= entries.len() {
            return None;
        }

        let item = entries.get(self.current_index).unwrap();
        self.current_index += 1;
        Some((*item).clone())
    }
}

pub struct ManyStructWithRwLock {
    items: Vec<StructWithRwLock>
}

impl ManyStructWithRwLock {
    pub fn iter<'b, 'a:'b>(&'b self, ext_buffer: &'a mut BufferHeap<'b>) -> impl Iterator<Item = usize> +'b {
        ext_buffer.clear();
        for item in &self.items {
            ext_buffer.push(item.iter());
        }

        kmerge(ext_buffer)
    }
}

/// similar to itertools::kmerge_by but using an external buffer
pub fn kmerge<'b,'a:'b>(ext_buffer: &'a mut BufferHeap<'b>) -> KMergeStrWLockBy<'a, 'b>
{
    KMergeStrWLockBy {
        heap: ext_buffer,
    }
}

pub struct KMergeStrWLockBy<'a, 'b>
{
    heap: &'a mut BufferHeap<'b>,
}

impl<'a, 'b> Iterator for KMergeStrWLockBy<'a, 'b>
{
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        let mut next = self.heap.pop()?;
        let item = next.next()?;
        self.heap.push(next);
        Some(item)
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    use std::collections::BinaryHeap;

    #[test]
    fn test_aggr_iterator() {
        let mut buffer_heap = BinaryHeap::with_capacity(2);
        let test_struct = ManyStructWithRwLock {
            items: vec![
                StructWithRwLock {
                    inner: Arc::new(RwLock::new(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
                },
                StructWithRwLock {
                    inner: Arc::new(RwLock::new(vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
                }
            ]
        };

        let mut iter = test_struct.iter(&mut buffer_heap);
        for i in 0..10 {
            let _ = iter.next().unwrap();
        }
        drop(iter);
        let mut iter = test_struct.iter(&mut buffer_heap);
        for i in 0..10 {
            let _  = iter.next().unwrap();
        }
        drop(iter);
    }
}
Nikolay Zakirov
  • `&'a mut BufferHeap<'b>` with `'a: 'b` is the same as `&'a mut BufferHeap<'a>` since `'b` already has to outlive `'a` so that's no good. – cafce25 Apr 12 '23 at 11:48
  • And `&'a mut Foo<'a>` is always wrong. See: https://stackoverflow.com/a/66253247/5397009 – Jmb Apr 12 '23 at 15:06
  • Thank you, great! So it follows that I can use it (as the external buffer is essentially static and lives for the duration of the program) but I can not test it? – Nikolay Zakirov Apr 12 '23 at 15:52

1 Answer

It looks like you are making buffer_heap borrow test_struct by pushing iterators into it through your mutable reference to buffer_heap. The complaint that test_struct is dropped too soon is an indicator of this. Namely:

impl StructWithRwLock {
    pub fn iter(&self) -> StructWithRwLockIter {
        let inner = self.inner.read().unwrap();  // <-- acquires a read guard on the Vec inside `inner`; the guard borrows self
        StructWithRwLockIter {  // <-- therefore this returns an object that borrows self
            inner,
            current_index: 0,
        }
    }
}

It's fairly normal that an Iterator struct borrows the parent since it has to iterate over the elements. However, down here:

        // ManyStructWithRwLock::iter
        for item in &self.items {   // <-- borrow of self to borrow items
            ext_buffer.push(item.iter());  // <-- pushes an iterator holding a read guard on item, which therefore borrows item, which therefore borrows self
        }

instead of pushing cloned items from the individual structs into the borrowed buffer, you are pushing an iterator for each struct. Those iterators borrow test_struct as shown, and so buffer_heap now borrows test_struct. Iterators aren't automatically dropped when they run out of items; they are just left in a "done" state, and here they are still in the heap. Removing them (by iterating 11 times instead of 10, or by calling clear) most likely would not satisfy the borrow checker either, because the borrow is recorded in the heap's type, BinaryHeap<StructWithRwLockIter<'_>>, and not in its run-time contents.
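To make that borrow chain concrete, here is a minimal sketch of one arrangement that should compile. It uses simplified, hypothetical names (GuardIter, Merge, merge, run_twice) and is not the code from the question. It assumes two changes: the borrow of the heap ('a) is no longer tied to the borrow of the data ('b) through an 'a: 'b bound, and the data is declared before the heap so that the heap is dropped first. The ordering is also reversed, since BinaryHeap is a max-heap.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical stand-in for `StructWithRwLockIter`: a read guard plus a cursor.
struct GuardIter<'b> {
    guard: RwLockReadGuard<'b, Vec<usize>>,
    index: usize,
}

impl GuardIter<'_> {
    fn peek(&self) -> Option<usize> {
        self.guard.get(self.index).copied()
    }
}

impl PartialEq for GuardIter<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.peek() == other.peek()
    }
}
impl Eq for GuardIter<'_> {}
impl PartialOrd for GuardIter<'_> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for GuardIter<'_> {
    // Reversed so that `BinaryHeap` (a max-heap) pops the smallest head first.
    fn cmp(&self, other: &Self) -> Ordering {
        other.peek().cmp(&self.peek())
    }
}

/// `'a` (borrow of the heap) and `'b` (borrow of the data) are independent;
/// the reference type itself already implies `'b: 'a`.
struct Merge<'a, 'b> {
    heap: &'a mut BinaryHeap<GuardIter<'b>>,
}

impl Iterator for Merge<'_, '_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        let mut head = self.heap.pop()?;
        let item = head.peek()?;
        head.index += 1;
        // Exhausted sources are not pushed back, so their guards are released.
        if head.peek().is_some() {
            self.heap.push(head);
        }
        Some(item)
    }
}

fn merge<'a, 'b>(
    sources: &'b [RwLock<Vec<usize>>],
    heap: &'a mut BinaryHeap<GuardIter<'b>>,
) -> Merge<'a, 'b> {
    heap.clear(); // drops any guards left over from a previous pass
    for source in sources {
        let it = GuardIter { guard: source.read().unwrap(), index: 0 };
        if it.peek().is_some() {
            heap.push(it);
        }
    }
    Merge { heap }
}

fn run_twice() -> (Vec<usize>, Vec<usize>) {
    // The data is declared before the heap, so the heap is dropped first.
    let sources = vec![RwLock::new(vec![1, 4, 7]), RwLock::new(vec![2, 3, 9])];
    let mut heap = BinaryHeap::with_capacity(sources.len());
    let first: Vec<usize> = merge(&sources, &mut heap).collect();
    let second: Vec<usize> = merge(&sources, &mut heap).collect();
    (first, second)
}
```

Calling run_twice() performs two passes over the same buffer. The heap's type still names the borrow of the data, so the buffer cannot outlive the data it was used with.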

You could clone eagerly from your inner structs and put the values directly in the buffer (but this is an allocation), or clone reference-counted handles and store those (the structs already hold an Arc): that copies a pointer and bumps a reference count, but at least it doesn't copy the whole contents of the vector. Or you could require the buffer to be moved into ManyStructWithRwLock::iter, so that it is owned by the iterator over the data and is thus dropped when the iterator is.
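A further variation on the "store values, not iterators" idea, which is not one of the options spelled out above, is to keep only plain integers in the heap. The sketch below uses hypothetical names (Entry, IndexMerge, merged_twice) and assumes it is acceptable to take the read lock briefly for every item instead of holding a guard for the whole iteration. The heap type then carries no lifetime, so it borrows nothing from the data.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::{Arc, RwLock};

/// Hypothetical heap entry: (head value, source index, position in that source).
/// It holds plain integers, so the heap type carries no lifetime.
type Entry = Reverse<(usize, usize, usize)>;

struct IndexMerge<'a> {
    sources: &'a [Arc<RwLock<Vec<usize>>>],
    heap: &'a mut BinaryHeap<Entry>,
}

impl<'a> IndexMerge<'a> {
    fn new(sources: &'a [Arc<RwLock<Vec<usize>>>], heap: &'a mut BinaryHeap<Entry>) -> Self {
        heap.clear();
        for (src, lock) in sources.iter().enumerate() {
            // The read guard is a temporary; it is released once this `if let` finishes.
            if let Some(&value) = lock.read().unwrap().first() {
                heap.push(Reverse((value, src, 0)));
            }
        }
        IndexMerge { sources, heap }
    }
}

impl Iterator for IndexMerge<'_> {
    type Item = usize;

    fn next(&mut self) -> Option<usize> {
        // `Reverse` turns the max-heap into a min-heap on the tuple.
        let Reverse((value, src, pos)) = self.heap.pop()?;
        // Lock briefly to look up the following element of the same source.
        let following = self.sources[src].read().unwrap().get(pos + 1).copied();
        if let Some(next_value) = following {
            self.heap.push(Reverse((next_value, src, pos + 1)));
        }
        Some(value)
    }
}

fn merged_twice() -> (Vec<usize>, Vec<usize>) {
    // Declaration order does not matter here: the heap borrows nothing.
    let mut heap = BinaryHeap::with_capacity(2);
    let sources = vec![
        Arc::new(RwLock::new(vec![1, 4, 7])),
        Arc::new(RwLock::new(vec![2, 3, 9])),
    ];
    let first: Vec<usize> = IndexMerge::new(&sources, &mut heap).collect();
    let second: Vec<usize> = IndexMerge::new(&sources, &mut heap).collect();
    (first, second)
}
```

The trade-off is that a writer may modify a vector between two reads, so this only suits cases where a consistent snapshot across the whole iteration is not required.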

As someone new to Rust who repeatedly runs into this issue trying to make mutable references to things that were accidentally borrowed into other objects... trying to hold references in structs is super difficult for maintenance.

Zannick