How can I access a Rust Iterator from Python using PyO3?

Question

I'm quite new to Rust, and my first 'serious' project has involved writing a Python wrapper for a small Rust library using PyO3. This has mostly been quite painless, but I'm struggling to work out how to expose lazy iterators over Rust Vecs to Python code.

So far, I have been collecting the values produced by the iterator and returning a list, which obviously isn't the best solution. Here's some code which illustrates my problem:

use pyo3::prelude::*;

// The Rust Iterator, from the library I'm wrapping.
pub struct RustIterator<'a> {
    position: usize,
    view: &'a Vec<isize>
}

impl<'a> Iterator for RustIterator<'a> {
    type Item = &'a isize;

    fn next(&mut self) -> Option<Self::Item> {
        let result = self.view.get(self.position);
        if result.is_some() { self.position += 1; }
        result
    }
}

// The Rust struct, from the library I'm wrapping.
struct RustStruct {
    v: Vec<isize>
}

impl RustStruct {
    fn iter(&self) -> RustIterator {
        RustIterator{ position: 0, view: &self.v }
    }
}

// The Python wrapper class, which exposes the 
// functions of RustStruct in a Python-friendly way.
#[pyclass]
struct PyClass {
    rust_struct: RustStruct,
}

#[pymethods]
impl PyClass {
    #[new]
    fn new(v: Vec<isize>) -> Self {
        let rust_struct = RustStruct { v };
        Self{ rust_struct }
    }

    // This is what I'm doing so far, which works
    // but doesn't iterate lazily.
    fn iter(&self) -> Vec<isize> {
        let mut output_v = Vec::new();
        for item in self.rust_struct.iter() {
            output_v.push(*item);
        }
        output_v
    }
}

I've tried to wrap the RustIterator struct in a Python wrapper class, but I can't use PyO3's #[pyclass] procedural macro on a type with lifetime parameters. I looked into pyo3::types::PyIterator, but this looks like a way to access a Python iterator from Rust rather than the other way around.

How can I access a lazy iterator over RustStruct.v in Python? It's safe to assume that the type contained in the Vec always derives Copy and Clone, and answers which require some code on the Python end are okay (but less ideal).

score 0 · Answer 1 · edited Aug 25 '23 at 19:28

As you've pointed out, PyO3 is not designed to handle generic parameters, lifetimes included, on #[pyclass] types. In this case, that restriction is keeping you from doing a potentially dangerous thing with the lifetime parameter on the RustIterator you are trying to wrap. rustc can't analyze lifetimes over an FFI boundary like the Rust/Python boundary that PyO3 crosses, because the Python runtime decides how long the object lives. Therefore, you can only pass a wrapper if it is 'static + Send + Sync (a type that holds only owned data or 'static references needs no lifetime parameter; 'static is a built-in lifetime).
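
To make the 'static requirement concrete, here is a minimal std-only sketch with no PyO3 involved. The helper assert_static_send_sync and the types BorrowingIter and OwnedIter are hypothetical names introduced for illustration; the bound simply mirrors the 'static + Send + Sync requirement described above.

```rust
// Hypothetical helper: only compiles for types that are 'static + Send + Sync.
fn assert_static_send_sync<T: 'static + Send + Sync>() {}

// Borrowing iterator shaped like the one in the question.
#[allow(dead_code)]
struct BorrowingIter<'a> {
    position: usize,
    view: &'a Vec<isize>,
}

// Hypothetical owning counterpart: no lifetime parameter, so it is 'static.
#[allow(dead_code)]
struct OwnedIter {
    position: usize,
    view: Vec<isize>,
}

fn check_bounds() {
    // Accepted: OwnedIter holds no borrowed data.
    assert_static_send_sync::<OwnedIter>();
    // Accepted only because the borrow is pinned to 'static.
    assert_static_send_sync::<BorrowingIter<'static>>();
    // A BorrowingIter<'a> with any shorter 'a is rejected by the same bound.
}
```

The same bound is why a struct that borrows from RustStruct cannot be handed over as-is: its lifetime is tied to a value that Rust, not Python, controls.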

That means that (assuming you have access to the interior of the RustIterator) you could use unsafe code to change the lifetime of the &'a Vec<isize>. The compiler doesn't infer a lifetime when you dereference a raw pointer; it trusts that the lifetime is whatever you tell it. So you could do something like the example below, as long as you can be sure that the iterator you pass to the Python runtime will never outlive the Vec that the RustIterator borrows from. Just to be clear, nothing enforces that guarantee once Python owns the iterator, so this will probably do bad things to your library.

struct StaticRustIterator {
    position: usize,
    view: &'static Vec<isize>
}

fn make_iter_static<'a>(iter: RustIterator<'a>) -> StaticRustIterator {
    let RustIterator { position, view } = iter;
    StaticRustIterator {
        position,
        // Go through a raw pointer to erase 'a. This is only sound if the
        // borrowed Vec outlives the returned iterator.
        view: unsafe { &*(view as *const Vec<isize>) },
    }
}

Your other option, if you want to use PyO3, is to clone the Vec behind the view field of the RustIterator and make a 'static iterator that way. This costs an extra allocation and copy, but it is safe.

struct StaticRustIterator {
    position: usize,
    view: Vec<isize>
}

fn make_iter_static<'a>(iter: RustIterator<'a>) -> StaticRustIterator {
    let RustIterator { position, view } = iter;
    let static_iter = StaticRustIterator { position, view: view.clone() };
    static_iter
}
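
The snippet above only builds the owning struct. Below is a minimal sketch of how such a struct could then be stepped lazily, assuming the element type is Copy as the question states. The name OwnedRustIterator and its from_slice constructor are hypothetical, chosen to avoid clashing with the struct above.

```rust
// Hypothetical owning iterator: holds its own copy of the data.
struct OwnedRustIterator {
    position: usize,
    view: Vec<isize>,
}

impl OwnedRustIterator {
    // Pays for one copy of the data up front; no borrow is kept afterwards.
    fn from_slice(position: usize, view: &[isize]) -> Self {
        Self { position, view: view.to_vec() }
    }
}

impl Iterator for OwnedRustIterator {
    type Item = isize;

    // Yields elements by value and only advances while items remain.
    fn next(&mut self) -> Option<isize> {
        let item = self.view.get(self.position).copied();
        if item.is_some() {
            self.position += 1;
        }
        item
    }
}
```

In PyO3, a #[pyclass] wrapping a type like this can define __iter__ and __next__ in its #[pymethods] block, with __next__ forwarding to next(); the exact signatures depend on the PyO3 version in use.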

I personally wouldn't choose either of these approaches if I could help it. You could see if there is a way to get an atomically reference-counted pointer (Arc) to that view instead of getting a RustIterator. If you can rework some of the deeper, grittier parts of the library so that they are safe to pass over the FFI boundary, I would recommend doing that instead. Or maybe someone else will post here and reveal some trick to turn a reference into a weak pointer or something. This isn't a great answer, but hopefully it gives you some ideas.
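
The reference-counting idea could look roughly like the sketch below, assuming the library could be changed to store its data as Arc<Vec<isize>>. SharedStruct and SharedIter are hypothetical stand-ins, not types from the library in the question.

```rust
use std::sync::Arc;

// Hypothetical variant of RustStruct that shares its Vec through an Arc.
struct SharedStruct {
    v: Arc<Vec<isize>>,
}

// Hypothetical iterator: owns a handle to the data instead of borrowing it,
// so it needs no lifetime parameter.
struct SharedIter {
    position: usize,
    view: Arc<Vec<isize>>,
}

impl SharedStruct {
    fn iter(&self) -> SharedIter {
        // Arc::clone bumps a reference count; the Vec itself is not copied.
        SharedIter { position: 0, view: Arc::clone(&self.v) }
    }
}

impl Iterator for SharedIter {
    type Item = isize;

    // Yields elements by value and only advances while items remain.
    fn next(&mut self) -> Option<isize> {
        let item = self.view.get(self.position).copied();
        if item.is_some() {
            self.position += 1;
        }
        item
    }
}
```

The data stays alive for as long as any iterator holds a handle, so it no longer matters whether Python drops the iterator before or after the struct that created it.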
