
I'm trying to build an `Iterator` on top of the csv crate that yields each row as a HashMap of column name to value, and I'm running into a lifetime error I cannot figure out.

For the code below:

use csv::{
    Reader,
    StringRecord,
    StringRecordsIter,
};
use std::collections::HashMap;
use std::fs::File;


pub struct Handler {
    pub reader: Reader<File>
}

impl Handler {
    pub fn new(file: File) -> Handler {
        Handler { reader: Reader::from_reader(file) }
    }
}


// type Row = HashMap<String, String>;
pub struct Row<'r> {
    number: usize,
    fields: HashMap<&'r str, &'r str>,
}


pub struct CSVIterator<'f> {
    current_row: usize,
    headers: StringRecord,
    records: StringRecordsIter<'f, File>,
}

impl<'f> CSVIterator<'f> {
    pub fn new(handler: &'f mut Handler) -> CSVIterator<'f> {
        CSVIterator {
            current_row: 0,
            headers: handler.reader.headers().unwrap().clone(),
            records: handler.reader.records(),
        }
    }
}

impl<'f> Iterator for CSVIterator<'f> {
    type Item = Row<'f>;

    fn next(&mut self) -> Option<Self::Item> {
        let next_record = self.records.next();

        if next_record.is_none() {
            return None;
        }

        let record = next_record.unwrap().unwrap();
        let fields = make_fields(&record, &self.headers);
        let row = Row {
            number: self.current_row,
            fields: fields,
        };

        return Some(row)
    }
}


fn make_fields<'r>(
    record: &'r StringRecord, header: &'r StringRecord
) -> HashMap<&'r str, &'r str> {
    let mut row: HashMap<&str, &str> = HashMap::new();
    for (colname, value) in header.iter().zip(record) {
        row.insert(colname, value);
    }
    row
}

I'm getting the following error:

error[E0495]: cannot infer an appropriate lifetime for borrow expression due to conflicting requirements
  --> src/csvio.rs:55:43
   |
55 |         let fields = make_fields(&record, &self.headers);
   |                                           ^^^^^^^^^^^^^
   |
note: first, the lifetime cannot outlive the anonymous lifetime defined here...
  --> src/csvio.rs:47:13
   |
47 |     fn next(&mut self) -> Option<Self::Item> {
   |             ^^^^^^^^^
note: ...so that reference does not outlive borrowed content
  --> src/csvio.rs:55:43
   |
55 |         let fields = make_fields(&record, &self.headers);
   |                                           ^^^^^^^^^^^^^
note: but, the lifetime must be valid for the lifetime `'f` as defined here...
  --> src/csvio.rs:44:6
   |
44 | impl<'f> Iterator for CSVIterator<'f> {
   |      ^^
note: ...so that the types are compatible
  --> src/csvio.rs:47:46
   |
47 |       fn next(&mut self) -> Option<Self::Item> {
   |  ______________________________________________^
48 | |         let next_record = self.records.next();
49 | |
50 | |         if next_record.is_none() {
...  |
61 | |         return Some(row)
62 | |     }
   | |_____^
   = note: expected `<CSVIterator<'f> as Iterator>`
              found `<CSVIterator<'_> as Iterator>`

For more information about this error, try `rustc --explain E0495`.

I may not intuitively understand the lifetime requirements for the `next` method here. Can someone point me in the right direction?

Thanks!

Herohtar
kemri
  • Does this answer your question? [How do I write an iterator that returns references to itself?](https://stackoverflow.com/questions/30422177/how-do-i-write-an-iterator-that-returns-references-to-itself) The problem with your code is effectively the same as in that question -- part of the item you are trying to return borrows data from the iterator itself, which isn't possible in Rust due to the signature of `Iterator::next`. __The items returned by an iterator are allowed to outlive the iterator itself.__ – cdhowie May 06 '22 at 14:33
  • As a side note, you can `.collect()` from an iterator of 2-tuples directly into a `HashMap`, so you can replace the entire body of `make_fields()` with `header.iter().zip(record).collect()`. – cdhowie May 06 '22 at 14:37
  • Note you're also going to have a problem because you borrow from the local `record`, even if you fix the lifetime issue with the headers. Basically, the iterated type should be `HashMap<String, String>` instead of `HashMap<&'r str, &'r str>` because the actual read strings aren't valid for all of `'r`. – cdhowie May 06 '22 at 14:41
  • So a signature of `HashMap<String, String>` works; I was just trying to avoid having to clone/copy data for every row of the CSV. But maybe that's just the simplest (or least ridiculous) way to do this. – kemri May 06 '22 at 14:57
  • I'm not sure I understand why Rust doesn't allow an iterator to return data borrowed from itself either, but I'm reading up right now. Thanks for the link! – kemri May 06 '22 at 14:59
  • The string data has to live somewhere. If you want to return a borrowed value (`&str`) then the value you're _borrowing from_ has to be valid for all of `'r`, but right now you have nowhere to obtain such values. One is tied to the life of the iterator (`self.headers`) and the other is limited to the execution of `Iterator::next` (`record`), both of which have lifetimes shorter than `'r`. To make this work with borrowing you'd have to stuff the strings onto the `Handler` somewhere, because that's the only thing you have access to that's valid for all of `'r`. – cdhowie May 06 '22 at 15:00
  • Unless you're dealing with millions of records (and even then) performance is unlikely to be a huge deal here. The biggest issue is where would you store the `StringRecord`? In theory you could put a `Vec<StringRecord>` on the `Handler`, but in practice that's not going to work since you would have to read all of the records in advance (you can't hand out references to elements in a `Vec` while you continue to push elements, because pushing requires a mutable borrow). – cdhowie May 06 '22 at 22:45
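
A minimal sketch of the owned-row approach the comments above converge on, not a drop-in fix for the code in the question. To keep it runnable with the standard library alone, it assumes a naive comma split over `std::io` lines as a stand-in for the csv crate (no quoting or escaping). `OwnedRow` and `OwnedRows` are hypothetical names introduced only for illustration. The point is the item type: each row owns its `String`s, so nothing in it borrows from the iterator or from a local record.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead, BufReader, Lines, Read};

/// A row that owns its data, so it can outlive the iterator that produced it.
#[derive(Debug)]
pub struct OwnedRow {
    pub number: usize,
    pub fields: HashMap<String, String>,
}

/// Hypothetical stand-in for `CSVIterator`. Assumption: fields are split on
/// ',' only; the csv crate's real parsing is not reproduced here.
pub struct OwnedRows<R: Read> {
    current_row: usize,
    headers: Vec<String>,
    lines: Lines<BufReader<R>>,
}

impl<R: Read> OwnedRows<R> {
    pub fn new(source: R) -> io::Result<Self> {
        let mut lines = BufReader::new(source).lines();
        // Treat the first line as the header row.
        let headers: Vec<String> = match lines.next() {
            Some(line) => line?.split(',').map(str::to_owned).collect(),
            None => Vec::new(),
        };
        Ok(OwnedRows { current_row: 0, headers, lines })
    }
}

impl<R: Read> Iterator for OwnedRows<R> {
    // No lifetime parameter: the item borrows nothing from `self`.
    type Item = io::Result<OwnedRow>;

    fn next(&mut self) -> Option<Self::Item> {
        let line = match self.lines.next()? {
            Ok(line) => line,
            Err(e) => return Some(Err(e)),
        };
        // Copy each header/value pair into owned Strings and collect the
        // 2-tuples directly into a HashMap.
        let fields = self
            .headers
            .iter()
            .cloned()
            .zip(line.split(',').map(str::to_owned))
            .collect();
        let row = OwnedRow { number: self.current_row, fields };
        // Advance the counter so each row gets its own number.
        self.current_row += 1;
        Some(Ok(row))
    }
}
```

Because every `OwnedRow` owns its strings, the rows can be collected into a `Vec` and used after the iterator is gone, which is exactly what the signature of `Iterator::next` has to allow. The cost is one allocation per field per row.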

0 Answers