1

I'm using the Rust csv crate to read CSV files. I want to give the user the option to take only the first x records from the CSV.

Given a function like fn read_records(csv_reader: csv::Reader, max_records: Option<usize>) -> ?, I want to do the below:

use std::fs::File;
use std::io::BufReader;

use csv as csv_crate;

use self::csv_crate::StringRecordsIntoIter;

/// Read a csv, and print the first n records
fn read_csv_repro(
    mut file: File,
    max_read_records: Option<usize>,
) {
    let mut csv_reader = csv::ReaderBuilder::new()
        .from_reader(BufReader::new(file.try_clone().unwrap()));

    let records: Box<StringRecordsIntoIter<std::io::BufReader<std::fs::File>>> = match max_read_records {
        Some(max) => {
            Box::new(csv_reader.into_records().take(max).into_iter())
        },
        None => {
            Box::new(csv_reader.into_records().into_iter())
        }
    };

    for result in records
    {
        let record = result.unwrap();

        // do something with record, e.g. print values from it to console
        let string: Option<&str> = record.get(0);
        println!("First record is {:?}", string);
    }
}

fn main() {
    read_csv_repro(File::open("csv_test.csv").unwrap(), Some(10));
}

I can't get this code to compile. The compiler reports the following error:

error[E0308]: mismatched types
  --> src/main.rs:18:22
   |
18 |             Box::new(csv_reader.into_records().take(max).into_iter())
   |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `csv::reader::StringRecordsIntoIter`, found struct `std::iter::Take`
   |
   = note: expected type `csv::reader::StringRecordsIntoIter<_>`
              found type `std::iter::Take<csv::reader::StringRecordsIntoIter<_>>`

How can I get the above code to work?

Shepmaster
nevi_me
  • See also [Conditionally iterate over one of several possible iterators](https://stackoverflow.com/q/29760668/155423) for a version of this that requires no heap allocation – Shepmaster Jan 06 '19 at 17:10
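The allocation-free idea mentioned in the comment above can be sketched with a small hand-written enum. This is only an illustration, not the code from the linked question: the names `MaybeLimited` and `limit` are hypothetical, and the sketch is generic over any iterator so it can be tried without the csv crate.

```rust
/// Hypothetical helper enum: holds one of two iterator types and
/// forwards `next` to whichever variant is present. No Box is needed.
enum MaybeLimited<I: Iterator> {
    Limited(std::iter::Take<I>),
    Unlimited(I),
}

impl<I: Iterator> Iterator for MaybeLimited<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            MaybeLimited::Limited(iter) => iter.next(),
            MaybeLimited::Unlimited(iter) => iter.next(),
        }
    }
}

/// Wrap `iter` so that it yields at most `max` items when a limit is given.
fn limit<I: Iterator>(iter: I, max: Option<usize>) -> MaybeLimited<I> {
    match max {
        Some(n) => MaybeLimited::Limited(iter.take(n)),
        None => MaybeLimited::Unlimited(iter),
    }
}
```

Because both match arms return the same enum type, the result can be stored in one variable. Assuming the record iterator from the question is passed in, a call might look like `limit(csv_reader.into_records(), max_read_records)`.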

2 Answers

3

While Nate's answer works for this specific case, the more general solution here is to use trait objects. My impression is that this is what you were intending to do by using Box here. Otherwise, in Nate's solution, the use of Box is completely superfluous.

Here is code that uses trait objects without needing to do take(std::usize::MAX) (using Rust 2018):

use std::fs::File;
use std::io::BufReader;

/// Read a csv, and print the first n records
fn read_csv_repro(
    file: File,
    max_read_records: Option<usize>,
) {
    let csv_reader = csv::ReaderBuilder::new()
        .from_reader(BufReader::new(file.try_clone().unwrap()));

    let records: Box<dyn Iterator<Item = csv::Result<csv::StringRecord>>> =
        match max_read_records {
            Some(max) => {
                Box::new(csv_reader.into_records().take(max).into_iter())
            },
            None => {
                Box::new(csv_reader.into_records().into_iter())
            }
        };

    for result in records
    {
        let record = result.unwrap();

        // do something with record, e.g. print values from it to console
        let string: Option<&str> = record.get(0);
        println!("First record is {:?}", string);
    }
}

fn main() {
    read_csv_repro(File::open("csv_test.csv").unwrap(), Some(10));
}
BurntSushi5
  • I had tried `Box>>` but was getting the error. It looks like using `csv::Result` is what I needed (I was struggling to convert my expected error into a csv Error type). Thanks – nevi_me Jan 05 '19 at 19:09
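The trait-object pattern from this answer can also be tried without the csv crate. The following is a minimal sketch under that assumption: the function `first_fields` is hypothetical, and it splits plain text lines on commas instead of parsing real CSV. What it shows is that both match arms are boxed into the same `Box<dyn Iterator<...>>` type even though the underlying iterator types differ.

```rust
/// Return the first comma-separated field of each line,
/// optionally capped at `max` lines. (Hypothetical helper.)
fn first_fields(input: &str, max: Option<usize>) -> Vec<String> {
    // `Take<Lines>` and `Lines` are different types; boxing them as
    // trait objects gives both arms a single common type.
    let lines: Box<dyn Iterator<Item = &str> + '_> = match max {
        Some(n) => Box::new(input.lines().take(n)),
        None => Box::new(input.lines()),
    };

    lines
        .map(|line| line.split(',').next().unwrap_or("").to_string())
        .collect()
}
```

The cost of this approach is one heap allocation and dynamic dispatch on each `next` call, which is usually negligible next to file I/O.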
0

You have to call take(std::usize::MAX) when max_records is None. It's annoying, but both iterators must have the same type to be stored in the same variable. Also, the .into_iter() calls that you added have no effect, because you were calling them on values that are already iterators.

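use std::fs::File;
use std::io::BufReader;

use csv::StringRecordsIntoIter;

fn read_csv_repro(file: File, max_read_records: Option<usize>) {
    // into_records() consumes the reader, so it does not need to be mutable
    let csv_reader = csv::Reader::from_reader(BufReader::new(file));
    // Both branches call take(), so both produce the same concrete type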
    let records: Box<std::iter::Take<StringRecordsIntoIter<std::io::BufReader<std::fs::File>>>> = match max_read_records {
        Some(max) => {
            Box::new(csv_reader.into_records().take(max))
        },
        None => {
            Box::new(csv_reader.into_records().take(std::usize::MAX))
        }
    };
}
Nate
  • 1
    Thanks, I didn't know about usize::MAX. Since I use `take()` in both branches, I've simplified my `records` to `csv_reader.into_records().take(max_read_records.unwrap_or(std::usize::MAX))`. – nevi_me Jan 05 '19 at 04:14
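The simplification described in this comment can be sketched as a small generic helper. The name `take_up_to` is hypothetical, and the sketch works on any iterator rather than on the csv reader specifically. Since `take` is called unconditionally, there is a single concrete return type and no `Box` or `match` is needed.

```rust
/// Yield at most `max` items, or effectively all of them when `max` is None.
/// (Hypothetical helper; `usize::MAX` acts as a "no limit" sentinel.)
fn take_up_to<I: Iterator>(iter: I, max: Option<usize>) -> std::iter::Take<I> {
    iter.take(max.unwrap_or(usize::MAX))
}
```

Strictly speaking, the `None` case still caps the iterator at `usize::MAX` items, which is not a practical limit for a CSV file.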