I would like to iterate over some gzipped TSV files (tsv.gz) hosted on the network, e.g. those at https://datasets.imdbws.com/, parse them row by row, and write only some of the columns to local files, ideally decompressing with flate2. However, I can't find any idiom for iterating over files fetched from a URI. Should I use an external crate like hyper and iterate over its Body? If so, how can I convert a Body into something that implements Read? Here is some base code:
use flate2::read::GzDecoder;
use hyper::Client;
use std::io::{BufRead, BufReader}; // BufRead is needed for .lines()
#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    let client = Client::new();
    let uri = "http://datasets.imdbws.com/title.basics.tsv.gz".parse()?;
    let body = client.get(uri).await?.into_body();
    let d = GzDecoder::new(body); // hyper::Body doesn't implement Read
    for line in BufReader::new(d).lines() {
        // do something with lines
    }
    Ok(())
}
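
For reference, the column-selection step can be sketched independently of the download step. The following is a minimal sketch using only the standard library; `write_columns` and its `columns` parameter (zero-based column indices) are hypothetical names introduced here for illustration. It accepts any `Read`, so it assumes the network and decompression layers can eventually be presented as a reader, for example a `GzDecoder` wrapped around a response type that implements `Read`.

```rust
use std::io::{self, BufRead, BufReader, Read, Write};

/// Copies the selected tab-separated columns of every line from `input`
/// to `output`. `columns` holds zero-based column indices (hypothetical
/// parameter, not part of the original code). Indices past the end of a
/// row are silently skipped.
fn write_columns<R: Read, W: Write>(
    input: R,
    mut output: W,
    columns: &[usize],
) -> io::Result<()> {
    // Buffer the reader so that .lines() reads one row at a time
    // instead of loading the whole file into memory.
    for line in BufReader::new(input).lines() {
        let line = line?;
        let fields: Vec<&str> = line.split('\t').collect();
        let picked: Vec<&str> = columns
            .iter()
            .filter_map(|&i| fields.get(i).copied())
            .collect();
        writeln!(output, "{}", picked.join("\t"))?;
    }
    Ok(())
}
```

Because the function is generic over `Read` and `Write`, it can be exercised with an in-memory byte slice and a `Vec<u8>`, and the same call should work unchanged once `input` is a decompressing reader over the HTTP response and `output` is a `File`.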