I'm receiving RecordBatches serialized as bytes and I'm trying to deserialize them back into RecordBatches. Calling StreamReader::try_new() with the byte data and an empty Vec<usize> pleases the compiler, but when I call reader.next() I get an error.
I'm stuck because I'm not sure what the second parameter (the projection parameter) is supposed to be. It is typed as an Option<Vec<usize>>. When I print reader.schema(), the schema is correct, but it looks like I'm doing something wrong when reading the rest of the data into RecordBatch form.
let buf: Vec<usize> = Vec::new();
let mut reader = StreamReader::try_new(data.data.as_slice(), Some(buf))?;
while !reader.is_finished() {
    println!("scehma: {}", reader.schema());
    println!("next batch: {:?}", reader.next());
}
Output:
scehma: Field { name: "my_int64_column", data_type: Int64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: None }
next batch: Some(Err(InvalidArgumentError("at least one column must be defined to create a record batch")))
When I change buf to be non-empty, the error message changes to:
SchemaError("project index 1 out of bounds, max field 1")
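Both errors above are consistent with the projection being a list of zero-based column indices into the schema. The sketch below is a minimal, standard-library-only model of that reading; it is not the arrow crate's code. The function `project` and its error strings are hypothetical and only mirror the messages quoted above. It assumes that `None` selects every column, that an empty list selects no columns, and that any index at or beyond the field count is rejected.

```rust
/// Hypothetical model of a column projection (an assumption, not arrow's implementation).
/// `fields` stands in for the schema's field names.
fn project<'a>(
    fields: &[&'a str],
    projection: Option<Vec<usize>>,
) -> Result<Vec<&'a str>, String> {
    match projection {
        // Assumed behaviour: no projection means "keep every column".
        None => Ok(fields.to_vec()),
        Some(indices) => {
            // An empty projection selects zero columns, so no batch can be built.
            if indices.is_empty() {
                return Err("at least one column must be defined".to_string());
            }
            // Each index must refer to an existing field (zero-based).
            indices
                .iter()
                .map(|&i| {
                    fields.get(i).copied().ok_or_else(|| {
                        format!(
                            "project index {} out of bounds, max field {}",
                            i,
                            fields.len()
                        )
                    })
                })
                .collect()
        }
    }
}

fn main() {
    // A one-column schema, like the one printed in the question.
    let fields = ["my_int64_column"];

    println!("{:?}", project(&fields, None)); // all columns
    println!("{:?}", project(&fields, Some(Vec::new()))); // empty projection
    println!("{:?}", project(&fields, Some(vec![0]))); // the only valid index
    println!("{:?}", project(&fields, Some(vec![1]))); // one past the end
}
```

Under this reading, with a single-column schema the only valid explicit projection would be Some(vec![0]). Passing None as the second argument to StreamReader::try_new() should, on the same assumption, read every column without naming indices.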