I have a string column in a Parquet file that contains JSON, which I would like to unnest. In Python I would do:
df = df.select([pl.col("parameters").apply(json.loads)]).unnest("parameters")
I assume I need to do something similar to this in Rust (which I found in this answer):
fn part_split(row_id_series: Series) -> Result<Option<Series>, PolarsError> {
let x = row_id_series
.u32()
.unwrap()
.into_iter()
.map(|val| Some(val.unwrap()/200_000 as u32))
.collect::<UInt32Chunked>();
Ok(Some(x.into_series()))
}
let o = GetOutput::from_type(DataType::UInt32);
df.with_column(col("row_id").alias("part").apply(part_split, o)).
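For reference, the per-element logic inside part_split can be sketched in plain Rust without Polars. This is only an illustration of the mapping, not the Polars API: `part_index` and `ROWS_PER_PART` are hypothetical names, and unlike the `val.unwrap()` call above it keeps a null row id as `None` rather than panicking.

```rust
/// Number of rows per partition, taken from the snippet above.
const ROWS_PER_PART: u32 = 200_000;

/// Maps each optional row id to its partition index by integer division.
/// A `None` (null) row id stays `None` instead of causing a panic.
fn part_index(row_ids: &[Option<u32>]) -> Vec<Option<u32>> {
    row_ids
        .iter()
        .map(|val| val.map(|v| v / ROWS_PER_PART))
        .collect()
}
```

Row ids 0 through 199_999 land in partition 0, 200_000 starts partition 1, and so on.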
However, I cannot figure out what type the collect should produce. The return value should be a struct, but I cannot fill in the blanks between parsing with serde_json and building a Struct.
fn unjsonify(json_fields: Series) -> Result<Option<Series>, PolarsError> {
let x = json_fields
.utf8()
.unwrap()
.into_iter()
.map(|s| serde_json::from_str(s)) //???????
.collect::<?????>();
Ok(Some(x.into_series()))
}
let o = GetOutput::from_type(DataType::?????);
df.with_column(col("parameters").apply(unjsonify, o)).unnest("parameters").
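To make the target shape concrete: a struct column is, conceptually, one named child column per field, all of the same length, and unnest promotes those children to top-level columns. Below is a plain-Rust sketch of that row-to-column pivot, assuming the JSON strings have already been parsed into flat key/value maps. The parsing step (e.g. with serde_json) and the conversion into an actual Polars Struct are deliberately left out, and `Row` and `rows_to_columns` are hypothetical names, not Polars API.

```rust
use std::collections::BTreeMap;

/// One parsed JSON object: field name -> value (kept as strings for simplicity).
type Row = BTreeMap<String, String>;

/// Pivots row-wise records into one column per field name.
/// A row that is `None`, or that lacks a field, contributes `None` to that column.
fn rows_to_columns(rows: &[Option<Row>]) -> BTreeMap<String, Vec<Option<String>>> {
    // Start every field seen in any row with an all-null column of the right length.
    let mut columns: BTreeMap<String, Vec<Option<String>>> = BTreeMap::new();
    for row in rows.iter().flatten() {
        for name in row.keys() {
            columns
                .entry(name.clone())
                .or_insert_with(|| vec![None; rows.len()]);
        }
    }
    // Fill in the values that are actually present.
    for (i, row) in rows.iter().enumerate() {
        if let Some(row) = row {
            for (name, value) in row {
                if let Some(col) = columns.get_mut(name) {
                    col[i] = Some(value.clone());
                }
            }
        }
    }
    columns
}
```

Each entry of the resulting map corresponds to what would become one field of the struct, which is roughly the intermediate form the `?????` placeholders above would need to produce.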