I have a string column in a Parquet file that holds JSON, and I would like to parse it and unnest the result. In Python Polars I can do this:

df = df.select([pl.col("parameters").apply(json.loads)]).unnest("parameters")

I assume I need to do something similar to this in Rust (which I found in this answer):

fn part_split(row_id_series: Series) -> Result<Option<Series>, PolarsError> {
    let x = row_id_series
        .u32()
        .unwrap()
        .into_iter()
        .map(|val| Some(val.unwrap()/200_000 as u32))
        .collect::<UInt32Chunked>();
    Ok(Some(x.into_series()))
}

let o = GetOutput::from_type(DataType::UInt32);
df.with_column(col("row_id").alias("part").apply(part_split, o));

However, I cannot figure out what type the `collect` call should produce. The function should return a Struct series, but I cannot fill in the gap between parsing each string with serde_json and building that Struct.

fn unjsonify(json_fields: Series) -> Result<Option<Series>, PolarsError>  {
    let x = json_fields
        .utf8()
        .unwrap()
        .into_iter()
        .map(|s| serde_json::from_str(s)) //???????
        .collect::<?????>();
    Ok(Some(x.into_series()))
}
let o = GetOutput::from_type(DataType::?????);
df.with_column(col("parameters").apply(unjsonify, o)).unnest("parameters");
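
To make the missing step concrete: a Struct column is stored as one child column per field, so the row-wise parsed objects have to be pivoted into per-field columns before a Struct can be built from them. The following is only a sketch of that pivot using the standard library, not Polars or serde_json code. The function name `rows_to_columns` is hypothetical, and representing each parsed object as a list of `(field, value)` string pairs is a simplifying assumption; real code would work with parsed JSON values and typed Polars series.

```rust
use std::collections::BTreeMap;

/// Pivot row-wise records into one column per field.
/// `rows[i]` holds the (field, value) pairs parsed from row i (hypothetical
/// stand-in for a parsed JSON object); `None` marks a null or unparseable row.
/// Fields missing from a row are filled with `None` in that position.
fn rows_to_columns(
    rows: &[Option<Vec<(String, String)>>],
) -> BTreeMap<String, Vec<Option<String>>> {
    let mut columns: BTreeMap<String, Vec<Option<String>>> = BTreeMap::new();
    for (i, row) in rows.iter().enumerate() {
        if let Some(fields) = row {
            for (name, value) in fields {
                // Create the column on first sight, pre-filled with nulls
                // so every column ends up with exactly rows.len() entries.
                let column = columns
                    .entry(name.clone())
                    .or_insert_with(|| vec![None; rows.len()]);
                column[i] = Some(value.clone());
            }
        }
    }
    columns
}
```

Each entry of the resulting map corresponds to one field of the Struct. Every column has the same length as the input, which is the property a Struct column needs from its child columns.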
Bahadir Cambel