
I have a Polars dataframe with a column of type str that holds dates and times in the format 2020-03-02T13:10:42.550. I want to convert this column to the pl.Datetime type.

After reading the post "Easily convert string column to pl.datetime in Polars", I came up with:

df = df.with_column(pl.col('EventTime').str.strptime(pl.Datetime, fmt="%Y-%m-%dT%H:%M:%f", strict=False))

However, the values in my column "EventTime" are all null.

Many Thanks!

Johnas
  • Besides the missing seconds directive `%S`, **it's `%.f` for fractional seconds**, not `.%f` as you would use in vanilla Python. – FObersteiner Dec 13 '22 at 08:33

1 Answer


You were close. You forgot the seconds component of your format specifier:

(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      fmt="%Y-%m-%dT%H:%M:%S%.f",
                      strict=False)
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────────┐
│ EventTime               ┆ parsed EventTime        │
│ ---                     ┆ ---                     │
│ str                     ┆ datetime[ns]            │
╞═════════════════════════╪═════════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42.550 │
└─────────────────────────┴─────────────────────────┘
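
To make the difference mentioned in the comment on the question concrete, here is a minimal sketch that uses only Python's standard `datetime` module, not Polars. It shows that the format from the question fails because the seconds directive is missing, and that plain Python spells fractional seconds as `.%f` where the Polars format above uses `%.f`. The variable names are just for illustration. Note that plain Python raises an error on a failed parse, whereas the question's Polars call with strict=False produced nulls instead.

```python
from datetime import datetime

raw = "2020-03-02T13:10:42.550"

# The format from the question has no seconds directive (%S),
# so the string cannot be matched and strptime raises ValueError.
try:
    datetime.strptime(raw, "%Y-%m-%dT%H:%M:%f")
    question_format_ok = True
except ValueError:
    question_format_ok = False

# Standard-library spelling: a literal dot followed by %f.
# (The Polars format string in the answer writes this as %.f instead.)
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f")

print(question_format_ok)                         # False
print(parsed.isoformat(timespec="milliseconds"))  # 2020-03-02T13:10:42.550
```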

BTW, the format you are using is a standard one that Polars recognizes automatically, so you can omit the format specifier altogether.

(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      strict=False)
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────────┐
│ EventTime               ┆ parsed EventTime        │
│ ---                     ┆ ---                     │
│ str                     ┆ datetime[μs]            │
╞═════════════════════════╪═════════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42.550 │
└─────────────────────────┴─────────────────────────┘

Edit

And what if I would like to ignore the milliseconds (the "%.f")? If I just leave it out, the column can't be parsed properly.

We need to allow Polars to parse the date string according to the actual format of the string.

That said, after the parsing, we can use dt.truncate to throw away the fractional part.

(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      strict=False)
        .dt.truncate('1s')
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────┐
│ EventTime               ┆ parsed EventTime    │
│ ---                     ┆ ---                 │
│ str                     ┆ datetime[μs]        │
╞═════════════════════════╪═════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42 │
└─────────────────────────┴─────────────────────┘
  • Oh wow..didn't know that's the standard, thank you!! – Johnas Sep 14 '22 at 13:15
  • It’s one of several standards that Polars will automatically attempt. The `fmt` specifier is for cases where you have non-standard formats (or you want to enforce a particular format). –  Sep 14 '22 at 13:28
  • And what if I would like to ignore the milliseconds (the "%.f")? If I just leave it out, the column can't be parsed properly. – Johnas Sep 19 '22 at 12:49
  • I've added a section for how to throw away the fractional seconds. –  Sep 19 '22 at 18:42