I have a Polars LazyFrame that, after applying several functions, looks like this:
┌───────────────┬──────────────┬─────────────────────────┬──────────────────────────┐
│ citing_patent ┆ cited_patent ┆ cited_patent_issue_date ┆ citing_patent_issue_date │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ date ┆ date │
╞═══════════════╪══════════════╪═════════════════════════╪══════════════════════════╡
│ X ┆ A ┆ 2000-10-20 ┆ 2001-02-08 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ X ┆ B ┆ 1999-08-04 ┆ 2001-02-08 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Y ┆ B ┆ 1999-08-04 ┆ 2004-06-04 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ N ┆ A ┆ 2000-10-20 ┆ 2021-12-20 │
└───────────────┴──────────────┴─────────────────────────┴──────────────────────────┘
I would like to group it by cited_patent, and have a column for the number of citing_patents within three years of cited_patent_issue_date.
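To make the target concrete, here is a minimal plain-Python sketch of the result I'm after, using only the four rows shown above. The names `add_years` and `count_citations_within` are made up for illustration and are not part of Polars; the leap-day handling is an assumption, and the check only applies the upper bound (citing date no later than cited date plus three years).

```python
from collections import defaultdict
from datetime import date

# Rows copied from the table above:
# (citing_patent, cited_patent, cited_patent_issue_date, citing_patent_issue_date)
rows = [
    ("X", "A", date(2000, 10, 20), date(2001, 2, 8)),
    ("X", "B", date(1999, 8, 4), date(2001, 2, 8)),
    ("Y", "B", date(1999, 8, 4), date(2004, 6, 4)),
    ("N", "A", date(2000, 10, 20), date(2021, 12, 20)),
]


def add_years(d, years):
    """Hypothetical helper: shift a date by whole calendar years.

    Assumption: Feb 29 falls back to Feb 28 when the target year
    is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def count_citations_within(rows, years=3):
    """Count, per cited patent, the citing patents issued no later than
    `years` calendar years after the cited patent's issue date."""
    counts = defaultdict(int)
    for _citing, cited, cited_date, citing_date in rows:
        # A bool adds as 0 or 1, so this increments only when in range.
        counts[cited] += citing_date <= add_years(cited_date, years)
    return dict(counts)


expected = count_citations_within(rows)
print(expected)
```

For the sample rows this gives a count of 1 for both A and B: N cites A more than 21 years later, and Y cites B almost five years later, so neither is counted.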
After reading 15741618, I tried using dateutil.relativedelta.
Here is the code I have so far:
    .groupby("cited_patent")
    .agg(
        [
            pl.col("cited_patent_issue_date").first(),
            (pl.col("citing_patent_issue_date") <= pl.col("cited_patent_issue_date").first() + relativedelta(years=3)).sum()
        ]
    )
However, this doesn't work, as I get an error:
pyo3_runtime.PanicException: could not convert value relativedelta(years=+3) as a Literal
I can't seem to find anything else on this, so I'm a bit stuck.
What's the recommended way to add years to dates in Polars?