I need to write my own expression in polars_lazy. Based on my reading of the source code, I need to write a function that returns Expr::Function. The problem is that constructing an object of this type requires an object of type FunctionOptions. The caveat is that this struct is public but its members are pub(crate), so outside of the crate one cannot construct such an object. Are there ways around this?
2 Answers
Personally I think the Rust API for polars is not well documented enough to really use yet. Although the other answer and comments mention apply and map, they don't mention how or the trade-offs. I hope this answer prompts others to correct me with the "right" way to do things.
So first, here's how to use apply on a lazy dataframe, mutating the column in place (lazy dataframes don't take apply directly as a method the way eager ones do):
// not sure how you'd find this type easily from apply documentation
let o = GetOutput::from_type(DataType::UInt32);
// this mutates two in place
let lf = lf.with_column(col("two").apply(str_to_len, o));
And here's how to use it while not mutating the source column and adding a new output column instead:
let o = GetOutput::from_type(DataType::UInt32);
// this adds new column len, two is unchanged
let lf = lf.with_column(col("two").alias("len").apply(str_to_len, o));
With str_to_len looking like:
fn str_to_len(str_val: Series) -> Result<Series> {
    let x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        // your actual custom function would be in this map
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect::<UInt32Chunked>();
    Ok(x.into_series())
}
Note that it takes Series rather than &Series and wraps the return value in Result.
With a regular (non-lazy) dataframe, apply still mutates but doesn't require with_column:
df.apply("two", str_to_len).expect("applied");
Whereas the eager (non-lazy) with_column doesn't require apply:
// the function that builds the column also names it
df.with_column(str_to_len(df.column("two").expect("has two"))).expect("with_column");
And str_to_len has a slightly different signature:
fn str_to_len(str_val: &Series) -> Series {
    let mut x = str_val
        .utf8()
        .unwrap()
        .into_iter()
        .map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))
        .collect::<UInt32Chunked>();
    // NB. this is naming the chunked array, before we even get to a series
    x.rename("len");
    x.into_series()
}
I know there are reasons to have lazy and eager operate differently, but I wish the Rust documentation made this easier to figure out.

-
Great examples, thanks! Is there a need for, or any functionality to, parallelize apply/map? – Anatoly Bugakov Apr 29 '22 at 20:58
-
i.e. you could do `use polars::export::rayon::iter::ParallelIterator;` and then replace .into_iter() with .par_iter() ... do you think such parallelisation would be beneficial for performance? – Anatoly Bugakov Apr 30 '22 at 17:05
-
Could you explain why to use map twice in `map(|opt_name: Option<&str>| opt_name.map(|name: &str| name.len() as u32))`? Why not `map(|opt_name: Option<&str>| opt_name.unwrap().len())`? – Crispy13 Jul 09 '23 at 03:12
-
Re @Crispy13 - I don't want to panic by calling `unwrap` on a `None`, so the inner `map` protects me from that. Could instead use `filter_map` too. @Anatoly Bugakov - what parallelization helps depends on what work your function is doing. – Alex Moore-Niemi Jul 10 '23 at 13:55
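To make the `map` versus `unwrap` versus `filter_map` point from the comments concrete, here is a minimal sketch in plain Rust. It uses a slice of `Option<&str>` as a stand-in for a nullable string column; the function names `lens_keep_nulls` and `lens_drop_nulls` are made up for this illustration and are not part of polars.

```rust
/// Length of each present string. A `None` stays `None`, so the output
/// has exactly as many entries as the input.
fn lens_keep_nulls(values: &[Option<&str>]) -> Vec<Option<u32>> {
    values
        .iter()
        // inner `map` only runs the closure when a value is present
        .map(|opt| opt.map(|s| s.len() as u32))
        .collect()
}

/// Length of each present string. Missing values are dropped, so the
/// output can be shorter than the input.
fn lens_drop_nulls(values: &[Option<&str>]) -> Vec<u32> {
    values
        .iter()
        // `filter_map` discards every `None` produced by the closure
        .filter_map(|opt| opt.map(|s| s.len() as u32))
        .collect()
}

fn main() {
    let column = [Some("a"), None, Some("ccc")];
    println!("{:?}", lens_keep_nulls(&column));
    println!("{:?}", lens_drop_nulls(&column));
    // By contrast, `opt.unwrap().len()` would panic on the `None` entry.
}
```

Because `filter_map` changes the number of rows, the nested `map` is probably the safer choice when the result has to line up with the other columns of a dataframe.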
I don't think you're meant to directly construct Exprs. Instead, you can use functions like polars_lazy::dsl::col() and polars_lazy::dsl::lit() to create expressions, then use methods on Expr to build up the expression. Several of those methods, such as map() and apply(), will give you an Expr::Function.
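The general pattern can be sketched with a toy model in plain Rust. Everything here (`toy_dsl`, `Options`, the `elementwise` field, and an `apply` that takes only a name) is a hypothetical stand-in and not the polars API; it only illustrates why a public type with non-public fields has to be built through the library's own functions.

```rust
mod toy_dsl {
    // Hypothetical options struct: the type is public, but its field is
    // private to this module, so code outside cannot write a struct literal.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Options {
        elementwise: bool,
    }

    #[derive(Debug, Clone, PartialEq)]
    pub enum Expr {
        Column(String),
        Function {
            input: Box<Expr>,
            name: String,
            options: Options,
        },
    }

    /// Entry point: the only way callers start an expression.
    pub fn col(name: &str) -> Expr {
        Expr::Column(name.to_string())
    }

    impl Expr {
        /// Wraps `self` in a `Function` variant; the module fills in `Options`.
        pub fn apply(self, name: &str) -> Expr {
            Expr::Function {
                input: Box::new(self),
                name: name.to_string(),
                options: Options { elementwise: false },
            }
        }

        pub fn is_function(&self) -> bool {
            matches!(self, Expr::Function { .. })
        }
    }
}

use toy_dsl::col;

fn main() {
    // Writing `toy_dsl::Options { elementwise: false }` here would not
    // compile, because the field is private; going through `apply` works.
    let expr = col("two").apply("str_to_len");
    println!("{:?}", expr);
}
```

In this sketch the caller never touches `Options` at all, which mirrors the advice above: start from a constructor function and let the methods on the expression type produce the function variant.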

-
But I need an expression that is not available in polars_lazy. I have a utf8 column and need to do some custom massaging that cannot be obtained by combining existing expressions. – dkla Dec 17 '21 at 15:45
-
I have 'partially' solved this by using eager but it would be better to do it in lazy. – dkla Dec 17 '21 at 15:46
-
As the answer correctly states, you can use the apply expression to apply a custom closure over your string data. – ritchie46 Dec 19 '21 at 06:45