I am new to polars and am looking for the equivalent of pandas' groupby.apply(drop_duplicates). Here is the code snippet I need to translate:
import pandas as pd
GROUP = list('123231232121212321')
OPERATION = list('AAABBABAAAABBABBBA')
BATCH = list('777898897889878987')
df_input = pd.DataFrame({'GROUP':GROUP, 'OPERATION':OPERATION, 'BATCH':BATCH})
df_output = df_input.groupby('GROUP').apply(lambda x: x.drop_duplicates())
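I tried the following, but it does not produce the output I need: it deduplicates each column independently and returns one list per column for each group, instead of dropping duplicate rows within each group.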
import polars as pl
GROUP = list('123231232121212321')
OPERATION = list('AAABBABAAAABBABBBA')
BATCH = list('777898897889878987')
df_input = pl.DataFrame({'GROUP':GROUP, 'OPERATION':OPERATION, 'BATCH':BATCH})
df_output = df_input.groupby('GROUP').agg(pl.all().unique())
If I take only one group, I get the result I want for that group:
df_part = df_input.filter(pl.col('GROUP')=='2')
df_part[['OPERATION', 'BATCH']].unique()
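To make the target concrete, here is a minimal library-free sketch of the result I expect for every group at once: the distinct (OPERATION, BATCH) rows of each GROUP, in order of first appearance. The helper name `unique_rows_per_group` is made up for illustration and is not a pandas or polars function.

```python
from collections import defaultdict

GROUP = list('123231232121212321')
OPERATION = list('AAABBABAAAABBABBBA')
BATCH = list('777898897889878987')


def unique_rows_per_group(groups, operations, batches):
    """Return {group: [(operation, batch), ...]} with duplicate rows dropped.

    Hypothetical helper: it only describes the desired output and is not
    part of any library.
    """
    # A dict keeps insertion order, so it serves as an ordered set of rows.
    seen = defaultdict(dict)
    for group, operation, batch in zip(groups, operations, batches):
        seen[group].setdefault((operation, batch), None)
    # Sort by group key so the groups come out as '1', '2', '3'.
    return {group: list(rows) for group, rows in sorted(seen.items())}


expected = unique_rows_per_group(GROUP, OPERATION, BATCH)
for group, rows in expected.items():
    print(group, rows)
```

For group '2' this gives the same rows as the single-group filter above, up to row order.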
Does anybody know how to do this for all groups at once?