When using pandarallel to spread .apply calls on my dataframes across all cores, I came across a syntax I had never seen before. Or rather, it's a way of using dot syntax that I don't understand.
import pandas as pd
from pandarallel import pandarallel
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])
So far so good, just setting up a dataframe. Next, to get pandarallel ready, we do:
pandarallel.initialize()
Next up is the bit where I am confused: to use pandarallel, we call this method on the dataframe:
df.parallel_apply(func)
My question is: if the dataframe df was instantiated using the pandas library, and pandas does not have a method called parallel_apply, how is it that Python knows to use the pandarallel method on the pandas object?
I presume it's something to do with the initialization, but I have never seen this before and I don't understand what's happening in the back end.
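To make my guess concrete, here is a minimal sketch of the kind of mechanism I imagine could be behind this: an initialize function that attaches a new method to an existing class at runtime. This is only an assumption about the general technique, not pandarallel's actual code. Frame is a hypothetical stand-in for a DataFrame, and _parallel_apply just delegates to apply without doing anything in parallel.

```python
class Frame:
    """Hypothetical stand-in for a DataFrame; defined WITHOUT parallel_apply."""

    def __init__(self, rows):
        self.rows = rows

    def apply(self, func):
        # Apply func to each row and collect the results.
        return [func(row) for row in self.rows]


def _parallel_apply(self, func):
    # Stand-in body: simply delegates to apply, no real parallelism here.
    return self.apply(func)


def initialize():
    # Attach the function to the class itself, after the class was defined.
    # Attribute lookup on an instance falls back to its class, so every
    # instance (including ones created earlier) sees the new method.
    Frame.parallel_apply = _parallel_apply


df = Frame([[1, 2, 3], [4, 5, 6]])
had_method_before = hasattr(df, "parallel_apply")  # checked before initialize()
initialize()
has_method_after = hasattr(df, "parallel_apply")   # checked after initialize()
result = df.parallel_apply(sum)                    # sums each row
```

If something like this is what happens, it would explain why df.parallel_apply only exists after the initialize call, even though df was created by a class that never defined it.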