I am trying to create a new column in a Koalas dataframe df. The dataframe has 2 columns: col1 and col2. I need to create a new column newcol as the median of the col1 and col2 values.
import numpy as np
import databricks.koalas as ks
# df is Koalas dataframe
df = df.assign(newcol=lambda x: np.median(x.col1, x.col2).astype(float))
But I get the following error:
PandasNotImplementedError: The method pd.Series.__iter__() is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.
I also tried:
df.newcol = df.apply(lambda x: np.median(x.col1, x.col2), axis=1)
But that didn't work either.
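For reference, a minimal sketch of what I am after. Two things to note about the attempts above: the second positional argument of np.median is axis, so np.median(x.col1, x.col2) does not treat col2 as data, and np.median most likely tries to iterate over the Koalas Series, which would explain the error. Since there are only two columns, the row-wise median is simply the mean of the two values, so plain column arithmetic should be enough. The helper names median_of_two and add_row_median are hypothetical, and the sketch assumes the dataframe supports column arithmetic and .assign (both Koalas and pandas do).

```python
import statistics


def median_of_two(a, b):
    """Median of exactly two numbers, which equals their arithmetic mean."""
    return (a + b) / 2.0


def add_row_median(df):
    """Hypothetical helper: add 'newcol' as the row-wise median of col1 and col2.

    Assumes df is a Koalas (or pandas) dataframe with numeric columns
    'col1' and 'col2'. Only column arithmetic is used, so no Python-level
    iteration over a Series is required.
    """
    return df.assign(newcol=(df["col1"] + df["col2"]) / 2.0)


# Sanity check of the identity on plain numbers: for two values,
# the median and the mean coincide.
for a, b in [(1, 3), (2.5, 4), (-1, 7)]:
    assert median_of_two(a, b) == statistics.median([a, b])
```

With a Koalas dataframe this would be called as df = add_row_median(df). For more than two columns the identity no longer holds, and a different approach would be needed.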