I have a PySpark DataFrame with an integer column "index" and a double column "feature". I also have a Python list, parameter, whose length equals the number of unique values in the "index" column. I would like to generate a new column as follows: for each row, if the value of "index" is i, the new column should be "feature" multiplied by parameter[i].
When parameter has only a few elements, I can chain when() calls and finish with otherwise() to produce the output, as sketched below. How should I do this when parameter has a large number of elements?
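For reference, this is roughly what I do today for the small case (a minimal sketch; the DataFrame rows and the parameter values are made-up toy data, only the column names "index" and "feature" are from my actual setup):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Toy data: "index" is an integer key, "feature" is a double (hypothetical values).
df = spark.createDataFrame(
    [(0, 1.5), (1, 2.0), (2, 0.5)],
    ["index", "feature"],
)
parameter = [10.0, 20.0, 30.0]  # one entry per distinct "index" value

# Hand-written chain of when() conditions, one per element of parameter.
new_col = (
    F.when(F.col("index") == 0, F.col("feature") * parameter[0])
     .when(F.col("index") == 1, F.col("feature") * parameter[1])
     .otherwise(F.col("feature") * parameter[2])
)

df.withColumn("scaled_feature", new_col).show()
```

This clearly does not scale when parameter contains thousands of entries, which is the case I am asking about.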