I am using Spark 2.1 with PySpark scripts.
Problem statement: I have a scenario where I need to pass multiple columns as input and return one column as output. Below is my input DataFrame with 3 columns:
a   b   c
S   S   S
S   NS  NS
S   NS  S
S   S   NS
NS  S   NS
My output has to be as below:
a   b   c   d
S   S   S   S
S   NS  NS  NS
S   NS  S   S
S   S   NS  NS
NS  S   NS  NS
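To make the example reproducible, the sample rows above can be written down as plain Python data. This is only a sketch: the names `SAMPLE_ROWS` and `make_input_df` are hypothetical helpers, not part of my actual script, and `spark` is assumed to be an already created SparkSession.

```python
# The five sample rows from the tables above, as (a, b, c, expected d).
SAMPLE_ROWS = [
    ("S", "S", "S", "S"),
    ("S", "NS", "NS", "NS"),
    ("S", "NS", "S", "S"),
    ("S", "S", "NS", "NS"),
    ("NS", "S", "NS", "NS"),
]


def make_input_df(spark):
    """Build the 3-column input DataFrame (hypothetical helper).

    `spark` is assumed to be an existing SparkSession.
    """
    # Drop the expected d value so only the input columns a, b, c remain.
    rows = [(a, b, c) for a, b, c, _ in SAMPLE_ROWS]
    return spark.createDataFrame(rows, ["a", "b", "c"])
```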
I am trying to register a UDF that takes these 3 columns [a, b, c] as input and returns column d as output. Here a, b, c and d are the column names.
I am finding it difficult to get the output. Below is the syntax I used:
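from pyspark.sql import types
from pyspark.sql.functions import udf

def return_string(x):
    # x is expected to be a single struct/Row with the fields a, b and c
    if x.a == 'S' and x.b == 'S' and x.c == 'S':
        return 'S'
    elif x.a == 'S' and x.b == 'NS' and x.c == 'S':
        return 'S'
    elif x.a == 'S' and x.b == 'S' and x.c == 'NS':
        return 'NS'

func = udf(return_string, types.StringType())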
Can anyone please help me complete this logic?
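A minimal sketch of one way this might be wired up, assuming the UDF receives the three columns as three separate arguments rather than as a single object. The lookup table simply restates the five sample rows. The names `RULES` and `add_d_column` are hypothetical, and returning None (null in Spark) for combinations not listed in the sample is an assumption, since the expected value for those is not stated.

```python
# Mapping (a, b, c) -> d, copied from the five sample rows in the question.
RULES = {
    ("S", "S", "S"): "S",
    ("S", "NS", "NS"): "NS",
    ("S", "NS", "S"): "S",
    ("S", "S", "NS"): "NS",
    ("NS", "S", "NS"): "NS",
}


def return_string(a, b, c):
    """Return d for one row, or None for a combination not in the sample."""
    # Assumption: unknown combinations yield None, which Spark shows as null.
    return RULES.get((a, b, c))


def add_d_column(df):
    """Append column d to a DataFrame that has columns a, b and c."""
    # Imported here so the plain-Python part above can be tried without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    func = udf(return_string, StringType())
    # Each column is passed as its own argument to the UDF.
    return df.withColumn("d", func(df["a"], df["b"], df["c"]))
```

With an input DataFrame `df` holding columns a, b and c, `add_d_column(df).show()` would be the call to try. Passing the columns one by one keeps the Python function free of any Spark types, so it can be checked on its own before it is registered as a UDF.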