I have a DataFrame in Spark and I want to manually map the values of one of its columns:
Col1
Y
N
N
Y
N
Y
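In case it helps, this is roughly how the sample DataFrame can be built (the column name and values are just for illustration):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Single-column DataFrame with the categorical values shown above
df = spark.createDataFrame([("Y",), ("N",), ("N",), ("Y",), ("N",), ("Y",)], ["Col1"])
df.show()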
I want "Y" to be equal to 1 and "N" to be equal to 0, like this:
Col1
1
0
0
1
0
1
I have tried StringIndexer, but I think it encodes the categorical data by label frequency rather than by a mapping I choose (I am not sure).
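This is roughly what I tried (Col1_indexed is just a placeholder output column name):

from pyspark.ml.feature import StringIndexer

# StringIndexer assigns indices based on label frequency (the most frequent label gets 0.0),
# so I cannot force "Y" -> 1 and "N" -> 0 explicitly
indexer = StringIndexer(inputCol="Col1", outputCol="Col1_indexed")
df_indexed = indexer.fit(df).transform(df)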
The equivalent in pandas is:
df["Col1"] = df["Col1"].map({"Y": 1, "N": 0})
Can you please help me with how to achieve this in PySpark?
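The closest I have come up with is a chain of when conditions, but I am not sure whether this is the idiomatic approach or how it would scale to a larger mapping dictionary:

from pyspark.sql import functions as F

# Replace Col1 in place: "Y" -> 1, "N" -> 0, anything else becomes null
df = df.withColumn(
    "Col1",
    F.when(F.col("Col1") == "Y", 1).when(F.col("Col1") == "N", 0)
)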