I have some experience with PySpark. Our team is migrating a Spark project from Python to C# (.NET for Apache Spark), and I've run into a problem.
Suppose we have a Spark DataFrame df with an existing column col1.
In PySpark, I can do something like:
from pyspark.sql.functions import when, lit
df = df.withColumn('new_col_name', when((df.col1 <= 5), lit('Group A')) \
    .when((df.col1 > 5) & (df.col1 <= 8), lit('Group B')) \
    .when((df.col1 > 8), lit('Group C')))
How do I do the equivalent in C#?
I've tried several approaches, but I keep getting exceptions when using the When() method. For example, the following code throws an exception:
df = df.WithColumn("new_col_name", df.Col("col1").When(df.Col("col1").EqualTo(3), Functions.Lit("Group A")));
Exception:
[MD2V4P4C] [Error] [JvmBridge] java.lang.IllegalArgumentException: when() can only be applied on a Column previously generated by when() function
I searched around but couldn't find many examples for .NET for Apache Spark. Any help would be much appreciated.
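Update: reading the exception message again, it seems the chain has to start with the static Functions.When rather than with When() called on df.Col("col1"), mirroring how the PySpark version starts with the free function when(). Below is a minimal sketch of that idea, not a confirmed fix. It assumes the Microsoft.Spark NuGet package and submission through spark-submit; the app name and the toy DataFrame built with spark.Range are made up for illustration.

```csharp
using Microsoft.Spark.Sql;

class Program
{
    static void Main()
    {
        // Hypothetical app name, used only for this sketch.
        SparkSession spark = SparkSession.Builder().AppName("when-example").GetOrCreate();

        // Toy data for illustration only: a single column col1 holding 0..11.
        DataFrame df = spark.Range(0, 12).WithColumnRenamed("id", "col1");

        Column col1 = df.Col("col1");

        // Start the chain with the static Functions.When,
        // then extend it with Column.When on the returned Column.
        Column group = Functions.When(col1.Leq(5), Functions.Lit("Group A"))
            .When(col1.Gt(5).And(col1.Leq(8)), Functions.Lit("Group B"))
            .When(col1.Gt(8), Functions.Lit("Group C"));

        df = df.WithColumn("new_col_name", group);
        df.Show();

        spark.Stop();
    }
}
```

As in PySpark, rows that match none of the conditions end up as null unless the chain is closed with .Otherwise(...).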