
I have a DataFrame with multiple columns, one of which is shown below. I would like to apply a conditional rule to it: if Count > 0.0 the value should be 1.0, otherwise 0.0, with the result stored in a new column named label. How can I do this in PySpark?


Count
-------
  0
  1
  2
  3
  0
  100

Answer

I managed to solve this as follows:

from pyspark.sql.functions import col, when

# 1.0 where Count is positive, 0.0 otherwise (a null Count also falls through to 0.0)
cond1 = col("Count") > 0.0
df = df.withColumn("label", when(cond1, 1.0).otherwise(0.0))
thetna
