I have a DataFrame with two columns (text, useful). I would like to replace every value in the "useful" column that is greater than 20 with 1, and set all other values to 0. I need some help with this. I'm using Scala in Databricks Community Edition.

+--------------------+------+
|                text|useful|
+--------------------+------+
|Located in a base...|     0|
|I am not a vegeta...|     1|
|There is so many ...|    12|
|Disclaimer: this ...|     0|
|House Special Chi...|     0|
|The food at Chez ...|     2|
|Overall not bad. ...|     3|
+--------------------+------+


df

Ramesh Maharjan
  • Just can't figure this out. `import org.apache.spark.sql.functions._` `val uDf = Df.withColumn("useful", regexp_replace(col("useful") > 20) => 1)` – Michael Mckenzie Jul 27 '18 at 11:51
  • I had tried that but keep getting the error `error: too many arguments for method withColumn: (colName: String, col: org.apache.spark.sql.Column)org.apache.spark.sql.DataFrame` for `val uDf = DF.withColumn("label", when(col("label") > 20), 1).otherwise(0)` – Michael Mckenzie Jul 27 '18 at 13:34
  • I misplaced one bracket so the correct one is `val uDf = Df.withColumn("useful", when((col("useful") > 20), 1).otherwise(0))` – Ramesh Maharjan Jul 27 '18 at 13:40
  • That's what I was missing. Thanks a lot – Michael Mckenzie Jul 27 '18 at 13:45
  • Possible duplicate of [Scala: How can I replace value in Dataframes using scala](https://stackoverflow.com/questions/32357774/scala-how-can-i-replace-value-in-dataframes-using-scala) – Ramesh Maharjan Jul 27 '18 at 13:49
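
For anyone landing here, the corrected expression from the comments can be tried as a small notebook-style sketch. It assumes Spark is on the classpath and is meant for a Databricks notebook or `spark-shell`, where top-level statements are allowed. The sample rows, including the two with `useful` values of 20 and 25, are made up for illustration and are not from the original data. Note that `> 20` is strict, so a value of exactly 20 maps to 0.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, when}

// In Databricks a session named `spark` already exists; getOrCreate() reuses it.
// The local master below is an assumption for running outside Databricks.
val spark = SparkSession.builder()
  .appName("binarize-useful")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data shaped like the question's (text, useful) columns.
val df = Seq(
  ("Located in a base...", 0),
  ("There is so many ...", 12),
  ("hypothetical review at the threshold", 20),
  ("hypothetical review above the threshold", 25)
).toDF("text", "useful")

// when(condition, value) takes both arguments inside its own parentheses;
// otherwise(...) is then called on the Column that when(...) returns.
val uDf = df.withColumn("useful", when(col("useful") > 20, 1).otherwise(0))

uDf.show(truncate = false)
```

The "too many arguments for method withColumn" error in the comments came from closing `when(` too early, which passed `1` to `withColumn` as a third argument instead of as the second argument of `when`.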
