
I would like to modify the column values as shown below. If the column value is "BALL-KG", I want to change it to "BALL"; otherwise, the column value should remain the same.

Input columns:

Name      Product
John      PIPE
Hema      BALL-KG
Basha     BALL-KG
Hari      BALL
Bijju     BAG

Output:

Name      Product
John      PIPE
Hema      BALL
Basha     BALL
Hari      BALL
Bijju     BAG

Thanks.

Lilly
  • Related to [Pyspark replace strings in Spark dataframe column](https://stackoverflow.com/questions/37038014/pyspark-replace-strings-in-spark-dataframe-column) and [replace values of one column in a spark df by dictionary key-values (pyspark)](https://stackoverflow.com/questions/44776283/replace-values-of-one-column-in-a-spark-df-by-dictionary-key-values-pyspark) – anky May 12 '20 at 06:56
  • A simple when/otherwise clause would suffice. I think this question lacks research, but here you go: `df.withColumn("Product", F.when(F.col("Product")=="BALL-KG",F.lit("BALL")).otherwise(F.col("Product")))` – murtihash May 12 '20 at 07:00

1 Answer


Try this, assuming df is your input DataFrame:

import pyspark.sql.functions as F

df=df.select(F.col("Name"), F.when(F.col("Product")==F.lit("BALL-KG"), F.lit("BALL")).otherwise(F.col("Product")).alias("Product"))

The general pattern is `F.when(<condition>, <option_if_met>).otherwise(<if_not_met>)`; it is the if/else syntax in PySpark.

Grzegorz Skibinski