
How to convert a column value in Spark dataframe to lowercase/uppercase in Java?

For example, below is the input dataframe:

name  | country | src        | city        | debit
---------------------------------------------------
"foo" | "NZ"    | salary     | "Auckland"  | 15.0
"bar" | "Aus"   | investment | "Melbourne" | 12.5

I need to convert the 'city' column to lower case:

name  | country | src        | city        | debit
---------------------------------------------------
"foo" | "NZ"    | salary     | "auckland"  | 15.0
"bar" | "Aus"   | investment | "melbourne" | 12.5

I have found solutions in Scala and Python, but not in Java, for example:

How to change case of whole column to lowercase?

In Java there is a solution for converting the column names, but not the data:

How to lower the case of column names of a data frame but not its values?

How can I convert column values to lowercase?

Monu
  • The same function is also available in Java. I don't understand exactly which difficulties you're facing. – blackbishop Feb 21 '23 at 14:53
  • How can I use a Java stream to convert the data to lower case instead of the column? – Monu Feb 21 '23 at 15:09
  • Why do you want to use a stream if you only need to lowercase the column `city`? – blackbishop Feb 21 '23 at 15:15
  • I am not sure how I can change the column to lower case without iterating. – Monu Feb 21 '23 at 15:18
  • `import static org.apache.spark.sql.functions.lower;`, then update the column in your dataframe: `df.withColumn("city", lower(df.col("city")))` – blackbishop Feb 21 '23 at 15:21
  • Oh okay, I was trying to import the function as below, which was giving an error: `import org.apache.spark.sql.functions.lower;` – Monu Feb 21 '23 at 15:26

1 Answer

In case someone is looking for an answer, below is the solution as suggested by @blackbishop:

import static org.apache.spark.sql.functions.lower;

// replace the existing 'city' column with its lower-cased values
df = df.withColumn("city", lower(df.col("city")));
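For completeness, here is a minimal self-contained sketch of the same approach (the class name, app name, and local master are illustrative, not from the question). It builds the example dataframe from the question, lower-cases the `city` column with `functions.lower`, and shows that `functions.upper` works the same way if you need upper case instead:

```java
import static org.apache.spark.sql.functions.lower;
import static org.apache.spark.sql.functions.upper;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class LowercaseColumnExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("lowercase-column")
                .master("local[*]")
                .getOrCreate();

        // Build a small dataframe mirroring the one in the question.
        StructType schema = new StructType()
                .add("name", DataTypes.StringType)
                .add("country", DataTypes.StringType)
                .add("src", DataTypes.StringType)
                .add("city", DataTypes.StringType)
                .add("debit", DataTypes.DoubleType);

        List<Row> rows = Arrays.asList(
                RowFactory.create("foo", "NZ", "salary", "Auckland", 15.0),
                RowFactory.create("bar", "Aus", "investment", "Melbourne", 12.5));

        Dataset<Row> df = spark.createDataFrame(rows, schema);

        // Replace the 'city' column with its lower-cased version.
        Dataset<Row> lowered = df.withColumn("city", lower(df.col("city")));
        lowered.show();

        // The same pattern works for upper case via functions.upper.
        Dataset<Row> uppered = df.withColumn("city", upper(df.col("city")));
        uppered.show();

        spark.stop();
    }
}
```

Note that the import must be a `static` import, since `lower` is a static method on `org.apache.spark.sql.functions`; a plain `import org.apache.spark.sql.functions.lower;` will not compile, which matches the error mentioned in the comments.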
Monu