
I am reading data from HBase and converting it to a DataFrame. One column in the DataFrame has a string datatype, but I need to convert it to Int.

I tried the code below, but it throws an error:

df.withColumn("order", 'order.cast(int)')

The error I get is:

error:col should be column

I have given the correct column name. Do I need to change the syntax of the code above in PySpark?

Ahito

1 Answer


Either:

df.withColumn("order", df.order.cast("int"))

or

from pyspark.sql.functions import expr

df.withColumn("order", expr("CAST(order AS INTEGER)"))
  • I get this error: notebook:1: error: value cast is not a member of org.apache.spark.sql.DataFrame – Sade Oct 16 '18 at 20:33