I have a simple use case of performing join on two dataframes, I am using spark 1.6.3 version. The issue is that while trying to cast the string type to integer type using cast method the resulting column is all null values.
I have already tried all solutions mentioned here How to cast a column in dataframe? but all questions have answers for scala api and I could not find anyone that works with java api.
DataFrame dataFromDB = getDataFromDB("(select * from schema.table where
col1 is not null)"); //This method uses spark sql
//to connect to a db2 data base and get the data
//I perform the cast operation as
dataFromDB.withColumn("INCOME_DATA", dataFromDB.col("INCOME_DATA")
.cast(DataTypes.IntegerType));
//but the above results in null values
//other things I tried based on the link above is below
dataFromDB.selectExpr(cast("INCOME_DATA" as integer")) //this too produces null values
//I tried to remove the whitespaces from income data column with no success
dataFromDB.select(dataFromDB.col("INCOME_DATA").toString().replaceAll("\\s+", ""); //this does not remove any whitespace
I am unable to find the solution for it, also the column I am trying to convert is of String type and may contain trailing whitespaces, could this be an issue? if yes then how can i remove them, I tried to remove them as below but doesnot seem to work. This is my first ever working with spark dataframe so any help with this is deeply appreciated. Thanks!