I'm trying to create a column of tuples based on two other columns in a Spark DataFrame.
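from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # reuses the existing session if there is one
data = [('A', 4, 5),
        ('B', 6, 9)]
columns = ["id", "val1", "val2"]
sdf = spark.createDataFrame(data=data, schema=columns)
# Combine val1 and val2 into a single struct column
sdf.withColumn('values', F.struct(F.col('val1'), F.col('val2'))).show()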
In the output, the values column is displayed as {4,5} and {6,9}. I need it to contain tuples instead: (4,5) and (6,9). Does anyone know what I did wrong? Thanks a lot.