I have a table with N columns, and I want to concatenate them all into a single string column and then compute a hash of that column. I have found a similar question answered in Scala.
Ideally I want to do this entirely in Spark SQL. I have tried HASH(*) AS myhashcolumn, but because several of the columns are sometimes null, I can't make this work as I would expect.
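For concreteness, this is the kind of query I have been trying (the table name here is just a placeholder):

```sql
-- Attempted approach: hash all columns in one expression.
-- When some of the columns are NULL, the result is not what I expect.
SELECT hash(*) AS myhashcolumn
FROM my_table
```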
If I have to create and register a UDF to make this happen, it needs to be in Python rather than Scala, since all my other code is in Python.
Any ideas?