I am facing a challenge that's driving me crazy after two hours of trial and error...
I need to hash at least two columns of a relational table with presto (actually with Amazon Athena which uses the presto engine).
My current state is this:
SELECT concat(lower(to_hex(md5(to_utf8(trim(column1))))),
lower(to_hex(md5(to_utf8(trim(column2)))))) AS HK
FROM table
limit 10
The issue with this is, it hashes the columns first and afterwards concatenates them what ends up looking like this, because it concatenates the hashes:
8f9bfe9d1345237cb3b2b205864da075ce8ae9da5b7cd6c3df2929543a9af92d
Instead of concatenating the strings first and then hashing it what should end up like this:
8f9bfe9d1345237cb3b2b205864da075
I'd be very grateful for any ideas on how to solve this problem.
EDIT:
I got a solution, but only without using the trim() function, with it the query does not work and results in an error message
INVALID_FUNCTION_ARGUMENT: There must be two or more concatenation arguments
Current query:
SELECT lower(to_hex(md5(to_utf8(concat(user, email))))) AS UserMailHK FROM table limit 10
Any ideas on this?