I need to hash certain columns (like email) while copying MySQL tables to HDFS using Sqoop.
- Is there a built-in option in Sqoop?
- If not, how can this be achieved?
EDIT-1
Currently, the only approach I can think of is a crude one: passing a SQL query (instead of a table name) to Sqoop, like the following:
SELECT
`name`,
SHA1(`email`) AS `email`,
`dob`
FROM
`my_db`.`users`
- Not sure if this would work at all [will update once I've tried]
- Even if it works, it would (most probably) require generating a SQL query specific to the underlying DB (MySQL, PostgreSQL, etc.)
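For illustration, here is a minimal sketch of how the query above might be handed to Sqoop, assuming Sqoop 1's free-form query import (`--query`, which expects a `$CONDITIONS` placeholder in the `WHERE` clause and an explicit `--target-dir`). The host, user, password file and target directory are hypothetical placeholders, and the `RUN_IMPORT` guard is only there so the script can be dry-run without Sqoop installed. I have not verified this against a real cluster.

```shell
#!/bin/sh
# Free-form query with the hashed column. Single quotes keep the shell from
# expanding $CONDITIONS (Sqoop substitutes its own predicate there) and from
# treating the backticks as command substitution.
QUERY='SELECT `name`, SHA1(`email`) AS `email`, `dob` FROM `my_db`.`users` WHERE $CONDITIONS'

# Hypothetical connection details and paths; replace with real values.
JDBC_URL='jdbc:mysql://db-host:3306/my_db'
DB_USER='etl_user'
PASSWORD_FILE='/user/etl/.mysql-password'
TARGET_DIR='/data/my_db/users_hashed'

run_import() {
  # -m 1 uses a single mapper, so no --split-by column has to be chosen.
  sqoop import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" \
    --password-file "$PASSWORD_FILE" \
    --query "$QUERY" \
    --target-dir "$TARGET_DIR" \
    -m 1
}

# Run the import only when explicitly requested and Sqoop is on the PATH;
# otherwise just print the query that would be sent.
if [ "${RUN_IMPORT:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
  run_import
else
  printf '%s\n' "$QUERY"
fi
```

This does not address the portability concern: `SHA1()` is a MySQL function, so the query text would still need to change for another database.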