0

I am trying to find the SQL equivalent of hash in bigquery.

SQL :

 SELECT    CAST(HASHBYTES('SHA2_256', CONCAT(
                                            COL1, COL2, COL3
                                        )) AS BINARY(32)) AS HashValue

Big Query:

SELECT  SHA2_256(CONCAT(COL1, '', COL2 )) AS HashValue.

I can't find any examples where hashing is done on multiple columns. The datatype of the columns are different as well.

Any help is really appreciated.

Ben P
  • 3,267
  • 4
  • 26
  • 53
Aced
  • 89
  • 1
  • 2
  • 8

2 Answers2

1

Using Standard SQL (SHA256 function) you could cast all your fields to string, concatenate them and use the hash. Something like this:

SELECT SHA256( 
              CONCAT(
                  CAST(integer_field1 as STRING), 
                  CAST(integer_field2 as STRING), 
                  CAST(timestamp_field as STRING) 
                  )
         ) as sha256_hash FROM `table`
Javier Montón
  • 4,601
  • 3
  • 21
  • 29
1

You can see follow this change request

These are now implemented. Thanks again for sharing feedback on needing these ?> functions. Please see:

TO_HEX: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#to_hex

FROM_HEX: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#from_hex

2 related questions I found for you are:
Is it possible to hash using MD5 in BigQuery?
Random Sampling in Google BigQuery

Community
  • 1
  • 1
Tamir Klein
  • 3,514
  • 1
  • 20
  • 38