4

I have a code in sql which I am using. Not much familiar with postgresql. Below is the code which I want to convert to postgresql. I'm using dbeaver 3.5.4

Update tablename
set record_hash = cast(hashbytes('MD5',
                  coalesce(id, '') +
                  coalesce(name, '') +
                  coalesce(created_date, '') +
                  coalesce(last_modified_date, '')
                  ) as bigint) 
;
Ramesh
  • 251
  • 2
  • 4
  • 12
  • Postgres is also using SQL. –  Apr 26 '16 at 20:54
  • 1
    Dbeaver is a SQL client for several databases that uses Eclipse IDE as it's base. What database is it connected to ? You can see this in Database Navigation view, then click "Edit Connection". It's on top of the window. – Nelson Teixeira Apr 26 '16 at 21:06
  • 1
    Just to clarify: did you want to store 16-bytes md5 hash into the 8-bytes `bigint` field? – Abelisto Apr 26 '16 at 21:07
  • never mind I found it here to be SQL Server – Nelson Teixeira Apr 26 '16 at 21:12
  • Nelson Teixeira : I'm using amazon redshift for this. Abelisto: Would like to store 16-bytes md5 hash – Ramesh Apr 27 '16 at 12:47
  • Keep in mind that the almost all solutions below stores only 8 highest bytes of the 16 bytes of the hash, so, it is at list confusing. IMO the best solution is to store hash as `uuid` type which is exactly 16 bytes: `md5()::uuid` Or, another way, use `bit(128)` type. – Abelisto Apr 28 '16 at 00:05

4 Answers4

1

You can do it like this:

Update tablename
set record_hash = ('x'|| substr(
                            md5(
                              coalesce(id, '') ||
                              coalesce(name, '') ||
                              coalesce(created_date, '') ||
                              coalesce(last_modified_date, '')
                            ),1,16)::bit(64)::bigint )

Found here how to do the hash => bigint conversion.

Community
  • 1
  • 1
Nelson Teixeira
  • 6,297
  • 5
  • 36
  • 73
  • Thanks a lot for the responses. I used this query and I get DBCException: SQL Error [42846]: ERROR: cannot cast type text to bit .. Appears to be a casting issue .. record_hash is set to bpchar ..Not sure what datatype should I change for this – Ramesh Apr 27 '16 at 12:44
  • what's the type of record_hash field? – Nelson Teixeira Apr 27 '16 at 13:03
  • Right now, I changed it to numeric, but same error. – Ramesh Apr 27 '16 at 13:06
  • try changing it to bigint as it's the type it's being cast to. :) – Nelson Teixeira Apr 27 '16 at 13:12
  • Yeah. changed it to bigint. Unfortunately same error. Other fields in the table are varchar. – Ramesh Apr 27 '16 at 13:19
  • then we'll have to determine if it's a problem in attributing to record_hash field or in the right part. Substitute all before ('x' by a simple select to see if the error persists. I mean: select ('x'||... – Nelson Teixeira Apr 27 '16 at 13:34
  • I did a select clause for it. And it says that the first column 'id' does not exist. In order to correct it, I tried it like below, and we are back to casting issue. select ('x' || lpad(md5(coalesce('id', '') || coalesce('name', '') || coalesce('created_date', '') || coalesce('last_modified_date', '') ), 16, '0' ) )::bit(64)::bigint; – Ramesh Apr 27 '16 at 13:40
  • it's not about being substr or lpad. Both return same type. So now remove, ::bigint, then ::bigint(64), then one field at a time until you find the part that's causing the problem. Then remove this part and put the parts you removed before finding where the problem is, until you know exactly what part of the query causes the error. Then tell me what part is. – Nelson Teixeira Apr 27 '16 at 13:44
  • okay. I did remove bigint and bit(64). Error says record_hash is of type bigint but expression is of type text. Will need to rewrite or cast the expression. – Ramesh Apr 27 '16 at 14:04
  • If I do this I'm able to get the result select ('x' || lpad(md5(coalesce('id', '') || coalesce('name', '') || coalesce('created_date', '') || coalesce('last_modified_date', '') ), 16, '0' ) ); – Ramesh Apr 27 '16 at 14:10
  • WIth the result being x2406b413fbf35fd2 – Ramesh Apr 27 '16 at 14:11
  • 1
    Got it working. If I remove bit64, update statement is working. Thanks for your patience. You are very helpful!! – Ramesh Apr 27 '16 at 14:15
  • Most software development problems can be solved by using this method. You remove part by part until you pinpoint the part causing trouble. Normally at that point the problem becomes obvious. – Nelson Teixeira Apr 27 '16 at 15:39
  • If I use the same syntax for another table, I am getting syntax error at or near coalesce. That's weird. I did not change the syntax. – Ramesh Apr 27 '16 at 20:27
  • Syntax is like this : – Ramesh Apr 27 '16 at 20:31
  • Update tablename set record_hash = ('x' || lpad(md5(coalesce(id, '') || coalesce(name, '') || coalesce(type, '') ), 16, '0'))::bigint; – Ramesh Apr 27 '16 at 20:31
  • same thing, check nulls, substitute by select, remove field by field... pinpoint the problem. – Nelson Teixeira Apr 27 '16 at 20:43
  • If I want to use timestamp , would the syntax change? It would error as – Ramesh Apr 29 '16 at 14:23
  • invalid input syntax for type timestamp: "" – Ramesh Apr 29 '16 at 14:23
  • coalesce(last_modified_date, TIMESTAMP '-infinity') – Ramesh Apr 29 '16 at 14:50
  • @Ramesh my advice is not mixing things. A hash converted to a bigint for indexing is OK. Ints are faster to lookup than strings. But if you change it to timestamp that may confuse someone that tries to understand your project in the future. Even if it's a project that only you will maintain, don't trust your memory about it. Sometime in the future you can comeback to this and think: why the heck is this timestamp here ? Better to do things in a natural self-explanatory way. The fist thing to always keep in mind is the KISS principle "Keep it simple *student*" ;D – Nelson Teixeira Apr 29 '16 at 15:15
  • Thanks for your valuable suggestion Nelson! – Ramesh Apr 29 '16 at 20:31
1

I assume that this hashbyte() thing generates a hash value.

To create a md5 checksum in Postgres you can use the md5() function, e.g.:

md5(concat(id::text, name, created_date::text, last_modified_date::date))

concat() will take care of null values automatically, no need for coalesce()

Unfortunately there is no direct conversion from a hex value to an integer in Postgres

0

Postgres has MD5 as a built-in function:

Update tablename
    set record_hash = ('x' || lpad(md5(coalesce(id, '') ||
                                       coalesce(name, '') ||
                                       coalesce(created_date, '') ||
                                       coalesce(last_modified_date, '')
                                      ), 16, '0'
                                  )
                      )::bit(64)::bigint;

For the conversion back to bigint, give credit where credit is due. Erwin Brandstetter's answers are usually very thorough so I would expect it to work well.

Community
  • 1
  • 1
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • isn't lpad missing 2nd argument ? – Nelson Teixeira Apr 26 '16 at 21:24
  • Thanks a lot for the responses. I used this query and I get DBCException: SQL Error [42846]: ERROR: cannot cast type text to bit .. Appears to be a casting issue .. record_hash is set to bpchar ..Not sure what datatype should I change for this. – Ramesh Apr 27 '16 at 12:43
0

Just as info, here is several approaches how to represet 16-bytes data without data loss:

with t(x) as (values(md5('abc123')))
select
  t.x, -- varchar(32), char(32)
  t.x::uuid, -- uuid
  ('x'||t.x)::bit(128), -- exactly bit(128)
  ('\x'||t.x)::bytea, -- bytea
  array[('x'||left(t.x,16))::bit(64)::bigint, ('x'||right(t.x,16))::bit(64)::bigint] -- bigint[]
  -- ... and so on
from t;
Abelisto
  • 14,826
  • 2
  • 33
  • 41