PostgreSQL Varchar UID to Int UID while preserving uniqueness

Question

Say I have a unique column of VarChar(32).

ex. 13bfa574e23848b68f1b7b5ff6d794e1.

I want to preserve the uniqueness of this while converting the column to int. I figure I can convert all of the letters to their ascii equivalent, while retaining the numbers and character position. To do this, I will use the translate function.

psuedo code: select translate(uid, '[^0-9]', ascii('[^0-9]'))

My issue is finding all of the letters in the VarChar column originally. I've tried

select uid, substring(uid from '[^0-9]') from test_table;

But it only returns the first letter it encounters. Using the above example, I would be looking for bfaebfbbffde

Any help is appreciated!

You can't map VarChar(32) value to int preserving the uniqueness. If you replace all letters with ascii equivalent, you will get number with at least 32 digits, while max integer is 2,147,483,647. — CodeFuller, Dec 03 '17 at 06:13
Postgres has special [uuid type.](https://www.postgresql.org/docs/current/static/datatype-uuid.html) — klin, Dec 03 '17 at 06:31
Can I use bigint? The reason I need it int is because I am using the pgRouting extension and some of the functions require an int as input — james, Dec 04 '17 at 01:41

score 0 · Answer 1 · answered Dec 03 '17 at 07:49

create one function which replace charter with blank which you not need in string,

CREATE FUNCTION replace_char(v_string VARCHAR(32) CHARSET utf8) RETURNS VARCHAR(32)
     DETERMINISTIC
 BEGIN

   DECLARE v_return_string VARCHAR(32) DEFAULT '';
   DECLARE v_remove_char VARCHAR(200) DEFAULT '1,2,3,4,5,6,7,8,9,0';
   DECLARE v_length, j INT(3) DEFAULT 0;

   SET v_length = LENGTH(v_string);

   WHILE(j < v_length) DO
    IF ( FIND_IN_SET( SUBSTR(v_string, (j+1), 1), v_remove_char ) = 0) THEN
        SET v_return_string = CONCAT(v_return_string, SUBSTR(v_string, (j+1), 1) );
    END IF;
    SET j = j+1;
   END WHILE;

   RETURN v_return_string;

END$$

DELIMITER ;

Now you just nee to call this function in query

select uid, replace_char(uid) from test_table;

It will give you string what you need (bfaebfbbffde)

If you want to int number only i.e 13574238486817567941 then change value of variable, and also column datatype in decimal(50,0), decimal can stored large number and there is 0 decimal point so it will store int value as decimal.

v_remove_char  = 'a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z';

score 0 · Accepted Answer · answered Dec 03 '17 at 22:37

First off, I agree with the two commenters who said you should use a UID datatype.

That aside...

Your UID looks like a traditional one, in that it's not alphanumeric, it's hex. If this is the case, you can convert the hex to the numeric value using this solution:

PostgreSQL: convert hex string of a very large number to a NUMERIC

Notice the accepted solution (mine, shame) is not as good as the other solution listed, as mine will not work for hex values this large.

That said, yikes, what a huge number. Holy smokes.

Depending on how many records are in your table and the frequency of insert/update, I would consider a radically different approach. In a nutshell, I would create another column to store your numeric ID whose value would be determined by a sequence.

If you really want to make it bulletproof, you can also create a cross-reference table to store the relationships that would

Reuse an ID if it ever repeated (I know UIDs don't, but this would cover cases where a record is deleted by mistake, re-appears, and you want to retain the original id)
If UIDs repeat (like this is a child table with multiple records per UID), it would cover that case as well

If neither of these apply, you could dumb it down quite a bit.

The solution would look something like this:

Add an ID column that will be your numeric equivalent to the UID:

alter table test_table
add column id bigint

Create a sequence:

CREATE SEQUENCE test_id

create a cross-reference table (again, not necessary for the dumbed down version):

create table test_id_xref (
   uid varchar(32) not null,
   id bigint not null,
   constraint test_id_xref_pk primary key (uid)
)

Then do a one-time update to assign a surrogate ID to each UID for both the cross-reference and actual tables:

insert into test_id_xref
with uids as (
  select distinct uid
  from test_table
)
select uid, nextval ('test_id')
from uids;

update test_table tt
set id = x.id
from test_id_xref x
where tt.uid = x.uid;

And finally, for all future inserts, create a trigger to assign the next value:

CREATE OR REPLACE FUNCTION test_table_insert_trigger()
  RETURNS trigger AS
$BODY$
BEGIN
  select t.id
  from test_id_xref t
  into NEW.id
  where t.uid = NEW.uid;

  if NEW.id is null then
    NEW.id := nextval('test_id');
    insert into test_id_xref values (NEW.uid, NEW.id);
  end if;

  return NEW;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;

CREATE TRIGGER insert_test_table_trigger
  BEFORE INSERT
  ON test_table
  FOR EACH ROW
  EXECUTE PROCEDURE test_table_insert_trigger();

Thanks for your help. I went back and looked through the source of the data and it is indeed based of of UUID originally. I used the function listed in your link (not the accepted one) to transform it to an integer. Works great! — james, Dec 04 '17 at 01:54

PostgreSQL Varchar UID to Int UID while preserving uniqueness

2 Answers2