0

I have a table in Postgres that contains columns for a userid and an external id that will be used in API calls to an external service. I have to create the external id on my side, validate that is unique, and store it in PG before calling the external API. This article comes close to what I want: How can I generate a unique string per record in a table in Postgres? However, there's the possibility of conflict if two concurrent calls generate the same id. What I would like to do is have a loop that generates a random string, then attempts to insert that with the user id into the table. It should fail if the random string already exists (there is a unique constraint on the column). If it fails, it should generate another id and attempt to insert that (once I get working code I will add a counter to prevent hammering the db).

How would you go about writing that loop? The loop should continue if the INSERT returns an error (constraint check), otherwise loop again. I've checked the Postgres docs and can't seem to find (or am missing) the way to check the error code/status on a query.

UPDATE

I've come up with a possible solution, but need to flesh it out. The following is in pidgeon-sql, just me thinking around the problem:

success = true;
LOOP
-- create random string function
BEGIN
  insert string
EXCEPTION
  success = false;
EXIT WHEN success;
END;
JEPrice
  • 627
  • 7
  • 10
  • 1
    Would the [UUID](https://www.postgresql.org/docs/current/uuid-ossp.html) extension work? – Adrian Klaver Sep 24 '20 at 17:47
  • unfortunately the external API does not use UUID – JEPrice Sep 24 '20 at 18:18
  • Yeah but: `select uuid_generate_v1mc()::varchar; 3c464166-fe95-11ea-87f2-0fd30065b715 select pg_typeof(uuid_generate_v1mc()::varchar); character varying` – Adrian Klaver Sep 24 '20 at 18:41
  • @JEPrice: then convert the uuid to a string –  Sep 24 '20 at 18:47
  • it's too long. Not sure why the external api dev went this way, but it's 10 chars max :/ – JEPrice Sep 24 '20 at 18:52
  • I should ask one question: Are external IDs the only thing someone needs to access the external API (even without being logged in or authorized in some way)? Many issues with using non-random unique IDs go away if only one authorized user can use a given external ID to access the external API. (Also, look up the "insecure direct object references problem".) See also my note on [unique random identifiers](https://peteroupc.github.io/random.html#Unique_Random_Identifiers). – Peter O. Sep 25 '20 at 00:11
  • no, absolutely not, login is required. – JEPrice Sep 26 '20 at 14:04

2 Answers2

0

if no need of randomness of external ID, then

CREATE SEQUENCE base_seq;
ALTER TABLE thetable
    ALTER COLUMN ext_id SET DEFAULT LPAD(nextval('base_seq')::text, 64, '0');

will give strongly unique (databasewide) strings in ext_id column

but if your only choice is to try-in-a-loop, then the loop in plpgsql function will be like this:

LOOP
  new_try_ext_id := some randomization magic here...
  INSERT INTO thetable(userid,ext_id)
    VALUES (someid, new_try_ext_id)
  ON CONFLICT DO NOTHING;
  GET DIAGNOSTICS some_integer_var = ROW_COUNT;
  EXIT WHEN some_integer_var > 0;
END LOOP;
  • this will be used in an externally facing application, so sequences aren't something that I would care to use – JEPrice Sep 24 '20 at 18:51
  • hmmm, that looks better than my pidgeon-sql above. I need to spend some time and wrap this up in a function with a random string generator, make a few other additions, and it might be what I need... – JEPrice Sep 24 '20 at 19:33
0

Revised: Your security worry on using sequences may have some validity although I do not recall that ever coming up even in a security audit. But if it is a business requirement then you must go with it. Seems to me you need to handle key collision for multiple tables, so a generalized function generalized generating seems appropriate with a specific insert function for each table. You will need to write the insert functions for each table and cannot just use an insert statement, you must use a function (or procedure if you are using Postgres V12 or higher). You will also have to pass each column as a parameter to the insert function. The following basically 'flesh out' your pseudo-code.

create or replace function generate_random_id
                    ( lower_value_in bigint default 1  
                    , upper_value_in bigint default 10000000000)
                                                     
   returns bigint
  language sql
  volatile strict 
as $$
    select floor(random()*(upper_value_in-lower_value_in+1)+1)::bigint ;
$$; 
 
create or replace function insert_atable(col_x_in atable.colx%type)
   returns void 
  language plpgsql 
as $$
declare
    l_invalid_id boolean := true;
begin 
    while l_invalid_id
    loop
       begin
           insert into atable( id, colx)
             values ( generate_random_id(),col_x_in); 
           l_invalid_id := false;
       exception 
          when unique_violation then null;         
       end;
    end loop;
end;
$$;   

Revised demo.

Of course you can abandon the idea or the 2 ids actually being the same.

Original: So the external facing id has to be unique, but why random. Generate the id from a sequence then, limit that sequence max value as 9999999999. Then cast the generated sequence as text and store that result. That way the the internal and external ids are both unique but have the same value (at least when external cast type as id). Even better, if you have Postgres 12 of higher, you can define the external id as a generated column on id and thereby guarantee they always the same. Table definition becomes something like:

create table atable
             ( id integer  generated always as identity (maxvalue 999999999)
             , ext_id text generated always as  (id::text) stored
             , colx text 
             ) ;

See demo. Note: Demo defines id as "generated by default". This is used for demonstration purposes only.

Belayer
  • 13,578
  • 2
  • 11
  • 22
  • random to guard against fishing. If the id is sequential, it would be pretty easy to guess an id, and learning one id would open up a range of ids for attack. – JEPrice Sep 24 '20 at 22:05