I would like to insert a record into a table and get its id: if the record is already present, get the existing id; otherwise run the insert and get the new record's id.
I will be inserting millions of records and have no idea how to do this efficiently. What I am doing now is running a SELECT to check whether the record is already present and, if not, inserting it and getting the inserted record's id. As the table grows, I imagine that SELECT is going to kill me.
My current code, in Python with psycopg2, looks like this:
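# Look the record up first.
select = ("SELECT id FROM ... WHERE ...", [...])
cur.execute(*select)
if not cur.rowcount:
    # Not found: insert it and let RETURNING hand back the new id.
    insert = ("INSERT INTO ... VALUES ... RETURNING id", [...])
    cur.execute(*insert)
# Fetch from whichever statement ran last (the SELECT or the INSERT).
rid = cur.fetchone()[0]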
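
To make the pattern concrete, here is a minimal self-contained sketch of the same select-then-insert logic. The table items, its name column, and the function name get_or_insert_id are hypothetical and only stand in for the elided parts of the snippet above; cur is assumed to be an open psycopg2 cursor.

```python
# Hypothetical schema, assumed for illustration only:
#   CREATE TABLE items (id serial PRIMARY KEY, name text UNIQUE NOT NULL);


def get_or_insert_id(cur, name):
    """Return the id of the row with this name, inserting it first if absent."""
    cur.execute("SELECT id FROM items WHERE name = %s", [name])
    row = cur.fetchone()
    if row is None:
        # No existing row: insert and let RETURNING hand back the new id.
        cur.execute("INSERT INTO items (name) VALUES (%s) RETURNING id", [name])
        row = cur.fetchone()
    return row[0]
```

This costs two round trips for every new record and one for every existing record.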
Would it be possible to do something like this in a stored procedure instead:
BEGIN
    EXECUTE sql_insert;
    RETURN id;
EXCEPTION WHEN unique_violation THEN
    -- return id of already existing record
    -- from the exception info?
END;
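
For comparison, the same insert-first idea can be sketched on the client side. This is only an illustration under assumptions: the table items and the function name insert_or_get_id are hypothetical, cur is a psycopg2 cursor with autocommit off (so a transaction is open), and the unique violation is recognized by its SQLSTATE 23505, which psycopg2 exposes as the pgcode attribute of the exception. The existing id is fetched with a follow-up SELECT rather than read from the exception.

```python
# Hypothetical schema, assumed for illustration only:
#   CREATE TABLE items (id serial PRIMARY KEY, name text UNIQUE NOT NULL);

UNIQUE_VIOLATION = "23505"  # PostgreSQL SQLSTATE for unique_violation


def insert_or_get_id(cur, name):
    """Try the INSERT first; on a unique violation, look up the existing id."""
    cur.execute("SAVEPOINT before_insert")
    try:
        cur.execute("INSERT INTO items (name) VALUES (%s) RETURNING id", [name])
        row_id = cur.fetchone()[0]
    except Exception as exc:
        # psycopg2 errors carry the SQLSTATE in .pgcode; re-raise anything else.
        if getattr(exc, "pgcode", None) != UNIQUE_VIOLATION:
            raise
        # The failed INSERT aborted the transaction; undo back to the savepoint.
        cur.execute("ROLLBACK TO SAVEPOINT before_insert")
        cur.execute("SELECT id FROM items WHERE name = %s", [name])
        row_id = cur.fetchone()[0]
    cur.execute("RELEASE SAVEPOINT before_insert")
    return row_id
```

The savepoint is needed because PostgreSQL rejects further statements in a transaction after an error until it is rolled back. Each call therefore issues extra statements, so whether this beats the SELECT-first version depends on how often the record already exists.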
Any ideas on how to optimize a case like this?