Function min(uuid) does not exist in postgresql

Question

I have imported tables from Postgres to hdfs by using sqoop. My table have uuid field as primary key and my command sqoop as below:

sqoop import --connect 'jdbc:postgresql://localhost:5432/mydb' --username postgreuser --password 123456abcA --driver org.postgresql.Driver --table users --map-column-java id=String --target-dir /hdfs/postgre/users --as-avrodatafile --compress -m 2

But I got the error:

Import failed: java.io.IOException: org.postgresql.util.PSQLException: ERROR: function min(uuid) does not exist

I tried executed the sql command: SELECT min(id) from users and got the same error. How could I fix it ? I use Postgres 9.4, hadoop 2.9.0 and sqoop 1.4.7

You can cast the `uuid` column to text: e.g.: `min(id::text)` — , Mar 12 '18 at 06:59
Thanks @a_horse_with_no_name for answer, but how could i cast the uuid in sqoop. I try --map-column-java id=String , but didn't work. — hazzy, Mar 12 '18 at 09:30

score 5 · Answer 1 · answered Aug 01 '18 at 20:02

5

I'd like to credit @robin-salih 's answer, I've used it and implementation of min for int, to build following code:

CREATE OR REPLACE FUNCTION min(uuid, uuid)
RETURNS uuid AS $$
BEGIN
    IF $2 IS NULL OR $1 > $2 THEN
        RETURN $2;
    END IF;

    RETURN $1;
END;
$$ LANGUAGE plpgsql;


create aggregate min(uuid) (
  sfunc = min,
  stype = uuid,
  combinefunc = min,
  parallel = safe,
  sortop = operator (<)
);

It almost the same, but takes advantages of B-tree index, so select min(id) from tbl works in few millis.

P.S. I'm not pgsql expert, perhaps my code is somehow wrong, double check before use in production, but I hope it uses indexes and parallel execution correctly. I've made it just from sample code, not digging into theory behind aggregates in PG.

answered Aug 01 '18 at 20:02

Bogdan Mart

460
8
19

I found this to be incorrect: https://stackoverflow.com/a/55268461/1489726 – Paymahn Moghadasian Mar 20 '19 at 19:08
Yeah, it seems to sometimes return NULL when it shouldn't – Paymahn Moghadasian Mar 20 '19 at 19:44
Oh, damn SQL's null threatment. Good you found out that. – Bogdan Mart Jan 26 '20 at 00:43

score 2 · Answer 2 · answered Jun 26 '18 at 10:26

Postgres doesn't have built-in function for min/max uuid, but you can create your own using the following code:

CREATE OR REPLACE FUNCTION min(uuid, uuid)
RETURNS uuid AS $$
BEGIN
    IF $2 IS NULL OR $1 > $2 THEN
        RETURN $2;
    END IF;

    RETURN $1;
END;
$$ LANGUAGE plpgsql;


CREATE AGGREGATE min(uuid)
(
    sfunc = min,
    stype = uuid
);

score 2 · Answer 3 · answered Mar 20 '19 at 19:10

I found the answer's provided by @robin-salih and @bodgan-mart to be a great starting point but ultimately incorrect. Here's a solution which worked better for me:

    CREATE FUNCTION min_uuid(uuid, uuid)
    RETURNS uuid AS $$
    BEGIN
        -- if they're both null, return null
        IF $2 IS NULL AND $1 IS NULL THEN
            RETURN NULL ;
        END IF;

        -- if just 1 is null, return the other
        IF $2 IS NULL THEN
            RETURN $1;
        END IF ;
        IF $1 IS NULL THEN
            RETURN $2;
          END IF;

        -- neither are null, return the smaller one
        IF $1 > $2 THEN
            RETURN $2;
        END IF;

        RETURN $1;
    END;
    $$ LANGUAGE plpgsql;


    create aggregate min(uuid) (
      sfunc = min_uuid,
      stype = uuid,
      combinefunc = min_uuid,
      parallel = safe,
      sortop = operator (<)
    );

For more details, see my post at How to select minimum UUID with left outer join?

scravy · Answer 4 · 2021-12-22T07:30:38.350

I am defining min/max aggregates for uuids using least/greatest which I believe should give the best performance as those are native to postgres (but I haven't benchmarked it).

Since least/greatest are special forms (to my understanding) I have to proxy them using a function which I am marking as immutable and parallel safe.

least/greatest already have proper null-handling behavior.

I am using these in production on Postgres 13.

create or replace function min(uuid, uuid)
    returns uuid
    immutable parallel safe
    language plpgsql as
$$
begin
    return least($1, $2);
end
$$;

create aggregate min(uuid) (
    sfunc = min,
    stype = uuid,
    combinefunc = min,
    parallel = safe,
    sortop = operator (<)
    );

create or replace function max(uuid, uuid)
    returns uuid
    immutable parallel safe
    language plpgsql as
$$
begin
    return greatest($1, $2);
end
$$;

create aggregate max(uuid) (
    sfunc = max,
    stype = uuid,
    combinefunc = max,
    parallel = safe,
    sortop = operator (>)
    );

Thanks for this great answer. Would be awesome if you could contribute that to PostgreSQL core! — Christian Beikov, Oct 05 '22 at 08:01

score -3 · Answer 5 · answered Mar 31 '18 at 00:02

This is not a issue with sqoop. Postgres doesn't allow min/max on uuid. Each uuid is unique and is not considered bigger/smaller than other.

To fix this in sqoop you might need to use some other field as the split-by key. I used created_At timestamp as my split-by key instead.

Function min(uuid) does not exist in postgresql

5 Answers5

Linked