2

Suppose I have a table:

A        B       C  
1        3       5  
1        3       7  
1        3       9  
2        4       3  
2        4       6  
2        4       1 

Here there are multiple rows for the same combination of A and B. For each combination, I want to get back only its first row. So for this table, the result I want is:

A        B      C  
1        3      5  
2        4      3 

How can I do this in PostgreSQL?

Erwin Brandstetter
user3311298
  • Please define "first" (there is no natural order in a table). And what is the primary key? [This related answer may be of help.](http://stackoverflow.com/questions/3800551/select-first-row-in-each-group-by-group/7630564#7630564) – Erwin Brandstetter Mar 12 '14 at 01:45

2 Answers

8

Assuming you can define "first" in terms of a sort on A, B, and C, you want DISTINCT ON for this.

SELECT
  DISTINCT ON ("A", "B")
  "A", "B", "C"
FROM Table1
ORDER BY "A", "B", "C";

E.g. http://sqlfiddle.com/#!15/9ca16/1

See SELECT for more on DISTINCT ON.
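
As a minimal sketch of what the query above returns, the script below loads the question's sample rows into a scratch table and runs the same DISTINCT ON pattern. The table name distinct_on_demo and the lowercase column names are assumptions made for illustration; they are not from the original post.

```sql
-- Hypothetical scratch table mirroring the question's sample data.
CREATE TEMP TABLE distinct_on_demo (a int, b int, c int);

INSERT INTO distinct_on_demo (a, b, c) VALUES
  (1, 3, 5), (1, 3, 7), (1, 3, 9),
  (2, 4, 3), (2, 4, 6), (2, 4, 1);

-- Keep one row per (a, b): the row that sorts first by c within its group.
SELECT DISTINCT ON (a, b) a, b, c
FROM distinct_on_demo
ORDER BY a, b, c;

--  a | b | c
-- ---+---+---
--  1 | 3 | 5
--  2 | 4 | 1
```

Note that the (2, 4) group comes back with c = 1, not the 3 shown in the question, because sorting by c makes "first" mean "smallest c". If "first" is meant as "first inserted", the table needs a column that records that order.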


If you have made the serious mistake of assuming SQL tables have an inherent order, you're going to need to fix your table before you proceed. You can use the PostgreSQL ctid pseudo-column to guide the creation of a primary key that matches the current on-disk table order. It should be safe to just:

ALTER TABLE Table1 ADD COLUMN id SERIAL PRIMARY KEY;

as PostgreSQL will tend to write the key in table order. It's not guaranteed, but neither is anything else when there's no primary key. Then you can:

SELECT
  DISTINCT ON ("A", "B")
  "A", "B", "C"
FROM Table1
ORDER BY "A", "B", id;

(Edit: I don't recommend using ctid in queries baked into applications. It's a handy tool for solving specific problems, but it's not really public API in PostgreSQL, and it's not part of the SQL standard. Unlike ROWID in Oracle, it changes due to vacuum etc. PostgreSQL is free to break, change, or remove it in future versions.)

Craig Ringer
1

Well, you can sort of do this. SQL tables have no concept of ordering, so you really need a column to specify the order. The following returns an arbitrary row from each group:

select distinct on(a, b) a, b, c
from table1 t
order by a, b;

Normally, you would use something like:

select distinct on(a, b) a, b, c
from table1 t
order by a, b, id desc;
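
To make the effect of the ordering column concrete, here is a small sketch that assumes a scratch table ordered_demo whose id values record insertion order. The table name and the explicit id values are hypothetical, chosen only to mirror the question's sample data.

```sql
-- Hypothetical scratch table; id stands in for "insertion order".
CREATE TEMP TABLE ordered_demo (id int PRIMARY KEY, a int, b int, c int);

INSERT INTO ordered_demo (id, a, b, c) VALUES
  (1, 1, 3, 5), (2, 1, 3, 7), (3, 1, 3, 9),
  (4, 2, 4, 3), (5, 2, 4, 6), (6, 2, 4, 1);

-- Lowest id per (a, b): the earliest row of each group.
SELECT DISTINCT ON (a, b) a, b, c
FROM ordered_demo
ORDER BY a, b, id;
-- returns (1, 3, 5) and (2, 4, 3)

-- Highest id per (a, b): the latest row of each group.
SELECT DISTINCT ON (a, b) a, b, c
FROM ordered_demo
ORDER BY a, b, id DESC;
-- returns (1, 3, 9) and (2, 4, 1)
```

The ascending form matches the result the question asks for. In both forms the DISTINCT ON expressions have to appear first in the ORDER BY list, with the tie-breaking column after them.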
Gordon Linoff