Return all the rows for ids that have all three seq numbers present

Question

I have a table which looks something like this:

table1

+----+-----+------+
| id | seq | test |
+----+-----+------+
|  1 |   1 | HR   |
|  1 |   2 | RR   |
|  2 |   1 | HR   |
|  2 |   2 | RR   |
|  2 |   3 | OXY  |
|  3 |   1 | HR   |
|  3 |   2 | RR   |
|  4 |   1 | HR   |
|  4 |   2 | RR   | 
|  4 |   3 | OXY  | 
+----+-----+------+

I would like to get the result table below. That is, I need all the rows for a particular id only if all three seq numbers are present for that id:

+----+-----+------+
| id | seq | test |
+----+-----+------+
|  2 |   1 | HR   |
|  2 |   2 | RR   |
|  2 |   3 | OXY  |
|  4 |   1 | HR   |
|  4 |   2 | RR   |
|  4 |   3 | OXY  | 
+----+-----+------+

I would like to write a plpgsql function that gives me this result. I am relatively new to plpgsql and to programming in general. It would be great if someone could help me get there.

So far this is what my function looks like and it is incomplete:

CREATE OR REPLACE FUNCTION test()
returns SETOF table1 AS $$
DECLARE
    cur CURSOR FOR
        SELECT *
        FROM table1
        ORDER by id;
    rec_cur RECORD;
    counter INTEGER DEFAULT 0;

BEGIN 
    OPEN cur;

    FETCH FIRST FROM cur INTO rec_cur;
    MOVE RELATIVE +1 FROM cur;

    LOOP
        FETCH cur INTO rec_cur;
        EXIT WHEN NOT FOUND;

        IF rec_cur.seq = 1 AND counter = 0 THEN
            RETURN NEXT rec_cur;
        END IF;

    END LOOP;
    CLOSE cur;
    RETURN;

END ; $$ 
LANGUAGE PLPGSQL STABLE PARALLEL SAFE;

Thank you so much. I was just about to do that. Anyways thank you :) — Chandrasen D Rajashekar, May 28 '17 at 20:33
As always, an actual table definition (`CREATE TABLE` statement) would help to clarify. Most importantly: Is `(id, seq)` defined unique and both columns not null? Any other reliable meta info? And always mention your Postgres version. [Edit] the question to clarify. — Erwin Brandstetter, May 30 '17 at 00:35
@ErwinBrandstetter thank you so much for the info. I would definitely consider the points which you have mentioned when I ask the next question. — Chandrasen D Rajashekar, May 31 '17 at 14:09

Gordon Linoff · Answer 1 · 2017-05-28T20:38:42.593

A cursor is definitely not the right approach. You can get the ids easily using aggregation and having:

select id
from t
where seq in (1, 2, 3)
group by id
having count(seq) = 3;

Then to get the original rows, there are multiple ways:

select t.*
from t join
     (select id
      from t
      where seq in (1, 2, 3)
      group by id
      having count(seq) = 3
     ) tt
     on t.id = tt.id;
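
As one of those other ways, here is a minimal sketch that filters with `IN` instead of a join. It is an illustration rather than part of the original answer. It uses `count(distinct seq)` on the assumption that `(id, seq)` might not be unique, which the question does not state either way; the `order by` is only there to make the output easier to read.

```sql
-- Sketch: same result via IN instead of a join.
-- count(distinct seq) guards against duplicate (id, seq) rows,
-- which the question does not rule out.
select t.*
from t
where t.id in (select id
               from t
               where seq in (1, 2, 3)
               group by id
               having count(distinct seq) = 3)
order by t.id, t.seq;
```

If `(id, seq)` is known to be unique, `count(seq)` and `count(distinct seq)` give the same answer and the plain count is cheaper.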

EDIT:

If the sequence numbers always start at 1 and have no gaps, then window functions are the way to go:

select t.*
from (select t.*, max(t.seq) over (partition by t.id) as maxseq
      from t
     ) t
where maxseq = 3;
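
Since the question asks for a function, the window-function query could be wrapped in a plain SQL function; plpgsql and a cursor are not needed for this. The sketch below is hedged: the function name `rows_with_max_seq_3` is hypothetical, and it assumes the table is `table1` with exactly the columns `id`, `seq`, `test` in that order, as shown in the question.

```sql
-- Hypothetical wrapper; the function name is illustrative only.
-- Assumes table1 has exactly the columns (id, seq, test), in that order,
-- and that seq starts at 1 with no gaps for every id.
CREATE OR REPLACE FUNCTION rows_with_max_seq_3()
  RETURNS SETOF table1
  LANGUAGE sql STABLE AS
$$
select sub.id, sub.seq, sub.test        -- drop the helper column maxseq
from (select t.*, max(t.seq) over (partition by t.id) as maxseq
      from table1 t
     ) sub
where sub.maxseq = 3;
$$;

-- Usage:
-- SELECT * FROM rows_with_max_seq_3() ORDER BY id, seq;
```

The column list is spelled out because the subquery carries the extra `maxseq` column, which must not appear in a result declared as `SETOF table1`.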

edited May 28 '17 at 20:38

answered May 28 '17 at 20:31

Since `seq` could only have a value from `1, 2, 3`, would `where seq in (1, 2, 3)` have any effect? – Majid Fouladpour May 28 '17 at 20:35
@MajidFouladpour . . . The question does not clearly specify that is the case. – Gordon Linoff May 28 '17 at 20:39

Erwin Brandstetter · Answer 2 · 2017-06-01T01:43:55.267

Your question is incomplete.
If we can assume the existence of rows with seq = 1 and seq = 2 if there is a row with seq = 3 for the same id, then it becomes cheap and simple:

SELECT *
FROM  (SELECT id FROM table1 WHERE seq = 3) x
JOIN   table1 t USING (id)
-- ORDER BY id, seq;  -- unclear whether you need sorted output.

This also assumes (id, seq) is defined UNIQUE and both columns are NOT NULL.
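
For illustration, a table definition matching those assumptions might look like the sketch below. The question shows no `CREATE TABLE` statement, so the column types here are guesses; only the column names, the sample rows, and the constraints discussed above come from the thread.

```sql
-- Hypothetical definition; column types are assumptions,
-- since the question does not include a CREATE TABLE statement.
CREATE TABLE table1 (
  id   integer NOT NULL,
  seq  integer NOT NULL,
  test text,
  UNIQUE (id, seq)          -- also creates an index on (id, seq)
);

-- Sample rows copied from the question.
INSERT INTO table1 (id, seq, test) VALUES
  (1, 1, 'HR'), (1, 2, 'RR'),
  (2, 1, 'HR'), (2, 2, 'RR'), (2, 3, 'OXY'),
  (3, 1, 'HR'), (3, 2, 'RR'),
  (4, 1, 'HR'), (4, 2, 'RR'), (4, 3, 'OXY');
```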

If you need to optimize read performance, add a partial index:

CREATE INDEX foo ON table1 (id) WHERE seq = 3;

Since Postgres 9.6 this can be used in an index-only scan.

And you need an index on (id), of course. The index on (id, seq), which exists if you have said UNIQUE constraint, does the job as well.

Either way, it's a case of relational division. Here is an arsenal of techniques to identify qualifying ids if we can't assume sequential values in seq:

How to filter SQL results in a has-many-through relation
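
As a rough illustration of one relational-division technique that does not rely on sequential values, the sketch below checks each required `seq` value with its own `EXISTS`. It is not taken from the linked answer; the aliases `a`, `b`, `c` are arbitrary, and the table name `table1` is the one from the question.

```sql
-- Sketch: keep a row only if its id has a row for each required seq value.
-- Works without assuming that seq is gapless or that (id, seq) is unique.
SELECT t.*
FROM   table1 t
WHERE  EXISTS (SELECT 1 FROM table1 a WHERE a.id = t.id AND a.seq = 1)
AND    EXISTS (SELECT 1 FROM table1 b WHERE b.id = t.id AND b.seq = 2)
AND    EXISTS (SELECT 1 FROM table1 c WHERE c.id = t.id AND c.seq = 3);
```

With an index on (id, seq), each `EXISTS` can be answered by a single index lookup per row.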

The table is built so that every id is certain to have seq=1 and seq=2, which correspond to test=HR and test=RR. Only some ids also have seq=3, which corresponds to test=OXY. I am interested only in the ids that have all 3 seq values. Also, neither column contains NULL values. The additional info is really great and I appreciate it. Thank you again. – Chandrasen D Rajashekar, May 31 '17 at 17:47
