Get max value from a window of rows as new column for all rows

Question

| col 1 | col 2 | col 3 |
|-------|-------|-------|
| 67458 | ADM   | 1008  |
| 67458 | ADM   | 1009  |
| 67458 | SKI   | 1009  |
| 67458 | LIS   | 1010  |
| 67458 | TOU   | 1121  |

How to get max value of col3 when col2='ADM' and use that as the value for rest of the records?

Expected Result:

| col 1 | col 2 | col 3 | col 4 |
|-------|-------|-------|-------|
| 67458 | ADM   | 1008  | 1009  |
| 67458 | ADM   | 1009  | 1009  |
| 67458 | SKI   | 1009  | 1009  |
| 67458 | LIS   | 1010  | 1009  |
| 67458 | TOU   | 1121  | 1009  |

I know how to do this with sub-selects and all. col4 will be a pseudo-column to be used downstream for JOINs and stuff.

I have tried the following but it populates 1121 instead of 1009:

MAX(col3) OVER (PARTITION BY col1 (CASE WHEN col2='ADM' THEN col2 END) ORDER BY col1)

score 1 · Answer 1 · answered Mar 29 '19 at 15:17

1

You can use correlated subquery :

select t.*,
       (select max(t1.col3) from table t1 where t1.col1 = t.col1 and t1.col2 = 'ADM') as col4
from table t;

answered Mar 29 '19 at 15:17

Yogesh Sharma

49,870
5
26
52

Gordon Linoff · Answer 2 · 2019-03-29T22:09:35.367

1

Use window functions:

select t.*,
       max(col3) filter (where col2 = 'ADM') over (partition by col1) as col4
from t;

In Redshift, you probably have to use case:

select t.*,
       max(case when col2 = 'ADM' then col3 end) over (partition by col1) as col4
from t;

edited Mar 29 '19 at 22:09

answered Mar 29 '19 at 15:17

Gordon Linoff

1,242,037
58
646
786

Thanks for the quick response. Does this work in Redshift? – harsha87 Mar 29 '19 at 15:26

Cid · Answer 3 · 2019-03-29T15:36:45.890

You can use a sub-query in your SELECT to get that value.

This suq-query can be stored in a variable if you mind executing it again and again and cross joining it.

Schema (PostgreSQL v10.0)

CREATE TABLE test (
  "col 1" INTEGER,
  "col 2" VARCHAR(3),
  "col 3" INTEGER
);

INSERT INTO test
  ("col 1", "col 2", "col 3")
VALUES
  ('67458', 'ADM', '1008'),
  ('67458', 'ADM', '1009'),
  ('67458', 'SKI', '1009'),
  ('67458', 'LIS', '1010'),
  ('67458', 'TOU', '1121');

Query #1

SELECT MAX("col 3") AS "col 4" INTO col4 FROM test t2 WHERE t2."col 2" = 'ADM';

There are no results to be displayed.

Query #2

SELECT "col 1",
       "col 2",
       "col 3",
       "col 4"
FROM test
CROSS JOIN
col4;

Output

| col 1 | col 2 | col 3 | col 4 |
| ----- | ----- | ----- | ----- |
| 67458 | ADM   | 1008  | 1009  |
| 67458 | ADM   | 1009  | 1009  |
| 67458 | SKI   | 1009  | 1009  |
| 67458 | LIS   | 1010  | 1009  |
| 67458 | TOU   | 1121  | 1009  |

View on DB Fiddle

Thanks for the quick response. I had this in my mind, but the test table is really huge. Isn't it going to be a bad code if use test to calculate on pseudo column — harsha87, Mar 29 '19 at 15:27

score 0 · Accepted Answer · answered Mar 29 '19 at 15:38

0

SELECT t.*, max(CASE WHEN col2 = 'ADM' THEN col3 END) OVER (PARTITION BY col1) AS col4
FROM   tbl t;

The aggregate FILTER clause was introduced in PostgreSQL 9.4 and would be nice for this. But Redshift is not Postgres, and most of the later additions in Postgres are unsupported there. Compare:

How can I simplify this game statistics query?

answered Mar 29 '19 at 15:38

Erwin Brandstetter

605,456
145
1,078
1,228

1

This is what i was looking for. Thank for your response. – harsha87 Mar 29 '19 at 16:52

Get max value from a window of rows as new column for all rows

4 Answers4

Linked