1

Is there a way in PostgreSQL to take this table:

ID country name values
1 USA John Smith {1,2,3}
2 USA Jane Smith {0,1,3}
3 USA Jane Doe {1,1,1}
4 USA John Doe {0,2,4}

and generate this table from it with the column agg_values:

ID country name values agg_values
1 USA John Smith {1,2,3} {0,1,3,1,1,1,0,2,4}
2 USA Jane Smith {0,1,3} {1,2,3,1,1,1,0,2,4}
3 USA Jane Doe {1,1,1} {1,2,3,0,1,3,0,2,4}
4 USA John Doe {0,2,4} {1,2,3,0,1,3,1,1,1}

Where each row aggregates all values except from the current row and its peers.
So if name = John Smith then agg_values = aggregate of all values where name not = John Smith. Is that possible?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Rose Day
  • 15
  • 4

2 Answers2

1

You can use a lateral join to a derived table that unnests all rows where the name is not equal and then aggregates that back into an array:

select t1.*, xu.agg_values
from the_table t1
  cross join lateral (
      select array_agg(tu.v) as agg_values
      from the_table t2
        cross join unnest(t2."values") as tu(v)
      where t2.name <> t1.name
  ) xu 

This can be simplified by creating a custom aggregate to avoid the unnesting and aggregating:

create aggregate array_combine(anyarray)
(
  sfunc = array_cat(anyarray, anyarray),
  stype = anyarray
);

Then this can be written as:

select t1.*, xu.agg_values
from the_table t1
  cross join lateral (
      select array_combine(t2.values) as agg_values
      from the_table t2
      where t2.name <> t1.name
  ) xu 
1

In Postgres 11 or later, use a window function with a custom frame and a frame_exclusion:

SELECT *, array_combine(values) OVER (ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING
                                           EXCLUDE CURRENT ROW) AS agg_values
FROM   tbl;

If name is not UNIQUE, and since you asked:

all values where name not = John Smith

SELECT *, array_combine(values) OVER (ORDER BY name
                                      ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING
                                           EXCLUDE GROUP) AS agg_values
FROM   tbl;

db<>fiddle here

The first one (also) works with arbitrary order of rows, only excluding the current one. The second requires ORDER BY to establish which rows are in the same group.

The manual:

The frame_exclusion option allows rows around the current row to be excluded from the frame, even if they would be included according to the frame start and frame end options. EXCLUDE CURRENT ROW excludes the current row from the frame. EXCLUDE GROUP excludes the current row and its ordering peers from the frame. EXCLUDE TIES excludes any peers of the current row from the frame, but not the current row itself. [...]

Bold emphasis mine.

This uses the custom aggregate function array_combine(anyarray) provided by a_horse.
Or here:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228