1

Like the title of the question suggests, I'm attempting take a number of arbitrary sub-queries and combine them into a single, large query.

Ideally, I'd like to the data to be returned as a single record, with each column being the result of one of the sub-queries. E.G.

|   sub-query 1   |   sub-query 2   |      ...
|-----------------|-----------------|-----------------
| (array of rows) | (array of rows) |      ...

The sub-queries themselves are built using Knex.js in a Node app and are completely arbitrary. I've come fairly close to a proper solution, but I've hit a snag.

My current implementation has the final query like so:

SELECT
  array_agg(sub0.*) as s0,
  array_agg(sub1.*) as s1,
  ...
FROM
  (...) as sub0,
  (...) as sub1,
  ...
;

Which mostly works, but causes huge numbers of duplicates in the output. During my testing, I found that it returns records such each record is duplicated a number of times equal to how many records would have been returned without the duplicates. For example, a sub-query that should return 10 records would, instead, return 100 (each record being duplicated 10 times).

I've yet to figure out why this occurs or how to fix the query to not get the issue.

So far, I've only been able to determine that:

  • The number of records returned by the sub-queries is correct when queried separately
  • The duplicates are not caused by intersections between the sub-queries
    • i.e. sub-queries contain rows that exist in other sub-queries

Thanks in advance.

Chris Hall
  • 875
  • 10
  • 18
  • 1
    The reason why you are seeing 100 total records when you have 10 records in each subquery is that you are performing an implicit cross join of your two subqueries. A cross join returns the cartesian product of two tables (or derived tables). In other words, each record from sub0 is joined to each record from sub1. – Sam Choukri Mar 04 '15 at 23:13

2 Answers2

2

Just place the arbitrary queries in the select list:

with sq1 as (
    values (1, 'x'),(2, 'y')
), sq2 as (
    values ('a', 3), ('b', 4), ('c', 5)
)
select
    (select array_agg(s.*) from (select * from sq1) s) as s0,
    (select array_agg(s.*) from (select * from sq2) s) as s1
;
        s0         |            s1             
-------------------+---------------------------
 {"(1,x)","(2,y)"} | {"(a,3)","(b,4)","(c,5)"}
Clodoaldo Neto
  • 118,695
  • 26
  • 233
  • 260
  • Can you please help me for this: http://stackoverflow.com/questions/28855405/postgresql-9-3-dynamic-pivot-table-for-huge-records – MAK Mar 05 '15 at 06:41
0

Also you can add row_number to sub-queries and use that column to outer join tables (instead of cross join):

SELECT
  array_agg(sub0.*) as s0,
  array_agg(sub1.*) as s1
FROM
  (SELECT row_number() OVER (), * FROM (VALUES (1, 'x'),(2, 'y')) t) as sub0
  FULL OUTER JOIN
  (SELECT row_number() OVER (), * FROM (VALUES ('a', 3), ('b', 4), ('c', 5)) t1) as sub1
  ON sub0.row_number=sub1.row_number
;

             s0             |               s1                
----------------------------+---------------------------------
 {"(1,1,x)","(2,2,y)",NULL} | {"(1,a,3)","(2,b,4)","(3,c,5)"}
(1 row)
ndpu
  • 22,225
  • 6
  • 54
  • 69