1

I have a crosstab() query similar to the one in my previous question:
Unexpected effect of filtering on result from crosstab() query

The common case is to filter extra1 field with multiples values: extra1 IN(value1, value2...). For each value included on the extra1 filter, I have added an ordering expression like this (extra1 <> valueN), as appear on the above mentioned post. The resulting query is as follows:

SELECT *
FROM crosstab(
 'SELECT row_name, extra1, extra2..., another_table.category, value
  FROM   table t
  JOIN   another_table ON t.field_id = another_table.field_id
  WHERE  t.field = certain_value AND t.extra1 IN (val1, val2, ...) --> more values
  ORDER  BY row_name ASC, (extra1 <> val1), (extra1 <> val2)', ... --> more ordering expressions
 'SELECT category_name FROM category_name WHERE field = certain_value'
) AS ct(extra1, extra2...)
WHERE extra1 = val1; --> condition on the result

The first value of extra1 included on the ordering expression value1, get the correct resulting rows. However, the following ones value2, value3..., get wrong number of results, resulting on less rows on each one. Why is that?

UPDATE:

Giving this as our source table (table t):

+----------+--------+--------+------------------------+-------+
| row_name | Extra1 | Extra2 | another_table.category | value |
+----------+--------+--------+------------------------+-------+
| Name1    | 10     | A      | 1                      | 100   |
| Name2    | 11     | B      | 2                      | 200   |
| Name3    | 12     | C      | 3                      | 150   |
| Name2    | 11     | B      | 3                      | 150   |
| Name3    | 12     | C      | 2                      | 150   |
| Name1    | 10     | A      | 2                      | 100   |
| Name3    | 12     | C      | 1                      | 120   |
+----------+--------+--------+------------------------+-------+

And this as our category table:

+-------------+--------+
| category_id | value  |
+-------------+--------+
| 1           | Cat1   |
| 2           | Cat2   |
| 3           | Cat3   |
+-------------+--------+

Using the CROSSTAB, the idea is to get a table like this:

+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1    | 10     | A      | 100  | 100  |      |
| Name2    | 11     | B      |      | 200  | 150  |
| Name3    | 12     | C      | 120  | 150  | 150  |
+----------+--------+--------+------+------+------+

The idea is to be able to filter the resulting table so I get results with Extra1 column with values 10 or 11, as follow:

+----------+--------+--------+------+------+------+
| row_name | Extra1 | Extra2 | cat1 | cat2 | cat3 |
+----------+--------+--------+------+------+------+
| Name1    | 10     | A      | 100  | 100  |      |
| Name2    | 11     | B      |      | 200  | 150  |
+----------+--------+--------+------+------+------+

The problem is that on my query, I get different result size for Extra1 with 10 as value and Extra1 with 11 as value. With (Extra1 <> 10) I can get the correct result size on Extra1 for that value but not in the case of 11 as value.

Here is a fiddle demonstrating the problem in more detail:

https://dbfiddle.uk/?rdbms=postgres_11&fiddle=5c401f7512d52405923374c75cb7ff04

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Your new example is not helpful either, as it only shows rows with the same `extra1` value for the same `row_name`. Please provide a working fiddle demonstrating the actual problem. – Erwin Brandstetter Jul 26 '19 at 15:52
  • I have updated the fiddle that you have shared before with an example. On the last two queries you can see how the results change depending on the order expression, which it is the same that happens to me. What I would like is to filter `extra` with values `1` and `2` but receiving the correct result regardless the order expression. Source element with `1546300800` as `unixdatetime` has two different values for `extra`, `1` and `2`, so the result changes depending of the ordering expression. Is it possible to get two rows for `1546300800` or something like that? It is the only solution I get – jesusgonzalezrivera Jul 26 '19 at 16:57

1 Answers1

1

All "extra" columns are copied from the first row of the group (as pointed out in my previous answer)

While you filter with:

.... WHERE extra1 = 'val1';

...it makes no sense to add more ORDER BY expressions on the same column. Only rows that have at least one extra1 = 'val1' in their source group survive.

From your various comments, I guess you might want to see all distinct existing values of extra - within the set filtered in the WHERE clause - for the same unixdatetime. If so, aggregate before pivoting. Like:

SELECT * 
FROM   crosstab(
   $$
   SELECT unixdatetime, x.extras, c.name, s.value
   FROM  (
      SELECT unixdatetime, array_agg(extra) AS extras
      FROM  (
         SELECT DISTINCT unixdatetime, extra
         FROM   source_table   s
         WHERE  extra IN (1, 2)     -- condition moves here
         ORDER  BY unixdatetime, extra
         ) sub
      GROUP  BY 1
      ) x
   JOIN   source_table   s USING (unixdatetime)
   JOIN   category_table c ON c.id = s.gausesummaryid
   ORDER  BY 1
   $$
 , $$SELECT unnest('{trace1,trace2,trace3,trace4}'::text[])$$
) AS final_result (unixdatetime int
                 , extras int[]
                 , trace1 numeric
                 , trace2 numeric
                 , trace3 numeric
                 , trace4 numeric);

Aside: advice given in the following related answer about the 2nd function parameter applies to your case as well:

I demonstrate a static 2nd parameter query above. While being at it, you don't need to join to category_table at all. The same, a bit shorter and faster, yet:

SELECT * 
FROM   crosstab(
   $$
   SELECT unixdatetime, x.extras, s.gausesummaryid, s.value
   FROM  (
      SELECT unixdatetime, array_agg(extra) AS extras
      FROM  (
         SELECT DISTINCT unixdatetime, extra
         FROM   source_table
         WHERE  extra IN (1, 2)     -- condition moves here
         ORDER  BY unixdatetime, extra
         ) sub
      GROUP  BY 1
      ) x
   JOIN   source_table s USING (unixdatetime)
   ORDER  BY 1
   $$
,  $$SELECT unnest('{923,924,926,927}'::int[])$$
) AS final_result (unixdatetime int
                 , extras int[]
                 , trace1 numeric
                 , trace2 numeric
                 , trace3 numeric
                 , trace4 numeric);

db<>fiddle here - added my queries at the bottom of your fiddle.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228