
I'm trying to extract transaction details from two already existing tables:

  1. transactions, containing the total amount received,

  2. bills, with a row for each bill received in the transaction, and containing the denomination of the bill.

Both are indexed with a common session id. [Correction: only the transactions table is indexed on session id.]
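A simplified sketch of the two tables (approximate, not the actual definitions; column names are taken from the queries below and the types are guesses):

CREATE TABLE transactions (
    session text,       -- common session id, indexed
    amount  numeric     -- total amount received
);
CREATE INDEX ON transactions (session);

CREATE TABLE bills (
    session text,       -- same session id, but not indexed here
    denom   text        -- denomination of one bill: '50', '20', '10', ...
);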

I've joined the tables and made subqueries to count the number of each bill denomination per transaction (how many 10s, 20s, etc.). I want to get one record for each transaction with all the counts on the same row.

I made it as far as this query:

SELECT
t.session,
to_char(t.amount::numeric, '"$"9990D99') AS "USD",
(select count(b.denom) where b.denom = '50' ) AS "50",
(select count(b.denom) where b.denom = '20') AS "20",
(select count(b.denom) where b.denom = '10') AS "10",
(select count(b.denom) where b.denom = '5') AS "5",
(select count(b.denom) where b.denom = '1') AS "1"

FROM transactions AS t JOIN bills AS b USING (session)
GROUP BY
t.session, t.amount, b.denom
ORDER BY 
t.session,
b.denom ASC;

... which correctly gives me the bill counts, but with one row for each denomination:

   session    |    USD    | 50 | 20 | 10 | 5 | 1
--------------+-----------+----+----+----+---+----
 c64af32f1815 | $  135.00 |    |    |    | 1 |
 c64af32f1815 | $  135.00 |    |    |  1 |   |
 c64af32f1815 | $  135.00 |    |  6 |    |   |
 643e096b6542 | $  175.00 |    |    |    |   | 10
 643e096b6542 | $  175.00 |    |    |    | 1 |
 643e096b6542 | $  175.00 |    |  8 |    |   |
 ce7d2c647eff | $  200.00 |  4 |    |    |   |

What I want is this, with one row per transaction:

   session    |    USD    | 50 | 20 | 10 | 5 | 1
--------------+-----------+----+----+----+---+----
 c64af32f1815 | $  135.00 |    |  6 |  1 | 1 |
 643e096b6542 | $  175.00 |    |  8 |    | 1 | 10
 ce7d2c647eff | $  200.00 |  4 |    |    |   |

What do I need to understand to fix this query?

Revised Query (following @erwin's suggestion to avoid subqueries):

SELECT
t.session,
to_char(t.amount::numeric, '"$"9990D99') AS "USD",
COUNT(NULLIF(b.denom = '100', FALSE)) AS "100",
COUNT(NULLIF(b.denom = '50', FALSE)) AS "50",
COUNT(NULLIF(b.denom = '20', FALSE)) AS "20",
COUNT(NULLIF(b.denom = '10', FALSE)) AS "10",
COUNT(NULLIF(b.denom = '5', FALSE)) AS "5",
COUNT(NULLIF(b.denom = '1', FALSE)) AS "1"

FROM transactions AS t JOIN bills AS b USING (session)
GROUP BY
t.session, t.amount, b.denom
ORDER BY 
t.session,
b.denom ASC;

This query still generates one line of output for each aggregate (count) function call.

  • Thank you all for the suggestions. Two of the answers solved the immediate problem at hand, and seem to be similar solutions. The suggestion of @erwin allowed me to reduce the number of subquery layers from 2 to 1, but I'm still not able to achieve what I want without subqueries. I want to explore this possibility more deeply before choosing one as the answer. And I still want to understand why the results aren't what I expected... Why does each aggregate function call generate a full new line in the output (even when I removed the 2nd-level subquery)? – W L Bentley Mar 12 '16 at 23:21
  • The table definition should be in the question - ideally complete `CREATE TABLE` scripts. We need to see data types and constraints. And your version of Postgres. The cast to `numeric` raises questions. And there are inconsistencies in your example: You say `ORDER BY t.session ...`, but the desired result shows a different sort order. And I already stressed that you cannot include `b.denom` in your `GROUP BY`, which is the primary remaining problem. – Erwin Brandstetter Mar 13 '16 at 04:31
  • The context here is that I'm trying to extract reports from a database that is not mine to change. The queries and the outputs were slightly edited/simplified to focus on the issues at hand -- inconsistencies arose only from that. The cast to numeric is unnecessary and was a copy/paste error. (Thanks for pointing that out.) – W L Bentley Mar 13 '16 at 06:38
  • Ok. All three answers give the results I wanted. (Some of my confusion was self-inflicted in the end.) I'm choosing @erwin's as the best for eliminating the subqueries. Thanks to all for your help. – W L Bentley Mar 13 '16 at 07:41

3 Answers


I think you are almost there. You just need another GROUP BY on top of your query. For instance:

WITH q1 as (SELECT
    t.session,
    to_char(t.amount::numeric, '"$"9990D99') AS "USD",
    (select count(b.denom) where b.denom = '50' ) AS "50",
    (select count(b.denom) where b.denom = '20') AS "20",
    (select count(b.denom) where b.denom = '10') AS "10",
    (select count(b.denom) where b.denom = '5') AS "5",
    (select count(b.denom) where b.denom = '1') AS "1"

    FROM transactions AS t JOIN bills AS b USING (session)
    GROUP BY t.session, t.amount, b.denom)

SELECT session, "USD", SUM("50") AS "50", SUM("20") AS "20", SUM("10") AS "10",
       SUM("5") AS "5", SUM("1") AS "1"
FROM q1
GROUP BY session, "USD"
Renzo
  • I tried modifying the query as you suggest, but it does not recognize USD as a valid column name in the second SELECT. I received `ERROR: column "usd" does not exist`. If I instead refer to it with t.amount or transactions.amount, I receive `ERROR: missing FROM-clause entry for table "t"`. So I am unable to make this work. – W L Bentley Mar 12 '16 at 07:50
  • Did you try modifying both occurrences of USD to "USD"? I edited my answer. – Renzo Mar 12 '16 at 07:52
  • Yes. With "USD" in double quotes, this produces the results I want. – W L Bentley Mar 12 '16 at 07:55

Do not use correlated subqueries. That's inefficient.
And do not include b.denom in the GROUP BY clause. That's your primary error.

Postgres 9.4+

In Postgres 9.4 or later use the dedicated aggregate FILTER feature:

SELECT t.session
     , to_char(t.amount::numeric, '"$"9990D99') AS "USD"
     , count(*) FILTER (WHERE b.denom = '50')   AS "50"  -- !
     , count(*) FILTER (WHERE b.denom = '20')   AS "20"  -- !
     , ...
FROM   ...
GROUP  BY t.session, t.amount  -- !
ORDER  BY ...

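Filled in with the tables and columns from the question, that abbreviated query would read roughly like this (the "100" column is carried over from the revised query above):

SELECT t.session
     , to_char(t.amount::numeric, '"$"9990D99') AS "USD"
     , count(*) FILTER (WHERE b.denom = '100') AS "100"
     , count(*) FILTER (WHERE b.denom = '50')  AS "50"
     , count(*) FILTER (WHERE b.denom = '20')  AS "20"
     , count(*) FILTER (WHERE b.denom = '10')  AS "10"
     , count(*) FILTER (WHERE b.denom = '5')   AS "5"
     , count(*) FILTER (WHERE b.denom = '1')   AS "1"
FROM   transactions t
JOIN   bills b USING (session)
GROUP  BY t.session, t.amount   -- no b.denom here
ORDER  BY t.session;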

Postgres 9.3-

For Postgres 9.3 or older, there are several (less elegant) alternatives:

SELECT t.session
     , to_char(t.amount::numeric, '"$"9990D99') AS "USD"  -- why cast to numeric?
     , count(b.denom = '100' OR NULL) AS "100"  -- bad column name
     , count(b.denom =  '50' OR NULL) AS "50"
     , count(b.denom =  '20' OR NULL) AS "20"
     , count(b.denom =  '10' OR NULL) AS "10"
     , count(b.denom =   '5' OR NULL) AS "5"
     , count(b.denom =   '1' OR NULL) AS "1"
FROM   transactions t
JOIN   bills b USING (session)
GROUP  BY t.session, t.amount
ORDER  BY t.session;

After cleaning up:

SELECT t.session, to_char(t.amount, '"$"9990D99') AS usd
     , d100, d50, d20, d10, d5, d1
FROM   transactions t
LEFT   JOIN (
   SELECT session
        , nullif(count(denom = 100 OR NULL), 0) AS d100
        , nullif(count(denom =  50 OR NULL), 0) AS d50
        , nullif(count(denom =  20 OR NULL), 0) AS d20
        , nullif(count(denom =  10 OR NULL), 0) AS d10
        , nullif(count(denom =   5 OR NULL), 0) AS d5
        , nullif(count(denom =   1 OR NULL), 0) AS d1
   FROM   bills
   GROUP  BY 1
   ) b USING (session)
ORDER  BY session;

What's changed?

  • Don't use illegal identifiers if you can avoid it; then you also don't need double quotes.

  • Assuming integer for bills.denom, so we also don't need single quotes around the constants.

  • Seems like you want NULL in columns where no bills are found. Wrap the result in NULLIF().

  • Since you retrieve the whole table, it's faster to aggregate before you join.


Faster alternative: crosstab()

For top performance, use actual cross tabulation with crosstab(). You need the additional module tablefunc installed.
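If the module is not installed yet, it can be enabled once per database (given sufficient privileges):

CREATE EXTENSION IF NOT EXISTS tablefunc;

With that in place: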

SELECT t.session, to_char(t.amount, '"$"9990D99') AS usd
     , d100, d50, d20, d10, d5, d1
FROM   transactions t
LEFT   JOIN crosstab(
     $$SELECT session, denom, count(*)::int
       FROM   bills
       GROUP  BY 1,2
       ORDER  BY 1$$
   , $$SELECT unnest ('{100,50,20,10,5,1}'::int[])$$
   ) AS ct (session text, d100 int, d50 int, d20 int, d10 int, d5 int, d1 int)
     USING (session);
Erwin Brandstetter
  • Currently running version 9.3.11, so the 9.4 / FILTER suggestion will have to wait until we can upgrade to 9.4. FILTER looks like a more elegant solution, and is very appealing after struggling with subqueries. – W L Bentley Mar 12 '16 at 08:06
  • With version 9.3: Removing the individual SELECTs and replacing them with `SUM((b.denom = '100')::int) AS "100"` is clearly a step in the right direction. But this still leaves each denomination on its own line. If I remove `b.denom` from the GROUP BY clause, I get `ERROR: column "b.denom" must appear in the GROUP BY clause...`. Isn't one subquery still needed to reduce the rows to one per transaction? Or am I missing something? – W L Bentley Mar 12 '16 at 08:47
  • Yes. The "After cleaning up" query works as I expected. Thank you for pushing me to go beyond multiple subqueries. – W L Bentley Mar 13 '16 at 07:35

You can just wrap your query in an outer query:

SELECT
session,
"USD",
sum("50") AS "50",
sum("20") AS "20",
sum("10") AS "10",
sum("5") AS "5",
sum("1") AS "1"

from (SELECT
t.session,
to_char(t.amount::numeric, '"$"9990D99') AS "USD",
(select count(b.denom) where b.denom = '50' ) AS "50",
(select count(b.denom) where b.denom = '20') AS "20",
(select count(b.denom) where b.denom = '10') AS "10",
(select count(b.denom) where b.denom = '5') AS "5",
(select count(b.denom) where b.denom = '1') AS "1"

FROM transactions AS t JOIN bills AS b USING (session)
GROUP BY
t.session, t.amount, b.denom
ORDER BY 
t.session,
b.denom ASC) your_query
GROUP BY session,"USD"
Mesbah Gueffaf