2

I know I'm probably just thinking about this wrong. I have the following structure:

  CREATE TABLE mytable (
  id       serial PRIMARY KEY
, employee text UNIQUE NOT NULL
, data     jsonb
);

And the following data:

INSERT INTO mytable (employee, data)
VALUES
 ('Jim', '{"sales": [{"value": 10, "yr": "2010"}, {"value": 5, "yr": "2011"}, {"value": 40, "yr": "2012"}]}'),
 ('Rob', '{"sales": [{"value": 10, "yr": "2009"}, {"value": 5, "yr": "2010"}, {"value": 41, "yr": "2011"}]}')

I'm trying to return all the employees and the "value" of their sales in 2012. If there is no sales in 2012 then return "No Data". I have:

SELECT id, employee, 
coalesce((SELECT s.value AS value FROM mytable, jsonb_to_recordset(mytable.data->'sales') AS s(yr text, value float)
WHERE s.yr='2012'), 0) AS b FROM mytable

I get:

id |employee |b
53 |Jim      |40 
54 |Rob      |40

The value is wrong for 'Rob'. It should be 'No Data'. (I am using 0 as the 2nd parameter for coalesce as I get an error "invalid input syntax for type double precision: 'No Data'"

  • 1
    The best query depends on the details: how many rows total, how many different employees? How many rows per employee? You defined `employee` and `data` so they can be NULL. Is that just a design flaw or can each column be NULL? How to deal with NULL employees? – Erwin Brandstetter Jul 10 '15 at 01:25

2 Answers2

0

The problem is that your inner SELECT does not actually filter on the employee's id. As such, Jim's 40 is also selected for Rob.

It is actually much easier to see if you insert another employee with a value for 2012.
You'd get

ERROR:  21000: more than one row returned by a subquery used as an expression

because the SELECT value FROM ... WHERE yr = '2012' would return multiple values, ie you would (essentially) be asking for (assuming the second employee has 41 sales)

SELECT COALESCE((VALUES (40), (41)));

You could use a CTE (might not be the most efficient way though) :

WITH sales_2012 AS (
  SELECT id, s.value
  FROM mytable,
       jsonb_to_recordset(mytable.data->'sales') AS s(yr text, value float)
  WHERE s.yr='2012'
)
SELECT employee, COALESCE(value, 0)       
FROM mytable
LEFT OUTER JOIN sales_2012
ON mytable.id = sales_2012.id
;
┌──────────┬────────┐
│ employee │ value  │
├──────────┼────────┤
│ Jim      │     40 │
│ Rob      │      0 │
└──────────┴────────┘
Marth
  • 23,920
  • 3
  • 60
  • 72
0

The key element is to use LEFT JOIN LATERAL instead of the implicit CROSS JOIN LATERAL as which the short notation with just a comma is interpreted.

The query can simply be:

SELECT t.id, t.employee, s.*
FROM   mytable t
LEFT   JOIN LATERAL jsonb_to_recordset(t.data->'sales')
                 AS s(yr int, value int) ON s.yr = 2012;

We can conveniently pick sales with yr = 2012 right away without losing employees from the result.

To prettify with 'NO Data' the value column must be a matching string type:

SELECT t.id, t.employee
     , COALESCE(s.yr, 2012) AS yr
     , COALESCE(s.value, 'No Data') AS value
FROM   mytable t
LEFT   JOIN LATERAL jsonb_to_recordset(t.data->'sales')
                 AS s(yr int, value text) ON s.yr = 2012;

Building on these missing (likely) details:

  • There is exactly one row per employee in the table: empoyee is UNIQUE NOT NULL.
  • Each employee can have 0-n years of sales in data - data can be NULL.

  • yr and value store valid integer numbers. Else adapt the type.

Proper table definition:

CREATE TABLE mytable (
  id       serial PRIMARY KEY
, employee text UNIQUE NOT NULL
, data     jsonb
);

What's a LATERAL JOIN?

As per request in comment, these links should help, especially the first for beginners:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks! I am new to postgres and don't quite understand a Lateral join....If I take out the word "LATERAL" from your query and keep everything else as is I get the same results....I just don't understand what Lateral does here even after reading the docs –  Jul 10 '15 at 04:18
  • @kristen: You can omit the key word `LATERAL` here, because for set-returning functions in the `FROM` list it is assumed automatically. Have you also read the linked answers and did those not help to understand either? Either way, before you ask more question, please provide some answers. Are my assumptions correct? Please update your question to clarify. – Erwin Brandstetter Jul 10 '15 at 04:31
  • Yes, those links help....You are also correct in your assumptions. I have updated my question. Thanks for all your help! –  Jul 10 '15 at 20:31