I'm building a hierarchical JSON result from several tables. These are just examples but should be sufficient for the purpose of this demonstration to get the idea:
CREATE TABLE book (
id INTEGER PRIMARY KEY NOT NULL,
data JSONB
);
CREATE TABLE author (
id INTEGER PRIMARY KEY NOT NULL,
data JSONB
);
CREATE TABLE book_author (
id INTEGER PRIMARY KEY NOT NULL,
author_id INTEGER,
book_id INTEGER
);
CREATE UNIQUE INDEX pk_unique ON book_author (author_id, book_id);
Test data:
INSERT INTO book (id, data) VALUES
(1, '{"pages": 432, "title": "2001: A Space Odyssey"}')
, (2, '{"pages": 300, "title": "The City And The City"}')
, (3, '{"pages": 143, "title": "Unknown Book"}');
INSERT INTO author (id, data) VALUES
(1, '{"age": 90, "name": "Arthur C. Clarke"}')
, (2, '{"age": 43, "name": "China Miéville"}');
INSERT INTO book_author (id, author_id, book_id) VALUES
(1, 1, 1)
, (2, 1, 2);
I've created the following function:
CREATE OR REPLACE FUNCTION public.book_get()
RETURNS json AS
$BODY$
DECLARE
result json;
BEGIN
SELECT to_json(array_agg(_b)) INTO result
FROM (
SELECT
book.id id,
book.data->>'title' title,
book.data->>'pages' pages,
(
SELECT to_json(array_agg(_a))
FROM (
SELECT
author.id id,
author.data->>'name' "name",
author.data->>'age' age
FROM
author, book_author ba
WHERE
ba.author_id = author.id AND
ba.book_id = book.id
ORDER BY id
) _a
) authors
FROM
book
ORDER BY id ASC
) _b;
RETURN result;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;
Executing the function book_get
SELECT book_get();
Produces the following results
[
{
"id":1,
"title":"2001: A Space Odyssey",
"pages":432,
"authors":[
{
"id":1,
"name":"Arthur C. Clarke",
"age":90
}
]
},
{
"id":2,
"title":"The City And The City",
"pages":300,
"authors":[
{
"id":2,
"name":"China Miéville",
"age":43
}
]
},
{
"id":3,
"title":"Unknown Book",
"pages":143,
"authors":null
}
]
Now I'm able to filter the data with a WHERE
clause, e.g.
SELECT to_json(array_agg(_b)) INTO result
FROM (
...
) _b
-- give me the book with id 1
WHERE _b.id = 1;
-- or give me all titles with the occurrence of 'City' anywhere
WHERE _b.title LIKE '%City%';
-- or has more than 200 pages
WHERE _b.pages > 200;
How would I make it possible to filter on authors
? E.g. something equivalent to WHERE _b.authors.'name' = 'Arthur C. Clarke'
.
I have absolutely no idea what type authors
becomes? Or is? Is it still a record (array)? Is it JSON already? I guess because I can access id
, title
and pages
accessing _b.authors
isn't such a problem?
Accessing _b.authors
gives me ERROR: missing FROM-clause entry for table "authors"
Accessing with JSON operators _b.authors->>..
or _b->authors->>..
gives me
operator does not exist: record -> json Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I remember using GROUP BY
with the HAVING
clause:
GROUP BY _b.authors
HAVING _b.authors->>'name' = 'Arthur C. Clarke';
But it gives me the error:
ERROR: could not identify an equality operator for type json
To make it a bit more clear:
SELECT to_json(array_agg(_b)) INTO result
FROM (
...
) _b
WHERE _b.authors->0->>'name' = 'Arthur C. Clarke';
Will basically do what I need, this only matches if the author on index 0
is Arthur C. Clarke
. If he has cowritten the book and he would be on second place (index 1) then there wouldn't have been a match. So what I try to find is the correct syntax on scanning _b.authors
which happens to be an JSON array filled with authors. It just doesn't accept any attempt. As far as I understand are the @>
and #>
only supported for JSONB
. So how do I get the correct syntax on selecting _b.authors
on any column against a value.
Update 2
Ok read the docs again ... it seems I didn't get the part from Postgres documentation that there is a difference between JSON and JSONB regarding functions, I thought it was only concerning the data type. Replacing to_json
with to_jsonb
seems to do the trick with using operators like @>
etc in the where clause.
Update 3
@ErwinBrandstetter: Makes sense. LATERAL wasn't known to me yet, good to know it exists. I got the hang on JSON/JSONB's functions and operators, makes a lot of sense to me now. What isn't clear to me is finding occurrences with LIKE
for example in the WHERE
clause.
If I need to use jsonb_array_elements
to unnest objects in an array (because in the final WHERE
clause, the content of b.authors
is of JSONB datatype). I could then do
SELECT * FROM jsonb_array_elements('[
{"age": 90, "name": "the Arthur C. Clarke"},
{"age": 43, "name": "China Miéville"},
{"age": null, "name": "Erwin the Brandstetter"}
]'::jsonb) author
WHERE
author->>'name' LIKE '%the%';
and get the desired result,
1: {"age": 90, "name": "the Arthur C. Clarke"}
2: {"age": null, "name": "Erwin the Brandstetter"}
but whats the approach to achieve this in the final (last) WHERE
clause in my example? Pointing out the final WHERE
clause because I'd like to filter the complete set of results, and not partially filter somewhere in the middle of a sub-select. So in general I would like to filter out books in the final result set that have an author with a middle name 'C.' or a first name 'Arthur'.
Update 4
In the FROM
clause of course. I'll have to do performance tweaking at the end when I figured out all possibilities, but this is what I came up with.
SELECT json_agg(_b) INTO result
FROM (
...
) _b,
jsonb_array_elements(_b.authors) AS arrauthors
WHERE arrauthors->>'name' LIKE 'Arthur %';
Will give all books with an author name starting with 'Arthur'. I still appreciate comments or updates to this approach.