I'm a working on a system where we have arbitrary incoming data stored into a table structure that basically is a key-value based storage.
The table structure in reality are a bit more complex but for the sake of solving the problem I will use a simplified example:
CREATE TABLE records (id BIGINT, type TEXT, identifier TEXT);
CREATE TABLE record_items (id BIGINT, rec_id BIGINT, key TEXT, value TEXT);
INSERT INTO records VALUES (10, 'PERSON', 'P12823');
INSERT INTO record_items VALUES
(11, 10, 'FIRST_NAME', 'James'),
(12, 10, 'LAST_NAME', 'Burwell'),
(13, 10, 'AGE', '76'),
(14, 10, 'HEIGHT', '5-9'),
(15, 10, 'WEIGHT', '149'),
(16, 10, 'BLOOD', 'A+');
INSERT INTO records VALUES (20, 'PERSON', 'P83494');
INSERT INTO record_items VALUES
(21, 20, 'FIRST_NAME', 'David'),
(22, 20, 'LAST_NAME', 'McDonnell'),
(23, 20, 'AGE', '47'),
(24, 20, 'COMPANY', 'Super Saver Foods');
INSERT INTO records VALUES (30, 'PERSON', 'P81323');
INSERT INTO record_items VALUES
(31, 30, 'FIRST_NAME', 'William'),
(32, 30, 'LAST_NAME', 'Nelson'),
(32, 30, 'HEIGHT', '5-6'),
(33, 30, 'VEHICLE', '2005 Dodge Charger');
As you can see, incoming data have arbitrary keys. There are some keys that are present for all incoming data and then there are some that changes from record to record and some that are missing:
SELECT r.id, r.type, r.identifier, ri.key, ri.value
FROM records r
JOIN record_items ri ON ri.rec_id = r.id
ORDER BY r.id, ri.key
id | type | ident | key | value
-------------------------------------------
10 | PERSON | P12823 | AGE | 76
10 | PERSON | P12823 | BLOOD | A+
10 | PERSON | P12823 | FIRST_NAME | James
10 | PERSON | P12823 | HEIGHT | 5-9
10 | PERSON | P12823 | LAST_NAME | Burwell
10 | PERSON | P12823 | WEIGHT | 149
20 | PERSON | P83494 | AGE | 47
20 | PERSON | P83494 | COMPANY | Flix
20 | PERSON | P83494 | FIRST_NAME | David
20 | PERSON | P83494 | LAST_NAME | Donnell
30 | PERSON | P81323 | FIRST_NAME | William
30 | PERSON | P81323 | HEIGHT | 5-6
30 | PERSON | P81323 | LAST_NAME | Nelson
30 | PERSON | P81323 | VEHICLE | Dodge
What we need to get, however, is all these arbitrary rows transposed to columns (order of columns is not important):
id | identifier | first_name | last_name | age | height | weight | company | vehicle
-----------------------------------------------------------------------------------
10 | P12823 | James | Burwell | 76 | 5-9 | 149 | |
20 | P83494 | David | Donnell | 47 | | | Flix |
30 | P81323 | William | Nelson | | 5-6 | | | Dodge
I am aware of the existence of the crosstab
function, however I was not able to make it to work as we need. The main problem is that, as far as I understand, crosstab
expects that all records have exactly the same keys in the same order and any missing or arbitrary keys mixes up the resultset. Another problem with crosstab
is that I need to explicitly define what are the columns for the resultset but what we need is to be able to call the function in a manner similar to this:
SELECT *
FROM some_crosstab_like_function('PERSON') -- PERSON identifies record type
I am thinking of creating some mechanism that consists of functions and temporary tables or views that are created on the fly and return the necessary data. However, before proceeding with this I wanted to know is there any other more easier solution to my problem.
We are PostgreSQL 10.5.
Any ideas will be much appreciated.