Problem
I've got a PostgreSQL 9.6 database with a table designed according to an EAV model with different types of values. An example excerpt looks like this:
name |arrivalTime | boolValue | intValue | floatValue | stringValue
------+------------+-----------+----------+------------+------------
a1 | 10:00:00 | true | | |
c3 | 10:00:00 | | 12 | |
d4 | 10:00:00 | | | | hello
e5 | 15:00:00 | | | 45.67 |
c3 | 15:00:00 | | 45 | |
b2 | 20:00:00 | | | 4.567 |
a1 | 20:00:00 | false | | |
d4 | 22:00:00 | | | | bye
b2 | 22:00:00 | | | 12.34 |
Empty cells represent null
values in the database.
Now I would like to get a Pivot table, were the new columns are the arrivalTime
and the content of name
. For the example from above it should look like this:
arrivalTime | a1 | b2 | c3 | d4 | e5
------------+-------+-------+-------+-------+-------
10:00:00 | true | | 12 | hello |
15:00:00 | | | 45 | | 45.67
20:00:00 | false | 4.567 | | |
22:00:00 | | 12.34 | | bye |
As an input to query for retrieving this result I get an pattern matching the name
s and a start and end time specifying the range of arrivalTime
.
Properties of the original table:
- The entries in the name column are volatile, i.e. new names come in and old names vanish regularly.
- Each combination of
name
andarrivalTime
is unique. - Each
name
andarrivalTime
combination has exactly one entry in one of the value columns.
Ideas
I already gave it some considerations:
- I figured so much, that the
crosstab
function should be used - Since the columns are dynamic, two queries will be needed as explained in Dynamically generate columns in PostgreSQL or Execute a dynamic crosstab query.
- Using the
format()
function to generate the first query is probably a good idea.
Example Table
Here is the SQL Code to create the example table:
CREATE TABLE IF NOT EXISTS playTable (
name TEXT NOT NULL,
arrivalTime TIME NOT NULL,
floatValue REAL NULL,
intValue INT NULL,
boolValue BOOLEAN NULL,
stringValue TEXT NULL,
PRIMARY KEY (name, arrivalTime),
CONSTRAINT single_value CHECK(
(boolValue IS NOT NULL)::INT +
(intValue IS NOT NULL)::INT +
(floatValue IS NOT NULL)::INT +
(stringValue IS NOT NULL)::INT = 1
)
);
And to insert the values:
INSERT INTO playTable ( name, arrivalTime, boolValue ) VALUES ( 'a1', '10:00:00', true );
INSERT INTO playTable ( name, arrivalTime, intValue ) VALUES ( 'c3', '10:00:00', 12 );
INSERT INTO playTable ( name, arrivalTime, stringValue ) VALUES ( 'd4', '10:00:00', 'hello' );
INSERT INTO playTable ( name, arrivalTime, floatValue ) VALUES ( 'e5', '15:00:00', 45.67 );
INSERT INTO playTable ( name, arrivalTime, intValue ) VALUES ( 'c3', '15:00:00', 45 );
INSERT INTO playTable ( name, arrivalTime, floatValue ) VALUES ( 'b2', '20:00:00', 4.567 );
INSERT INTO playTable ( name, arrivalTime, boolValue ) VALUES ( 'a1', '20:00:00', false );
INSERT INTO playTable ( name, arrivalTime, stringValue ) VALUES ( 'd4', '22:00:00', 'bye' );
INSERT INTO playTable ( name, arrivalTime, floatValue ) VALUES ( 'b2', '22:00:00', 12.34 );
Pivot table, non dynamic
klin provided the starting point of the solution, I guess:
SELECT *
FROM crosstab(
$ct$
SELECT
arrivalTime, name, concat(boolValue, intValue, floatValue, stringValue)
FROM playTable
ORDER BY 1, 2
$ct$,
$ct$
SELECT DISTINCT name
FROM playTable
ORDER BY 1
$ct$)
AS ct("arrivalTime" time, "a1" BOOLEAN, "b2" REAL, "c3" INT, "d4" TEXT, "e5" REAL);
What is missing from this solution is the dynamic aspect. As an input, a LIKE
pattern for name
is supplied and the range (i. e. min and max value) of arrivalTime
. This makes the argument of as ct(...)
dynamic.