Dynamic pivoting with Informix SQL

Question

This is my data:

date    id  value
1/1/2021    a   5
1/1/2021    b   10
1/1/2021    c   7
1/1/2021    d   5
1/1/2021    e   6
1/2/2021    a   4
1/2/2021    b   8
1/2/2021    c   12
1/2/2021    d   3
1/2/2021    e   5

What I want to get is this:

>       1/1/2021    1/2/2021
>     a     5         4
>     b    10         8
>     c     7         12
>     d     5         3
>     e     6         5

I found soultion how to do this if date column is fixed, but it isn't. It can have other values next time. Also, I found some solutions with dynamic sql, but none of these works with Informix (at least I wasn't able to replicate those result).

How can this be done in Informix?

Maybe you should show what you found, or links to what you found? How many dates might you be working with? You show 2, but is it just 2 or could it be 7, 31, 365, …? Do you always have all 5 of the ID entries `a` .. `e` for each date? Never any repetition of the lD values on a given date? — Jonathan Leffler, Jun 29 '21 at 18:31
I don't know how many dates I might be working with, but probably from 2 do 12, should n't be more than 12 dates. ID's will vary too, and some dates might have them all, other don't. This solution works for fixed dates: https://stackoverflow.com/questions/38246875/sql-select-transform-rows-in-columns And this is mentoined that works, but not for Informix: https://stackoverflow.com/questions/10404348/sql-server-dynamic-pivot-query — Nikola, Jun 29 '21 at 20:19
Note that you do not specify the data table names — a surprisingly common oversight in SQL questions. If you don't specify the table names, everyone who answers has to make up their own table names, which makes life harder than need be for people answering. Please also read the guidelines under the [tag:sql] tag [wiki](https://stackoverflow.com/tags/sql/info). — Jonathan Leffler, Jul 02 '21 at 20:24

Jonathan Leffler · Answer 1 · 2021-07-02T20:45:05.533

You can use dynamic SQL — or text manipulation of SQL results — to build a moderately complex SQL statement that returns the data you are after.

The answer below assumes that the table name is Data and that there is a primary key (unique) constraint on the combination of the date and id columns — assumptions that address the questions in my comment:

How many dates might you be working with? You show 2, but is it just 2 or could it be 7, 31, 365, …? Do you always have all 5 of the ID entries a .. e for each date? Is there ever any repetition of the ID values on a given date?

and answers in your response:

I don't know how many dates I might be working with, but probably from 2 to 12, shouldn't be more than 12 dates. ID's will vary too, and some dates might have them all, others don't.

Note: Informix allows you to create this table:

CREATE TABLE data
(
    date    DATE    NOT NULL,
    id      CHAR(1) NOT NULL,
    value   INTEGER NOT NULL,
    PRIMARY KEY(DATE, id)
);

Many DBMS would require the date column name to be presented as a delimited identifier enclosed in double quotes (and case-sensitive — "date"), or use a proprietary extension such as enclosing the identifier in square brackets ([date]), both in the CREATE TABLE statement and in the subsequent SQL. Informix does not — and manages to distinguish between the letters DATE as column name, data type and function name correctly.

This answer uses what I call TDQD — Test-Driven Query Design.

Relevant dates

SELECT UNIQUE date FROM data

This gives you the dates that will appear as columns. It is probable that you'll filter the data more — such as:

SELECT UNIQUE date
  FROM data
 WHERE date BETWEEN (TODAY - 7) AND (TODAY - 1)
 ORDER BY date

You might format the results to give string usable as a column name (and using a different date range):

SELECT UNIQUE
       date AS column_date,
       TO_CHAR(date, 'd%Y_%m_%d') AS column_name
  FROM data
 WHERE date BETWEEN DATE('2021-01-01') AND DATE('2021-01-31')
 ORDER BY column_date

This assumes you have set the Informix-specific environment variable DBDATE="Y4MD-" so that DATE values are presented and interpreted like DATETIME YEAR TO DAY values are.

Relevant ID values

SELECT UNIQUE id
  FROM data
 WHERE date BETWEEN DATE('2021-01-01') AND DATE('2021-01-31')
 ORDER BY id

This will give you the list of ID values in column 1 of the final result. However, it isn't crucial to the generated SQL.

Generate SQL for Result Table

SELECT id,
       MAX(CASE WHEN date = DATE('2021-01-01') THEN value ELSE NULL END) AS d2021_01_01,
       MAX(CASE WHEN date = DATE('2021-01-02') THEN value ELSE NULL END) AS d2021_01_02,
       MAX(CASE WHEN date = DATE('2021-01-03') THEN value ELSE NULL END) AS d2021_01_03,
       MAX(CASE WHEN date = DATE('2021-01-04') THEN value ELSE NULL END) AS d2021_01_04,
       MAX(CASE WHEN date = DATE('2021-01-05') THEN value ELSE NULL END) AS d2021_01_05,
       MAX(CASE WHEN date = DATE('2021-01-06') THEN value ELSE NULL END) AS d2021_01_06,
       MAX(CASE WHEN date = DATE('2021-01-07') THEN value ELSE NULL END) AS d2021_01_07,
       MAX(CASE WHEN date = DATE('2021-01-08') THEN value ELSE NULL END) AS d2021_01_08
  FROM data
 GROUP BY id
 ORDER BY id;

This SQL is built using the column date and column name values from the 'relevant dates' query to generate the MAX(CASE … END) AS dYYYY_MM_DD clauses in the select-list. That has to be done outside SQL — using some program to read the relevant date information and produce the corresponding SQL.

For example, if the output of the last 'relevant dates' query is in the file date.columns, this shell script would generate the requisite SQL:

printf "SELECT id"
while read column_date column_name
do
    printf ",\n       MAX(CASE WHEN date = DATE('%s') THEN value ELSE NULL END) AS %s" $column_date $column_name
done < date.columns
printf "\n  FROM data\n GROUP BY id\n ORDER BY id;\n"

The only difference here is that the column for the date 2021-01-08 is omitted because the value is not selected by the SQL (not present in the date.columns file).

You can use any appropriate tools to run some SQL to generate the required list of dates and give the appropriate values for column_date and column_name and then format the data into an SQL statement as shown.

Sample Data

INSERT INTO data VALUES('2021-01-01', 'a', 5);
INSERT INTO data VALUES('2021-01-01', 'b', 10);
INSERT INTO data VALUES('2021-01-01', 'c', 7);
INSERT INTO data VALUES('2021-01-01', 'd', 5);
INSERT INTO data VALUES('2021-01-01', 'e', 6);

INSERT INTO data VALUES('2021-01-02', 'a', 4);
INSERT INTO data VALUES('2021-01-02', 'b', 8);
INSERT INTO data VALUES('2021-01-02', 'c', 12);
INSERT INTO data VALUES('2021-01-02', 'd', 3);
INSERT INTO data VALUES('2021-01-02', 'e', 5);

INSERT INTO data VALUES('2021-01-03', 'b', 18);
INSERT INTO data VALUES('2021-01-03', 'c', 112);
INSERT INTO data VALUES('2021-01-03', 'd', 13);
INSERT INTO data VALUES('2021-01-03', 'e', 15);

INSERT INTO data VALUES('2021-01-04', 'a', 24);
INSERT INTO data VALUES('2021-01-04', 'c', 212);
INSERT INTO data VALUES('2021-01-04', 'd', 23);
INSERT INTO data VALUES('2021-01-04', 'e', 25);

INSERT INTO data VALUES('2021-01-05', 'a', 34);
INSERT INTO data VALUES('2021-01-05', 'b', 38);
INSERT INTO data VALUES('2021-01-05', 'd', 33);
INSERT INTO data VALUES('2021-01-05', 'e', 35);

INSERT INTO data VALUES('2021-01-06', 'a', 44);
INSERT INTO data VALUES('2021-01-06', 'b', 48);
INSERT INTO data VALUES('2021-01-06', 'c', 412);
INSERT INTO data VALUES('2021-01-06', 'e', 45);

INSERT INTO data VALUES('2021-01-07', 'a', 54);
INSERT INTO data VALUES('2021-01-07', 'c', 512);
INSERT INTO data VALUES('2021-01-07', 'd', 53);

Sample output

Using a Stack Overflow Markdown table:

id	d2021_01_01	d2021_01_02	d2021_01_03	d2021_01_04	d2021_01_05	d2021_01_06	d2021_01_07	d2021_01_08
CHAR(1)	INTEGER	INTEGER	INTEGER	INTEGER	INTEGER	INTEGER	INTEGER	INTEGER
a	5	4		24	34	44	54
b	10	8	18		38	48
c	7	12	112	212		412	512
d	5	3	13	23	33		53
e	6	5	15	25	35	45

Tested on a MacBook Pro running macOS 10.14.6 Mojave (yes, antique), using IBM Informix Dynamic Server Version 12.10.FC6 (yes, also antique).