1

Hei everyone:)

I am trying to create a pivot table locally on mysql. I tried to implement the solutions from here however I was not able to implemnt it so far. I also watched and did a bunch of tutorials but with the time I only got more confused... I already countless hours respective days on it and I am getting now a bit desperat... So I was wondering if you might help what I do wrong respectively how that I can create my pivot table...

I have currently the following table "act_gen_raw" from where I want to take the data from and two control tables "plant_information" and "dt". "plant_information" stores the distinct plant names (PowerSystemResourceName) and "dt" the distinct datetimes which I have collected from the "act_gen_raw" to control if all data are as they should be.

The "act_gen_raw" tables looks something like this:

Datetime              PowerSystemResourceName   ActualGenerationOutput
2014-12-10 20:00:00   Stalon G1                 29.2
2014-12-11 01:00:00   Miranda - G3              0
2014-12-10 20:00:00   CSP_GT1                   0.1
2014-12-11 01:00:00   Stalon G1                 0
2014-12-11 01:00:00   CSP_GT1                   120.5
2014-12-10 20:00:00   Miranda - G3              139.0

I put also some datasamples of my tables as mysqldumps here. I would like to achieve the following table:

Datetime              Stalon G1  Miranda - G3  CSP_GT1
2014-12-10 20:00:00   29.2       139.0         0.1
2014-12-11 01:00:00   0          0             120.5

I have however 2153 distinct plant names (PowerSystemResourceName) and 243030 distinct Datetimes variables. How can I now write a mysql querie such that I can create my desired table without manually write each column name?

If possible I would also like to make sure that all that all plant names (PowerSystemResourceName) matches there counterpart in the table "plant_information" and each datetime matches one in the "dt" table.

Best, David

UPDATE

I currently trying to solve this problem with the following script where I put the name for the first step manually:

drop table if exists plants_production;  
CREATE TABLE plants_production AS(    
SELECT 
tab2.Date_and_Time,
CASE
    WHEN PlantName = 'Stalon G1' THEN gen
END AS 'Stalon G1',
CASE
    WHEN PlantName = 'Miranda - G3' THEN gen
END AS 'Miranda - G3',
CASE
    WHEN PlantName = 'CSP_GT1' THEN gen
END AS 'CSP_GT1'
FROM    (SELECT Date_and_Time, gen, PlantName FROM (
          SELECT 
            dt.datetime_variable AS Date_and_Time, 
            act_gen_raw_test.ActualGenerationOutput AS gen,
            act_gen_raw_test.PowerSystemResourceName AS PlantName
          FROM 
            dt, act_gen_raw_test
          ) tab1
    WHERE tab1.PlantName IN (SELECT 
      plant_information.PowerSystemResourceName 
      FROM plant_information)
    ) tab2);

However I get the output in this form:

Datetime              Stalon G1  Miranda - G3  CSP_GT1
2014-12-06 23:00:00              0.1
2014-12-06 23:00:00   121   
2014-12-06 23:00:00                            59.5 

How can I make sure that I only get one row for each Datetime?

UPDATE 2- dynamic query does not run I tried to implement the query according to the proposed example:

drop table if exists plants_production;  
CREATE TABLE plants_production AS( 
SELECT GROUP_CONCAT(DISTINCT
act_gen_raw_test.PowerSystemResourceName)
INTO @sql
FROM act_gen_raw_test;
SET @sql = CONCAT('
SELECT 
tab2.Date_and_Time, gen ', @sql,'
FROM    (
    SELECT Date_and_Time, gen, PlantName FROM (
        SELECT 
            dt.datetime_variable AS Date_and_Time, 
            act_gen_raw_test.ActualGenerationOutput AS gen,
            act_gen_raw_test.PowerSystemResourceName AS PlantName
        FROM 
            dt, act_gen_raw_test
        ) tab1
    WHERE tab1.PlantName IN (SELECT plant_information.PowerSystemResourceName FROM plant_information)
    ) tab2'
    #end my select query
 )
);
#prepares statment for execution
PREPARE stmt FROM @sql;
#executes statmen
EXECUTE stmt;
#deallocate statment
DEALLOCATE PREPARE stmt;

I am however unable to run it due to the following error when I try to run PREPARE stmt FROM @sql;:

Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'NULL' at line 1
fidu13
  • 37
  • 7
  • 1
    Use stored procedure with dynamic SQL. – Akina Mar 12 '22 at 10:41
  • 1
    Try e.g. [MySQL pivot table query with dynamic columns](https://stackoverflow.com/q/12598120). Generally though: don't do it. Find some other representation of your data and/or let your application do it. – Solarflare Mar 12 '22 at 10:45
  • This kind of [tag:pivot] operation is a notorious pain in the axx neck in MySQL, unfortunately. Many developers use a host language (c#, java, php) to do this work, out of a desire to get it done and tested. Plus, you'll need to use a host language to make sense of a resultset with thousands of columns. – O. Jones Mar 12 '22 at 11:06
  • @Akina; @ Solarflare Thanks very much for the suggestion. I was however to implement it correctly... Do you know what I have done wrong? – fidu13 Mar 12 '22 at 13:04
  • @O.Jones My end goal is to generate a csv file which is then used for further analizations. As I am still new to this entire thematic, how what did you mean with the use of a host language? Is that something simliar as pands in python? And would that not make the entire query much slower? – fidu13 Mar 12 '22 at 13:07
  • Yes, use Python to read your table. Construct your pivoted result as an array or a dataframe. Then write that to your csv file. Doing it this way will likely be quite a bit faster than pivoting with a stored procedure. – O. Jones Mar 12 '22 at 13:37
  • You can probably adapt the stored proc in http://mysql.rjweb.org/doc.php/pivot – Rick James Mar 12 '22 at 15:05

1 Answers1

0

You can not use prepared statement outside a routine, you must create a procedure to to do so.

You can create a procedure that will create a query with values as columns (it's dirty but it works)

delimiter $$ 

DROP PROCEDURE IF EXISTS DoTheMagic;
CREATE PROCEDURE DoTheMagic()
BEGIN
  DECLARE `values` TEXT DEFAULT '';
  DECLARE `tables` TEXT DEFAULT '';
  DECLARE `request` TEXT DEFAULT '';
  DECLARE `name` TEXT;
  DECLARE `index` INT UNSIGNED DEFAULT 0;

  DECLARE `done` INT DEFAULT FALSE;
  DECLARE `columns` CURSOR FOR SELECT PowerSystemResourceName FROM act_gen_raw GROUP BY PowerSystemResourceName;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET `done` = TRUE;

  OPEN `columns`;

  columns_loop: LOOP
    FETCH `columns` INTO `name`;

    IF `done` THEN
      LEAVE columns_loop;
    END IF;

    SET `index` = `index` + 1;

    SET `values` = CONCAT(`values`, 'IFNULL(table_', `index`, '.ActualGenerationOutput, 0) AS ', QUOTE(`name`), ',');

    SET `tables` = CONCAT(`tables`, ' LEFT JOIN act_gen_raw AS table_', `index`);
    SET `tables` = CONCAT(`tables`, ' ON master.DateTime = table_', `index`, '.DateTime');
    SET `tables` = CONCAT(`tables`, ' AND table_', `index`, '.PowerSystemResourceName = ', QUOTE(`name`));
  END LOOP;

  SET `request` = CONCAT(`request`, ' SELECT master.DateTime,', TRIM(TRAILING ',' FROM `values`));
  SET `request` = CONCAT(`request`, ' FROM act_gen_raw AS master ', `tables`);

  PREPARE stmt FROM `request`;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;
END $$

delimiter ;

CALL DoTheMagic();

I have with that

+-------------------------+---------+--------------+-----------+
| DateTime                | CSP_GT1 | Miranda - G3 | Stalon G1 |
+-------------------------+---------+--------------+-----------+
| 2014-12-10 20:00:00.000 |       0 |            0 |     118.9 |
| 2014-12-10 21:00:00.000 |       0 |            0 |     119.5 |
| 2014-12-10 22:00:00.000 |       0 |            0 |     120.5 |
| 2014-12-10 23:00:00.000 |       0 |            0 |      29.2 |
| 2014-12-11 00:00:00.000 |       0 |            0 |         0 |
| 2014-12-20 07:00:00.000 |       0 |         21.4 |         0 |
| 2014-12-20 08:00:00.000 |       0 |           50 |         0 |
| 2014-12-20 09:00:00.000 |       0 |         59.4 |         0 |
| 2014-12-28 05:00:00.000 |     0.1 |            0 |         0 |
+-------------------------+---------+--------------+-----------+

CURSOR allow to loop over a query, I use it to get the values you want as columns.

For every value, I add a new join, with the same table, using the date and the value to filter data and using an increment as alias. Then, I select the data from the freshly joint table.

At the end, I simply create the full query and run it.

Joel
  • 1,187
  • 1
  • 6
  • 15
  • Thanks very much this looks amazing! I tried to run it but I got the following error, do you know how I could resolve it? Error Code: 2014 Commands out of sync; you can't run this command now I tried to run the script on mysql workbench. – fidu13 Mar 14 '22 at 10:20
  • I tried it on a MariaDB docker image without pb, I will try on a MySQL docker image but what MySQL version do you have ? – Joel Mar 14 '22 at 10:48
  • I have the MySQL Workbench 8.0 version – fidu13 Mar 15 '22 at 12:01