
I have a MariaDB database.

Inside that DB, I have the following table, table1:

| id      | timestamp  | unit | detector | value |
|---------|------------|------|----------|-------|
| UUID()  | 2020-12-02 | 1    | 1        | 0.1   |
| UUID2() | 2020-12-02 | 1    | 2        | 0.2   |
| UUID3() | 2020-12-02 | 2    | 1        | 0.3   |
| UUID4() | 2020-12-02 | 2    | 2        | 0.4   |
| UUID5() | 2020-12-03 | 1    | 1        | 0.5   |
| UUID6() | 2020-12-03 | 1    | 2        | 0.6   |
| UUID7() | 2020-12-03 | 2    | 1        | 0.7   |
| UUID8() | 2020-12-03 | 2    | 2        | 0.8   |

I have been asked to map the data to this new table, table2

| id       | timestamp  | detector11 | detector12 | detector21 | detector22 |
|----------|------------|------------|------------|------------|------------|
| UUID9()  | 2020-12-02 | 0.1        | 0.2        | 0.3        | 0.4        |
| UUID10() | 2020-12-03 | 0.5        | 0.6        | 0.7        | 0.8        |

The only difference from this example is that I actually have 100 detector/unit combinations and 36 million rows. I have written code that gets the desired value for one detector, but I can't figure out a way to map multiple rows -> columns at the same time. There is no way I'm doing this manually; it would take weeks.

INSERT INTO table2
(id, timestamp, detector11)
SELECT UUID(), t1.timestamp, t1.value FROM table1 t1
WHERE t1.unit='1' AND t1.detector='1'
ORDER BY timestamp ;

This successfully translates the data from table1 where (detector=1, unit=1) into the column detector11 with the correct timestamp. However, all the other columns except id, timestamp, and detector11 are now NULL.
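
I know I could fill the remaining columns one at a time afterwards with an UPDATE joined on timestamp, roughly like the sketch below (untested), but with ~100 detector/unit combinations that is exactly the column-by-column manual work I want to avoid:

-- Sketch only: fill one more column after the first INSERT by matching
-- table2 rows to table1 rows on timestamp (here for unit=1, detector=2).
UPDATE table2 t2
JOIN table1 t1
  ON  t1.timestamp = t2.timestamp
  AND t1.unit = '1'
  AND t1.detector = '2'
SET t2.detector12 = t1.value;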

Ideally, someone could help me write something like this:

INSERT INTO table2
(id, timestamp, detector11, detector12, detector21, detector22)
SELECT UUID(), t1.timestamp, 
VALUES(t1.value FROM table1 t1
       WHERE t1.unit='1' AND t1.detector='1'
       ORDER BY timestamp,

       t1.value FROM table1 t1
       WHERE t1.unit='1' AND t1.detector='2'
       ORDER BY timestamp,

       t1.value FROM table1 t1
       WHERE t1.unit='2' AND t1.detector='1'
       ORDER BY timestamp,

       t1.value FROM table1 t1
       WHERE t1.unit='2' AND t1.detector='2'
       ORDER BY timestamp) ;

Which would fill all the columns at the same time.

PyThagoras

2 Answers


The only way to do this is to join separate derived tables, one per unit/detector combination.

The tricky part is joining the values: you have to check that the ON condition for each join actually matches your rows correctly.

CREATE TABLE table2 (id VARCHAR(36), `timestamp` TIMESTAMP, detector11 DECIMAL(4,2), detector12 DECIMAL(4,2),
  detector21 DECIMAL(4,2), detector22 DECIMAL(4,2));
CREATE TABLE table1 (`timestamp` TIMESTAMP, value DECIMAL(4,2), unit INT, detector INT);
INSERT INTO table2
(id, timestamp, detector11, detector12, detector21, detector22)
SELECT UUID(),t1a.timestamp, t1a.detector11,t1b.detector12,t1c.detector21,t1d.detector22
FROM
(SELECT
t1.timestamp, t1.value as detector11 FROM table1 t1
WHERE t1.unit='1' AND t1.detector='1'
ORDER BY timestamp ) t1a
JOIN
  (SELECT t1.timestamp, t1.value AS detector12 FROM table1 t1
       WHERE t1.unit='1' AND t1.detector='2'
       ORDER BY timestamp) t1b
       ON t1a.timestamp = t1b.timestamp
JOIN       
  (SELECT t1.timestamp,t1.value AS detector21 FROM table1 t1
       WHERE t1.unit='2' AND t1.detector='1'
       ORDER BY timestamp) t1c
       ON t1a.timestamp = t1c.timestamp
JOIN
  (SELECT t1.timestamp,t1.value AS detector22 FROM table1 t1
       WHERE t1.unit='2' AND t1.detector='2'
       ORDER BY timestamp) t1d
  ON t1a.timestamp = t1d.timestamp;

db<>fiddle here
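
A side note beyond the original answer: with 36 million rows, each derived table filters on unit and detector and then joins on timestamp, so a composite index on those columns should help considerably (a sketch, assuming table1 has no such index yet):

-- Assumption: table1 is not yet indexed for this access pattern. This index
-- lets each subquery filter on (unit, detector) and join on timestamp without
-- scanning the whole table; including value makes it a covering index.
CREATE INDEX idx_unit_detector_ts ON table1 (`unit`, `detector`, `timestamp`, `value`);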

We can try another approach, but I have also never used this with hundreds of columns, so you will have to test it yourself:

CREATE TABLE table1
    (`id` varchar(7), `timestamp` varchar(10), `unit` int, `detector` int, `value` DECIMAL(10,1))
;
    
INSERT INTO table1
    (`id`, `timestamp`, `unit`, `detector`, `value`)
VALUES
    ('UUID()', '2020-12-02', 1, 1, 0.1),
    ('UUID2()', '2020-12-02', 1, 2, 0.2),
    ('UUID3()', '2020-12-02', 2, 1, 0.3),
    ('UUID4()', '2020-12-02', 2, 2, 0.4),
    ('UUID5()', '2020-12-03', 1, 1, 0.5),
    ('UUID6()', '2020-12-03', 1, 2, 0.6),
    ('UUID7()', '2020-12-03', 2, 1, 0.7),
    ('UUID8()', '2020-12-03', 2, 2, 0.8)
;
-- Allow a long generated column list; the default group_concat_max_len is small.
SET SESSION group_concat_max_len = 4294967295;
SET @sql = NULL;

-- Build one "MAX(CASE ...) AS DetectorXY" expression per distinct unit/detector pair.
SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      "MAX(CASE WHEN `unit` = ", `unit`,
      " AND `detector` = ", `detector`,
      " THEN `value` ELSE 0 END) AS 'Detector", `unit`, `detector`, "'"
    )
  ) INTO @sql
FROM `table1`;

-- Wrap the generated expressions in the final pivot query and run it.
SET @sql = CONCAT("SELECT uuid(), `timestamp`, ", @sql, " FROM table1 GROUP BY `timestamp`");

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

uuid()                               | timestamp  | Detector11 | Detector12 | Detector21 | Detector22
:----------------------------------- | :--------- | ---------: | ---------: | ---------: | ---------:
91b7521b-7298-11eb-a7c5-1f5a66f289d6 | 2020-12-02 |        0.1 |        0.2 |        0.3 |        0.4
91b752ed-7298-11eb-a7c5-1f5a66f289d6 | 2020-12-03 |        0.5 |        0.6 |        0.7 |        0.8

db<>fiddle here
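
If the goal is to populate table2 directly rather than just display the pivoted rows, the final SET @sql step above can build an INSERT ... SELECT instead. This is a sketch I have not tested at this scale; it assumes table2's columns (id, timestamp, detector11, ...) are in the same order as the generated expressions:

-- Replace the "SELECT uuid(), ..." CONCAT step with an INSERT wrapper so the
-- pivoted rows land in table2 (assumes matching column count and order).
SET @sql = CONCAT(
  "INSERT INTO table2 SELECT uuid(), `timestamp`, ", @sql,
  " FROM table1 GROUP BY `timestamp`");
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;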

nbk
  • Will this work for "hundreds of detector and unit numbers"? – June7 Feb 16 '21 at 19:57
  • Yes, it will work, and you can program a loop that creates the query – nbk Feb 16 '21 at 20:07
  • Of course MySQL has some limitations, but hundreds of detectors is no problem; thousands would be, as MySQL supports only 4096 columns – nbk Feb 16 '21 at 20:28
  • The answer is probably good. However, running this for 2 columns I had to stop it after 1 hour 45 minutes; it seemed like it wouldn't really get there. At this point, for hundreds of columns, I might as well do it manually, won't it take the same time... – PyThagoras Feb 18 '21 at 13:54
  • For how many rows? Bulk inserts always take time; there is a very good article on the MySQL site about bulk inserts with many insights – nbk Feb 18 '21 at 13:58
  • OK, this is weird: for 1 column it took 4 minutes. I guess the scalability isn't linear then. I'll just run the query hundreds of times for different rows and I should get there in less than a day. But now that a UUID and timestamp have already been set, the query will have to be different. – PyThagoras Feb 18 '21 at 14:04
  • Yes, you must find the joining part yourself; for that you only have the data and must see the pattern – nbk Feb 18 '21 at 14:09
  • I'll accept this, but be aware that there is a scalability issue (it is time-consuming) for very large datasets – PyThagoras Feb 18 '21 at 14:51
  • Sorry to say that this technique did not work. MariaDB only allows 61 joins or fewer in one query, which unfortunately makes this answer not the right choice. Also, I tried running it with 60 joins and it has now been running for 24 hours. So I really wouldn't recommend joins for scalability reasons. – PyThagoras Feb 19 '21 at 00:38
  • @PyThagoras you can try my other approach; if that also doesn't work, you have to do the pivoting in another language that has fewer limitations. Also, group_concat_max_len is set to the max of a 32-bit system; today everything is 64-bit, so you can increase that number – nbk Feb 19 '21 at 09:59

This is de-normalizing data. Looks like a simple CROSSTAB.

Consider:

TRANSFORM First(Table1.value) AS FirstOfvalue
SELECT Table1.timestamp
FROM Table1
GROUP BY Table1.timestamp
PIVOT "detector " & [unit] & [detector];

If you want to make sure fields are in numerical order, something like:
PIVOT "detector " & Format([unit], "000") & Format([detector], "000");

Use that query as the source for a SELECT INTO:
SELECT Query1.* INTO Test FROM Query1
The unique identifier field will have to be added afterward. Or just export the query instead of saving it to a table.

No idea how performance will be affected by "millions" of rows.

Apparently a MySQL table is limited to 4096 columns. Hopefully your output will not exceed that.

Well, I did some research, which indicates PIVOT is not available in MySQL/MariaDB, so you will have to program a loop that builds the query. Sorry! PIVOT works in Access, SQL Server, and Oracle.
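
For reference, the closest MariaDB equivalent of that crosstab is plain conditional aggregation, which is essentially what the generated statement in the other answer produces. A hand-written sketch for the four example columns:

-- Static version of the pivot for the four example unit/detector pairs.
-- For ~100 combinations these MAX(CASE ...) lines would be generated in a loop.
SELECT UUID() AS id, `timestamp`,
       MAX(CASE WHEN unit = 1 AND detector = 1 THEN value END) AS detector11,
       MAX(CASE WHEN unit = 1 AND detector = 2 THEN value END) AS detector12,
       MAX(CASE WHEN unit = 2 AND detector = 1 THEN value END) AS detector21,
       MAX(CASE WHEN unit = 2 AND detector = 2 THEN value END) AS detector22
FROM table1
GROUP BY `timestamp`;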

June7
  • I want to try your approach first because it doesn't require a 200-line query. Is PIVOT supposed to be a command, or is this just conceptual code? I can't seem to find "PIVOT"; I suppose it was pseudocode, I'll try to read up on pivots. – PyThagoras Feb 17 '21 at 23:51
  • It's an actual query that works in Access and SQL Server. Sorry, it looks like it's not available in MySQL, and you will have to do what nbk said: "program a loop that creates the query". Review https://codingsight.com/pivot-tables-in-mysql/. I should probably delete this answer. – June7 Feb 18 '21 at 00:14
  • Don't delete it. It might be useful for those who use SQL Server – PyThagoras Feb 18 '21 at 14:53