0

MySLQ novice here. I am trying to write a query to pivot a large table of weather data. By that I mean the following. I have a table "weatherData" with three columns: 1. "timeStamp", 2. "stationID" (ID of different weather stations) and 3. "temperature" (temperature measured by respective station at respective time). I would like my query to return as first column "timeStamp" and then n columns named like the station IDs containing the measured temperatures (n is the number of stations there are). There is a large number of stations (n is around 4000) and the temperatures are recorded in 5 min. intervals over 2.5 years, so "weatherData" has around 7 million rows.

The first thing I tried was generating the columns "manually":

SELECT 
  timeStamp,
  SUM(CASE WHEN stationID=1253 THEN temperature ELSE 0 END) AS '1253',
  SUM(CASE WHEN stationID=1254 THEN temperature ELSE 0 END) AS '1254',
  SUM(CASE WHEN stationID=1255 THEN temperature ELSE 0 END) AS '1255',
  SUM(CASE WHEN stationID=1256 THEN temperature ELSE 0 END) AS '1256',
  SUM(CASE WHEN stationID=1257 THEN temperature ELSE 0 END) AS '1257',
  SUM(CASE WHEN stationID=1258 THEN temperature ELSE 0 END) AS '1258'
FROM weatherData
GROUP BY timeStamp

This works as expected but adding all stations would make it a long query. I then tried to dynamically generate the query as follow:

SET @query = NULL;
SELECT GROUP_CONCAT(
  DISTINCT CONCAT( 
    'SUM(',
      'CASE WHEN stationID = ',stationID,' THEN temperature ELSE 0 END',
    ') AS "',stationID,'"'  
  )      
)

INTO @query 
FROM weatherData;

SET @query  = CONCAT(
  'SELECT temperature, ',@query,
  'FROM weatherData',
  'GROUP BY timeStamp'  
);

PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

This gives me the following error:

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'weatherData GROUP BY timeStamp' at line 1

What's odd is that when I generate a smaller table which only contains the data of up to 14 weather stations from weatherData and run the above "dynamic" query on this smaller table it works perfectly fine. So, it only doesn't work when I include data from more than 14 stations and it does not depend on which particular stations are included. The "manual" query always works no matter how many stations are included. What is the issue here?

The second strange thing is that the first query takes significantly longer when including (the same) 14 stations, I would have thought these queries are equivalent. I am using the latest free version "Toad for MySQL".

salvador
  • 99
  • 1
  • 2
  • 10

1 Answers1

0

As for now you have a problem with missing spaces. Add them before FROM and GROUP BY clauses.

SET @query  = CONCAT(
  'SELECT temperature, ',@query,
  ' FROM weatherData', -- added space
  ' GROUP BY timeStamp' -- added space
);

Though, if your GROUP_CONCAT would grow big enough (> 1024 characters) and your setting is the default one, it will truncate your @query variable and may cause error with your query. To inspect if that's the case - when your dynamic grows bigger, add a SELECT @query for debugging purposes.


I believe that my recent answer is going to resolve your problems. I've covered problems with GROUP_CONCAT max length setting there as well as debugging dynamic query by investigating the string before it executes.

To see it click here: MySQL Dynamic Pivot Query with over 3000 columns

Community
  • 1
  • 1
Kamil Gosciminski
  • 16,547
  • 8
  • 49
  • 72
  • I actually have the spaces in my query, for some reason I had deleted them in my post. Adding the line SET SESSION group_concat_max_len = 1000000; solves my original problem but now I get a "too many columns" error. I get that no table should have this many columns but why shouldn't a query be able to return this? – salvador Apr 08 '16 at 17:49