Timestream - Pivot data per dimensions

Question

TimeStream stores data with a key:value approach.

Is there any simple way to pivot the data based on the dimensions to get for instance the max of all available measures in a column that is named as the measure name ?

Let take an example, with following dataset :

| time                              | instance_id   | measure_name          | measure_value::double     | measure_value::bigint     |
|-------------------------------    |-------------  |--------------------   |-----------------------    |-----------------------    |
| 2019-12-04 19:00:00.000000000     | A             | cpu_utilization       | 35                        | null                      |
| 2019-12-04 19:00:01.000000000     | A             | cpu_utilization       | 38.2                      | null                      |
| 2019-12-04 19:00:02.000000000     | B             | cpu_utilization       | 45.3                      | null                      |
| 2019-12-04 19:00:00.000000000     | A             | memory_utilization    | 54.9                      | null                      |
| 2019-12-04 19:00:01.000000000     | A             | memory_utilization    | 42.6                      | null                      |
| 2019-12-04 19:00:02.000000000     | B             | memory_utilization    | 33.3                      | null                      |

We want to construct a generic request that would provide the following result without needing any code transformation :

| instance_id   | cpu_utilization (max)     | memory_utilization (max)  |
|-------------  |-----------------------    |-------------------------- |
| A             | 38.2                      | 54.9                      |
| B             | 45.3                      | 33.3                      |

Of course, I know that by getting the following (see next table), with few code in my favorite language it is very easy to make the pivot, but I was wondering if this is possible on a native way.

| instance_id   | measure_name          | max(measure_value)    |
|-------------  |--------------------   |--------------------   |
| A             | cpu_utilization       | 38.2                  |
| B             | cpu_utilization       | 45.3                  |
| A             | memory_utilization    | 54.9                  |
| B             | memory_utilization    | 33.3                  |

Thank you

score 1 · Answer 1 · answered Dec 03 '20 at 10:32

1

Do a GROUP BY. Use case expressions to do conditional aggregation:

select instance_id, 
       max(case when measure_name = 'cpu_utilization' then measure_value end),
       max(case when measure_name = 'memory_utilization' then measure_value end)
from tablename
group by instance_id

answered Dec 03 '20 at 10:32

jarlh

42,561
8
45
63

Thank you. This is a quite efficient solution. However, I was looking for a more systematic one where I would not need to mention all the columns I want to pivot. – Florck Dec 03 '20 at 11:34
I understand. And a regular GROUP BY with one row per id/measure-type isn't enough? – jarlh Dec 03 '20 at 12:07
Indeed, yes, the best solution I guess will be to pivot through the code using the data. – Florck Dec 03 '20 at 12:17

Timestream - Pivot data per dimensions

1 Answers1