3

I have two queries which I want to achieve.

I have a table like so...

Date        | Period | Location | Price
2017-01-01      1         A         10
2017-01-01      2         A         15
2017-01-01      1         B         15
2017-01-01      2         B         16

Each date has 48 readings (one every half hour).

Query 1: I wish to get the average prices for any given location within a date range in the following format:

e.g. Between 2017-01-01 and 2017-06-30, average price per period for location 'A'

  Period 1  | Period 2|  Period 3 ...
       10           11          15

Query 2: I wish to have the average price for any given location going back x number of months: (period does not matter)

Month   |  Average price
january        10
february       12
march          16

Any help would be greatly appreciated :)

Daryn
  • 1,551
  • 1
  • 15
  • 21

2 Answers2

1

The second query you listed is a very simple group by operation. The only thing that's slightly tricky about it is that you need to extract the month from the listed date.

select
    date_part('month', t.Date) as month,
    avg(t.Price) as average_price
from
    mytable t
group by
    date_part('month', t.Date)

The first query however is significantly more complicated. It involves the crosstab(), which you'll have to enable in your database if you haven't already. The general idea is to calculate the averages for each period, and then pivot the data much like you would in excel.

select * from crosstab(
'   select
        t.Period
        , avg(t.Price) as avg_price
    from
        mytable t
    group by
        t.Period
    order by
        1
        , 2
'
) as ct(
    "Period" text
    , "1"  int
    , "2"  int
    , "3"  int
    , "4"  int
    , "5"  int
    , "6"  int
    , "7"  int
    , "8"  int
    , "9"  int
    , "10" int
    , "11" int
    , "12" int
)

This answer has a great amount of information on Postgre's crosstab()

Doron Segal
  • 2,242
  • 23
  • 22
Jacobm001
  • 4,431
  • 4
  • 30
  • 51
  • Thanks for your reply. I get the following error though... ERROR: invalid source data SQL statement DETAIL: The provided SQL must return 3 columns: rowid, category, and values. ********** Error ********** ERROR: invalid source data SQL statement SQL state: 22023 Detail: The provided SQL must return 3 columns: rowid, category, and values. – Daryn Jul 17 '17 at 05:03
  • It won't be good if you have rows from different years because `date_part('month', t.Date)` will return the same. For example `date_part('month', '2017-01-01')` returns the same value as `date_part('month', '2018-01-01')`. `DATE_TRUNC('month', t.Date)` will be better in my opinion – TOUDIdel Jan 11 '23 at 12:21
0

First question:

You need a pivot. If you do not want to do this in a static way (name all period columns manually in the query) you could to it with a more dynamic way with JSON:

SQL Fiddle Query

Result:

{"Period 1": 15.0000000000000000, "Period 2": 22.5000000000000000, "Period 3": 100.0000000000000000, "Period 4": 150.0000000000000000}

You get a JSON string which can be used for further calculations, e.g.

SELECT json_pivot -> 'Period 2' FROM (/* subquery */)

see here: SQL Fiddle: Further calculation

With the JSON functions you can get a dynamic pseudo pivot result.

Query:

SELECT 
    jsonb_object_agg(period, avg) as pivot_json         -- B
FROM (
    SELECT 
        'Period ' || period as period,  AVG(price)      -- A
    FROM 
        period_prices pp
    WHERE 
        location = 'A' 
        AND date BETWEEN '2017-01-01' AND '2017-06-30'
    GROUP BY location, period
)s

A: Grouping by location and period, renaming the period number to "Period [number]"

B: jsonb_object_agg aggregates two columns to the shown result.

https://www.postgresql.org/docs/current/static/functions-json.html

https://www.postgresql.org/docs/current/static/functions-aggregate.html


Second question is a simple GROUP BY. SQL Fiddle

SELECT
    to_char(pp.date, 'month') as month,    
    avg(pp.price) as avg_price
FROM
    period_prices pp
GROUP BY
    to_char(pp.date, 'month')

You could group by date_part. But while date_part('month', ...) gives you the number of the month (which is the preferred solution for further calculations) you wanted to get the month's name in the end. For that you can use the to_char() function.

https://www.postgresql.org/docs/current/static/functions-formatting.html

S-Man
  • 22,521
  • 7
  • 40
  • 63