I don't know much at all about SQL, I've just toyed with it here and there through the years but never really 'used' it.
I'm trying to get a list of prices / volumes and aggregate them:
CREATE TABLE IF NOT EXISTS test (
ts timestamp without time zone NOT NULL,
price decimal NOT NULL,
volume decimal NOT NULL
);
what I'd like is to extract:
- min price
- max price
- sum volume
- sum (price * volume) / sum (volume)
By 1h slices
If I forget about the last line for now, I have:
SELECT MIN(price) min_price, MAX(price) max_price, SUM(volume) sum_vol, date_trunc('hour', ts) ts_group FROM test
GROUP BY ts_group;
My understanding is that 'GROUP BY ts_group' will calculate ts_group, build groups of rows and then apply the MIN / MAX / SUM functions after. Since the syntax doesn't make any sense to me (entries on the select line would be treated differently while being declared together vs. building groups and then declaring operations on the groups), I could be dramatically wrong here.
But that will not return the min_price, max_price and sum_vol results after the grouping; I get ts, price and volume in the results.
If I remove the GROUP BY line to try to see all the output, I get the error:
column "test.ts" must appear in the GROUP BY clause or be used in an aggregate function
Which I don't really understand either...
I looked at: must appear in the GROUP BY clause or be used in an aggregate function but I don't really get it and I looked at the doc (https://www.postgresqltutorial.com/postgresql-group-by/) which shows working example, but doesn't really clarify what is wrong with what I'm trying to do here.
While I'd be happy to have a working solution, I'm more looking from an explanation, or pointers toward good resources, that would allow me to understand this.
I have this working solution:
SELECT MIN(price) min_price, MAX(price) max_price, SUM(volume) sum_vol, (SUM(price * volume)/SUM(volume)) vwap FROM test
GROUP BY date_trunc('hour', ts);
but I still don't understand the error message from my question