0

I have the following schema representing financial trades:

instrument     VARCHAR                     NOT NULL,
ts             TIMESTAMP WITHOUT TIME ZONE NOT NULL,
quantity       FLOAT8                      NOT NULL,
price          FLOAT8                      NOT NULL,
direction      INTEGER                     NOT NULL

And I have the following query to aggregate it into candles of 1min each:

SELECT
    date_trunc('minute', ts) ts,
    instrument,
    (array_agg(price order by ts))[1] open,
    MAX(price) high,
    MIN(price) low,
    (array_agg(price order by ts))[array_upper((array_agg(price order by ts)), 1)] close,
    (SUM(price * price * quantity) / SUM(price * quantity)) midpoint,
    SUM(price * quantity) volume,
    SUM(CASE WHEN direction = 1 THEN price * quantity else 0 END) volume_taker_buy,
    count(*) trade_count,
FROM {exchangeName}.{tableName}
WHERE instrument = '{instrument.Ticker}' AND ts BETWEEN '{fromTime}' AND '{toTime}'
GROUP BY date_trunc('minute', ts), instrument
ORDER BY ts

I think the details of the calculations are not only simple but they're not relevant to the question itself as the issue is more about getting data from the right time range. If something is not clear, please let me know.

How could I modify the query to allow arbitrary aggregation timeframes? for example group the data by 5seconds, 30 seconds, 1m, by 5min, by 15min, etc?

The goal is to use the query by requesting data from 2022-1-1 00:00:00 to 2022-1-1 01:00:00 by slices of 5 seconds, for example.


edit: here is some sample data

INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:19.119000', 0.001, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:19.018000', 0.008, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:18.816000', 0.092, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:18.442000', 0.002, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:21.651000', 0.1, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:21.380000', 0.002, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:21.039000', 0.114, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:20.684000', 0.002, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:20.538000', 0.111, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:20.525000', 0.012, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:20.405000', 0.058, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:20.266000', 0.001, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:19.885000', 0.111, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:19.581000', 0.001, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:22.660000', 0.001, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:22.486000', 0.002, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:23.626000', 0.001, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:23.503000', 0.001, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:23.374000', 0.006, 42969, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:06:23.271000', 0.123, 42968.99, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.099000', 0.001, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.022000', 0.001, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.783000', 0.088, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.564000', 0.023, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.521000', 0.009, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.462000', 0.065, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.251000', 0.005, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:28.059000', 0.605, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:27.864000', 1.346, 42970, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:27.692000', 0.002, 42970.01, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:27.361000', 0.001, 42970.01, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:27.189000', 0.005, 42970.01, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:27.074000', 0.043, 42970.01, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:26.966000', 0.862, 42970.01, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.002, 42965.85, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.056, 42966.01, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.199, 42966.02, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.076, 42966.69, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.081, 42966.8, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.08, 42966.86, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.001, 42968, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.328000', 0.626, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.222000', 0.002, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:30.044000', 0.019, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.835000', 0.06, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.496000', 0.001, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.202000', 0.045, 42968.92, -1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:29.142000', 0.216, 42968.93, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:31.243000', 0.083, 42968.56, 1);
INSERT INTO trades (instrument, ts, quantity, price, direction) VALUES ('BTCUSDT', '2022-01-15 12:07:31.038000', 0.002, 42968.56, 1);

Thomas
  • 10,933
  • 14
  • 65
  • 136
  • I have added some sample data here: https://drive.google.com/file/d/14LO0w0NOmrzk2uNUC2MmcJ0tHTYNTuP8/view?usp=sharing (sql file, about 40mb) – Thomas Jan 15 '22 at 18:37
  • 1
    Lol, it's a long list of updates. That doesn't count as sample data. – LukStorms Jan 15 '22 at 18:42
  • And it's off site. Which is also a no-no-never... – sticky bit Jan 15 '22 at 18:49
  • I can export a CSV, but how can I put this amount of data here? I can't put a few mb of text on the thread; how do people post data for others to use then? I usually just post code – Thomas Jan 15 '22 at 18:50
  • You're supposed to deliver a [example], not the real thing. Usually there's no need of a few MB as **sample** data. Some rows that cover the possible logical constellations are enough. You can also shrink down your query and the table to the important part. It should be runable though, test that before you post it. Yes, it's work to present a question in the right way. – sticky bit Jan 15 '22 at 18:55
  • I added sample data in the question, maybe that's better that way; this data should cover cases with low timeframes, like 5sec, 1min, etc – Thomas Jan 15 '22 at 18:58
  • To truncate by 5 seconds : `date_trunc('min', ts) + date_part('sec', ts)::int / 5 * interval '5 sec'` – LukStorms Jan 15 '22 at 19:26
  • @LukStorms, I've to pass the tuncation timeframe as a parameter, which needs to accomodate all kind of ranges, from seconds to hours – Thomas Jan 15 '22 at 19:32

1 Answers1

0

You can implement a function that will return a time. You can extract the hour, minute and so on. Then you can convert the year, month, day, hour and minute into a groupable value, that is, a text or a number (a number is to be preferred). This automatically solves it for minute-by-minute group. However, if you need to group by n minutes, assuming that 60 (the number of minutes in an hour) is divisible with the group length, you could apply the following formula to find out what group a record is in:

extract(year from timestamp <your value here>) * 10000 + extract(hour from timestamp <your value here>) * 100 + (extract(minute from timestamp <your value here>)) - MOD(extract(minute from timestamp <your value here>), 15))

Lajos Arpad
  • 64,414
  • 37
  • 100
  • 175
  • 1
    Hmm, a binary expression... That'll give them maximally three groups for `NULL`, `false` and `true`. I might misunderstand the question as it's still not a 100% clear to me, but I have doubt this is what they had in mind... – sticky bit Jan 15 '22 at 18:59
  • let me try to clarify: this query will aggregate data per minute; so if you pass data spanning 15 minutes, you get 15 rows out. Each row is made of a varying number of rows contained in each minute. What I'm trying to achieve is to make 'one minute' flexible; for example, pass 15 min of data, but aggregate it by blocks of 10 seconds and then get 90 rows out – Thomas Jan 15 '22 at 19:02
  • @stickybit you are correct, I mistyped my answer. Will edit it soon. Thanks for pointing it out! – Lajos Arpad Jan 15 '22 at 19:29
  • @stickybit rephrased the answer accordingly. – Lajos Arpad Jan 15 '22 at 19:42
  • @Thomas I have edited my answer – Lajos Arpad Jan 15 '22 at 19:43
  • @LajosArpad, I've implemented that: to_timestamp((extract(epoch FROM ts)::bigint / 10) * 10)::timestamp for 10sec for example; is there a performance benefit to the one you've proposed? – Thomas Jan 15 '22 at 19:55
  • @Thomas I did not measure the compared performances. If your solution works properly, then you may want to write your own answer and accept it when stackoverflow.com lets you to do so. – Lajos Arpad Jan 15 '22 at 22:17
  • @LajosArpad, it works, but I'll compare performance and retain the faster one as the answer. – Thomas Jan 15 '22 at 22:18
  • @Thomas okay. My suggestion is mainly an idea at this point, so if it does not work properly, then let me know. – Lajos Arpad Jan 15 '22 at 22:20