I've got the following database table:
date a b c d add result
23.02.07 A B C1 D1 1 1
24.02.07 A B C1 D1 0 1
25.02.07 A B C1 D1 1 2
26.02.07 A B C1 D1 1 3
27.02.07 A B C1 D1 1 4
28.02.07 A B C1 D1 0 4
01.03.07 A B C1 D1 0 4
02.03.07 A B C1 D1 0 4
03.03.07 A B C1 D1 1 5
04.03.07 A B C1 D1 0 5
05.03.07 A B C1 D1 0 5
06.03.07 A B C1 D1 0 5
07.03.07 A B C1 D1 2 7
17.02.07 A B C2 D2 1 1
18.02.07 A B C2 D2 0 1
19.02.07 A B C2 D2 0 1
20.02.07 A B C2 D2 0 1
21.02.07 A B C2 D2 0 1
22.02.07 A B C2 D2 0 1
23.02.07 A B C2 D2 0 1
24.02.07 A B C2 D2 0 1
25.02.07 A B C2 D2 1 2
26.02.07 A B C2 D2 3 5
27.02.07 A B C2 D2 1 6
28.02.07 A B C2 D2 0 6
The column result (last one) is not part of the actual dataset. This column is showcasing what I am trying to achieve. Basically I am summin' up all previous values of "add" and the current one of a given partition by using a analytical function like below:
SUM(add) OVER(PARTITION BY
A,
B,
C,
D,
ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS result
This approach works but is slow when querying large collections of data.
Some more insights:
- There is a btree index on date (is not a date datatype but a character varying).
- All other are character varying instead of add which is an integer.
- A and B have a btree index as well.
Is there better / more performant way to do it?