So we need to take one value for day X, take second value for day X - 7, and calculate %.
The query may look like:
SELECT a.timestamp,
a.cnt,
b.cnt cnt_minus_7_day,
round( 100.0 *( a.cnt - b.cnt ) / b.cnt , 2 ) change_7_days
from (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
) a
left join (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
) b
ON a.timestamp = b.timestamp - 7
;
You can also try another version - this one should be faster,
because it seems that postgresql is not smart enought and evaluates the same subquery twice,
instead of cashing results in memory or temp table.
WITH clause helps to avoid this (compare plans below).
with src as (
SELECT timestamp::date, COUNT(DISTINCT userid) cnt
FROM logs
GROUP BY timestamp::date
ORDER BY timestamp::date
)
SELECT a.timestamp,
a.cnt,
b.cnt cnt_minus_7_day,
round( 100.0 *( a.cnt - b.cnt ) / b.cnt , 2 ) change_7_days
FROM src a
left join src b
on a.timestamp = b.timestamp - 7
Here is a plan for the first query (running on my sample data):
"Hash Left Join (cost=5136.71..5350.93 rows=101 width=20) (actual time=77.778..88.676 rows=101 loops=1)"
" Hash Cond: (public.logs."timestamp" = (b."timestamp" - 7))"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=44.398..55.129 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=44.290..48.392 rows=27889 loops=1)"
" Sort Key: public.logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.037..10.396 rows=27889 loops=1)"
" -> Hash (cost=2673.32..2673.32 rows=101 width=12) (actual time=33.355..33.355 rows=101 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 5kB"
" -> Subquery Scan on b (cost=2462.13..2673.32 rows=101 width=12) (actual time=22.883..33.306 rows=101 loops=1)"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=22.881..33.288 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=22.817..26.507 rows=27889 loops=1)"
" Sort Key: public.logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.014..3.696 rows=27889 loops=1)"
"Total runtime: 100.360 ms"
and for second version:
"Hash Left Join (cost=2675.59..2680.64 rows=101 width=20) (actual time=60.612..60.785 rows=101 loops=1)"
" Hash Cond: (a."timestamp" = (b."timestamp" - 7))"
" CTE src"
" -> GroupAggregate (cost=2462.13..2672.31 rows=101 width=8) (actual time=46.498..60.425 rows=101 loops=1)"
" -> Sort (cost=2462.13..2531.85 rows=27889 width=8) (actual time=46.382..51.113 rows=27889 loops=1)"
" Sort Key: logs."timestamp""
" Sort Method: external merge Disk: 488kB"
" -> Seq Scan on logs (cost=0.00..402.89 rows=27889 width=8) (actual time=0.037..8.945 rows=27889 loops=1)"
" -> CTE Scan on src a (cost=0.00..2.02 rows=101 width=12) (actual time=46.504..46.518 rows=101 loops=1)"
" -> Hash (cost=2.02..2.02 rows=101 width=12) (actual time=14.084..14.084 rows=101 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 5kB"
" -> CTE Scan on src b (cost=0.00..2.02 rows=101 width=12) (actual time=0.002..14.033 rows=101 loops=1)"
"Total runtime: 67.799 ms"