0

I have a SQLite database of transactions. Each has a date, a value, and a corresponding period. I'm looking to identify the date at which the gains for a period are erased. Here's a sample table.

period      date           value
----------  -------------  --------
1           2000-01-01     16.3
1           2000-01-02     1.5
1           2000-01-03     2.9
1           2000-01-11     32.8
2           2000-01-27     -53.5
2           2000-01-27     17.0
2           2000-01-27     94.4
2           2000-02-02     168.0
2           2000-02-05     24.7
2           2000-02-05     66.0
2           2000-02-08     60.5
2           2000-02-09     80.2
3           2000-02-22     7.3
3           2000-02-22     -510.9
3           2000-02-23     25.3
3           2000-02-25     4.2
3           2000-03-20     3.1

For period 1, the sum of period gains is 16.3 + 1.5 + 2.9 + 32.8 = 53.5. The negative value of -53.5 on 2000-01-27 completely erases these gains. Ideally, I'd like a new column that summarizes these corresponding dates. For example

period      date           value     erase_date
----------  -------------  --------  -------------
1           2000-01-01     16.3      2000-01-27
1           2000-01-02     1.5       2000-01-27
1           2000-01-03     2.9       2000-01-27
1           2000-01-11     32.8      2000-01-27
2           2000-01-27     -53.5
2           2000-01-27     17.0      2000-01-27
2           2000-01-27     94.4      2000-02-22
2           2000-02-02     168.0     2000-02-22
2           2000-02-09     80.2      2000-02-22
3           2000-02-22     7.3       2000-02-22
3           2000-02-22     -510.9
3           2000-02-23     25.3      2000-02-22
3           2000-02-25     4.2       2000-02-22
3           2000-03-20     3.1       2000-02-22

I've been able to get what I'm looking for by using CASE to isolate positives and negatives, then selecting a cumulative SUM over the positives ordered by the period column and a cumulative SUM over the negatives ordered by the date column, and finally selecting the minimum date where the magnitude of value of the summed negatives was greater than the magnitude of the summed positives. This was very slow for large tables (a couple thousand entries). I'm hoping that I'm just missing a pattern that is obvious to someone with more experience using SQL.

Here's an example of the code I'm trying

WITH view AS (
    SELECT 
        period, date, value
        SUM(positive) OVER (ORDER BY period) period_positive_sum,
        SUM(negative) OVER (ORDER BY date) daily_negative_sum
    FROM (
        SELECT
            period, date, value
            CASE WHEN value >= 0 THEN value END positive
            CASE WHEN value < 0 THEN value END negative
        FROM table
    )
)
SELECT
    v.period, v.date, v.value,
    (SELECT
        MIN(date)
    FROM view
    WHERE
        -daily_negative_sum >= v.period_positive_sum
    ) erase_date
FROM view AS v
mitchnegus
  • 85
  • 5

1 Answers1

0

Use the self-join from this answer to produce a cumulative sum.

select
  t1.date,
  t1.value,
  sum(t2.value) as gain
from transactions t1
inner join transactions t2 on t1.date >= t2.date
group by t1.date, t1.value
order by t1.date;

Then a subselect to find the next date which is <= 0.

with gains as (
    select t1.date, t1.value, sum(t2.value) as gain
    from transactions t1
    inner join transactions t2 on t1.date >= t2.date
    group by t1.date, t1.value
    order by t1.date
)
select
  g1.*,
  (
      select g2.date
      from gains g2
      where g2.date > g1.date
        and g2.gain <= 0
      order by date asc
      limit 1
  ) as erase_date
from gains g1
Schwern
  • 153,029
  • 25
  • 195
  • 336