1

// EDIT //

Below is the original question, but my problem can be really simplified with one question... how can I get the below output without using user variables?!

SELECT ID, @cumu_return:=IF(id = 1, 1, @cumu_return + (@cumu_return * ret)) AS cumulative_return 
FROM (
    SELECT 1 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 2 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 3 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 4 AS ID, 1 AS num, 0.1 AS ret UNION ALL
    SELECT 5 AS ID, 1 AS num, 0.1 AS ret
) t

// END EDIT //

I have a table as follows...

CREATE TABLE `daily_return` (
  `id` int(11) NOT NULL,
  `list_id` int(11) NOT NULL,
  `last_updated` datetime NOT NULL DEFAULT '2000-01-01 00:00:00',
  `daily_return` float NOT NULL,
  `last_return` float NOT NULL, KEY (`id`)
)

'daily return' is a percentage change in the price for a list_id, calculated every day and the last_return is the first/starting return number.

I need to calculate a cumulative return metric using the following logic:

'Previous Day Return' + ('Previous Day Return' * 'Daily Return')

Hence, I have the following query...

INSERT INTO cmc_cumulative_return (list_id, last_updated, cumulative_return)
SELECT list_id, last_updated, cumulative_return FROM (
    SELECT id, list_id, last_updated, daily_return, last_return,
        @cumu_return:=IF(id = 1, last_return, @cumu_return + (@cumu_return * daily_return)) AS cumulative_return 
    FROM daily_return c
) t WHERE id <> 1;

When I run this inside a procedure, it throws a warning:

Setting user variables within expressions is deprecated and will be removed in a future release. Please set variables in separate statements instead.

I have seen some other similar questions on stackoverflow, but they were all simple increment calculations that can be replaced by ROW_NUMBER() OVER or SUM() OVER, but I am unable to figure out how to remove variables in my query above.

Originally, I was using 3-4 variables, but now I have got it down to one and request your help removing the last one.

EDIT 2

Query with actual data: (Gordon's calculation included)

SELECT id, list_id, last_updated, daily_return, last_return,
    @cumu_return:=IF(id = 1, last_return, @cumu_return + (@cumu_return * daily_return)) AS cumulative_return,
    (
        EXP(SUM(LN(1 + daily_return)) OVER (ORDER BY id)) / (1 + daily_return)
    ) as cumulative_return2
FROM (
    SELECT 1 AS id, 2 AS list_id, '2019-02-20' AS last_updated, 0 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 2 AS id, 2 AS list_id, '2019-02-21' AS last_updated, 0.0145999858 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 3 AS id, 2 AS list_id, '2019-02-22' AS last_updated, -0.0503679203 AS daily_return, 1.15 AS last_return UNION ALL
    SELECT 4 AS id, 2 AS list_id, '2019-02-23' AS last_updated, 0.0111594238 AS daily_return, 1.15 AS last_return
) t
Shahid Thaika
  • 2,133
  • 5
  • 23
  • 59

1 Answers1

1

You can use natural logs and exponentiation for the cumulative product calculation.

INSERT INTO cmc_cumulative_return (list_id, last_updated, cumulative_return)
    SELECT list_id, last_updated, 
           (MAX(CASE WHEN id = 1 THEN last_return END) OVER () *
            EXP(SUM(LN(1 + daily_return)) OVER (ORDER BY id)
               ) / (1 + daily_return)
           ) as cumulative_return
    FROM daily_return c;

If you really want to exclude where id = 1, then that requires an additional level of subqueries.

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • This query sort of works when last_return is equal to '1'. I added a second edit to the question with some actual data. Can you please adjust the formula to consider last_return? Basically, last_return is the cumulative total when the procedure was run last, and the procedure calculates cumulative totals further for new data. – Shahid Thaika Feb 23 '19 at 18:45
  • @ShahidThaika . . . Did you try the db<>fiddle? This should be doing the same thing as your query. – Gordon Linoff Feb 23 '19 at 18:47
  • I noticed that there was a comment in my query. Can you please try running the query you find in this fiddle in MySQL. You'll see the two columns give different values. It's cause my daily_return starts with a '0'. https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=8598b91eb09c94788ef67c8d3b202db8 – Shahid Thaika Feb 23 '19 at 20:03