-2

I am stuck on a MySQL problem. I am trying to calculate the return series of a portfolio using:

 for(i = startdate+1; i <= enddate; i++) {
   return[i]=0;
   for(n = 0;  n < count(instruments); n++) {
     return[i] += price[i,n] / price[i-1, n] * weight[n];
   }
 }

So, the return of portfolio today is calculated as a sum of price_today/price_yesterday*weight over the instruments in the portfolio.

I created a scribble at http://rextester.com/FUC35243.

If it doesn't work, the code is:

DROP TABLE IF EXISTS x_ports;
DROP TABLE IF EXISTS x_weights;
DROP TABLE IF EXISTS x_prices;

CREATE TABLE IF NOT EXISTS x_ports (id INT NOT NULL AUTO_INCREMENT, name VARCHAR(20), PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_weights (id INT NOT NULL AUTO_INCREMENT, port_id INT, inst_id INT, weight DOUBLE, PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_prices (id INT NOT NULL AUTO_INCREMENT, inst_id INT, trade_date DATE, price DOUBLE, PRIMARY KEY (id));

INSERT INTO x_ports (name) VALUES ('PORT A');
INSERT INTO x_ports (name) VALUES ('PORT B');

INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 1, 20.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 2, 80.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (2, 1, 100.0);

INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-01', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-02', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-03', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-04', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-05', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-06', 1.14);

INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-01', 50.23);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-02', 50.45);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-03', 50.30);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-04', 50.29);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-05', 50.40);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-06', 50.66);


# GETTING THE DATES

SET @DtShort='2018-01-01';
SET @DtLong=@DtShort;

SELECT
    @DtShort:=@DtLong as date_prev,
    @DtLong:=dt.trade_date as date_curent
FROM
    (SELECT DISTINCT trade_date FROM x_prices ORDER BY trade_date) dt;


# GETTING RETURN FOR SINGLE DAY

SET @DtToday='2018-01-03';
SET @DtYesterday='2018-01-02';

SELECT
    x2.trade_date,
    x2.portfolio,
    sum(x2.val*x2.weight)/sum(x2.weight) as ret
FROM

    (SELECT
        x1.trade_date, 
        x1.portfolio,
        sum(x1.weight)/2.0 as weight,
        sum(x1.val_end)/sum(x1.val_start) as val, 
        sum(x1.val_start) as val_start,
        sum(x1.val_end) as val_end
    FROM

        (SELECT
            @DtToday as trade_date,
            prt.name as portfolio,
            wts.inst_id as iid,
            wts.weight,
            if(prc.trade_date=@DtToday,prc.price*wts.weight,0) as val_start,
            if(prc.trade_date=@DtYesterday,prc.price*wts.weight,0) as val_end
        FROM
            x_ports prt,
            x_weights wts,
            x_prices prc
        WHERE
            wts.port_id=prt.id and 
            prc.inst_id=wts.inst_id and
            (prc.trade_date=@DtToday or prc.trade_date=@DtYesterday)) x1

    GROUP BY x1.portfolio) x2

GROUP BY x2.portfolio;

I hope to be able to produce a result looking like this:

Date        Port A      Port B
--------------------------------------------
01/01/2010      
02/01/2010  1.005289596 1.004379853
03/01/2010  0.995851496 0.997026759
04/01/2010  0.999840954 0.999801193
05/01/2010  1.003535565 1.002187314
06/01/2010  1.005896896 1.00515873

The return for Port A on the 2/1/2018 should be calculated as 1.13/1.12*20/(20+80) + 50.45/50.23*80/(20+80).

The return for Port B on the 2/1/2018 should be calculated as 50.45/50.23*100/100, or possibly 1.13/1.12*0/(0+100) + 50.45/50.23*100/(0+100).

FYI, in the looping function above, I only calculate at the nominator (or the unscaled weight) so that Port A would be calculated as 1.13/1.12*20+50.45/50.23*80, which I see as the crucial step when calculating the return. The return is then found by dividing it by the sum of the weight.

Though it certainly can be done better, I can get the dates and I can calculate the return of a single day, but I just can't put the two together.

halfer
  • 19,824
  • 17
  • 99
  • 186
Joe Lager
  • 13
  • 3
  • I don't understand why I get voted down as not being clear or researched enough, but I guess it has to do with the fact that I chose to display my aim as a function instead of with words. – Joe Lager Oct 05 '18 at 12:11
  • What version of mySQL? [Lead window function](https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lead) would let you look at prices x records ahead or back if you're on version 8.0 or greater. If not 8.0 we can simulate the behavior using user variables to define a row number and then simply left join on A.row+1= B.row and other key values then we have the 2 different dates on the same row. We could use dates; but if gaps existed then +1 to date woudln't work. thus rownumber helps handle date gaps. – xQbert Oct 05 '18 at 13:33
  • I am definitely not on 8 on this server. I think I'm on 5 or 6. I've been toying with a lead/lag function, as my price dates are all workdays...mon-fri. – Joe Lager Oct 05 '18 at 13:40
  • use of double for price doens't seem wise any floating point data types when dealing with money doesn't end well.https://spin.atomicobject.com/2014/08/14/currency-rounding-errors/ https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency – xQbert Oct 05 '18 at 13:56
  • Thanks for your replies. I've read about the double/float problem before and try to avoid it when possible. The above is just a scribble to illustrate my issue. – Joe Lager Oct 05 '18 at 14:10
  • Is your looks like representative of expected results for your scribble? – xQbert Oct 05 '18 at 14:22
  • Heres what I'm getting: you may want to look at the "RAW data section alone first then look at the math which I'm doing wrong somehow based ont eh above results: http://rextester.com/ITJP56819 – xQbert Oct 05 '18 at 14:43
  • Helpful feedback: my guess is the downvotes is because of the extensive pleading. "Time is running out for me", "desperate", "stuck for days" may be understood as the emotional blackmail of volunteers. I recommend sticking to neutral technical writing here. (I would have downvoted for this, but you have -2 already). – halfer Oct 05 '18 at 20:43
  • With that in mind, please read [Under what circumstances may I add “urgent” or other similar phrases to my question, in order to obtain faster answers?](//meta.stackoverflow.com/q/326569). It's probably also advisable to [ask confident questions](https://meta.stackoverflow.com/questions/366264/how-can-we-encourage-new-authors-to-ask-confident-questions) too, since the inadvertent creation of psychological discrepancies can (paradoxically) discourage people from replying. – halfer Oct 05 '18 at 20:43

1 Answers1

0

Simulating analytics is no fun! Demo

The math on this doesn't seem right to me; as I'm no where close to your 'looks like results'

I'd like to be able to reuse CurDay but as the version is lower I couldn't use a common table expression.

What this does:

  • X1 generate the join of the tables
  • X2 gives us a count of the instruments in a portfolio used later in math
  • r generates a uservariable on which we can assign rows @rn and @Rn2
  • CurDay generate a rownumber ordered correctly so we can join
  • NextDay generates a copy of CurDay so we can join curday to next day on RN+1
  • Z allows us to do the math and group by current day and prepare for pivot on the portfolio name.
  • Outer most select allows us to pivot the data so we have date+2 columns

.

 SELECT Z.Trade_Date
     , sum(case when name = 'Port A' then P_RETURN end) as PortA
     , sum(case when name = 'Port B' then P_RETURN end) as PortB
FROM (
## Raw data
SELECT CurDay.*, NextDay.Price/CurDay.Price*CurDay.Weight/CurDay.Inst_Total_Weight as P_Return
FROM (SELECT x1.*, @RN:=@RN+1 rn,x2.inst_cnt, x2.Inst_Total_Weight
      FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
            FROM x_ports Prt
            INNER JOIN x_weights W
              on W.Port_ID = prt.ID
            INNER JOIN x_prices Prc
              on Prc.INST_ID = W.INST_ID
            ORDER BY W.port_id, W.inst_id,trade_Date) x1
     CROSS join (SELECT @RN:=0) r
     INNER join (SELECT count(*) inst_Cnt, port_ID, sum(Weight) as Inst_Total_Weight 
                 FROM x_weights
                 GROUP BY Port_ID) x2
        on X1.Port_ID = X2.Port_ID) CurDay
LEFT JOIN (SELECT x1.*, @RN2:=@RN2+1 rn2
           FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
                 FROM x_ports Prt
                 INNER JOIN x_weights W
                   on W.Port_ID = prt.ID
                 INNER JOIN x_prices Prc
                   on Prc.INST_ID = W.INST_ID
                 ORDER BY W.port_id, W.inst_id,trade_Date) x1
                 CROSS join (SELECT @RN2:=0) r
           ) NextDay
   on NextDay.Port_ID = CurDay.Port_ID
  and NextDay.Inst_ID = curday.Inst_ID
  and NextDay.RN2 = CurDay.RN+1
GROUP BY CurDay.Port_ID,  CurDay.Inst_ID,  CurDay.Trade_Date) Z
##END RAW DATA
GROUP BY Trade_Date;




+----+---------------------+-------------------+-------------------+
|    |     Trade_Date      |       PortA       |       PortB       |
+----+---------------------+-------------------+-------------------+
|  1 | 01.01.2018 00:00:00 | 1,00528959642786  | 1,00892857142857  |
|  2 | 02.01.2018 00:00:00 | 0,995851495829569 | 0,991150442477876 |
|  3 | 03.01.2018 00:00:00 | 0,999840954274354 | 1                 |
|  4 | 04.01.2018 00:00:00 | 1,0035355651507   | 1,00892857142857  |
|  5 | 05.01.2018 00:00:00 | 1,00589689563141  | 1,00884955752212  |
|  6 | 06.01.2018 00:00:00 | NULL              | NULL              |
+----+---------------------+-------------------+-------------------+
xQbert
  • 34,733
  • 2
  • 41
  • 62
  • Wow, thank you so much. You are so close and very much better at sql than I am. I can see that you divide by the CurDay.Inst_Cnt. I think that should be the sum(CurDay.Weight), but I'm not 100% sure yet...testing it right now. – Joe Lager Oct 05 '18 at 16:52
  • My "looks like result" was just to show what kind how I hoped to have my output presented...the numbers where just dummy numbers. I've updated my original post to include the correct numbers. – Joe Lager Oct 05 '18 at 17:02
  • Well, I'm off for now. I'll check back to see what I can do to get my numbers to match your expected results. I think I just don't get the math. so for Port A inst_1 I get (1.13/1.12*20/2) TodayPrice/YesterdayPrice*weight/count of instruments in Port_A What's wrong with the formula? – xQbert Oct 05 '18 at 17:44
  • The weight should be a percentage. If you break out Port A inst_1, it should be 1.13/1.12*20/(20+80). The entire Port A value should with the same breaking-out be 1.13/1.12*20/(20+80) + 50.45/50.23*80/(20+80). – Joe Lager Oct 05 '18 at 17:49
  • That helps alot updated to reflect this improved understanding ! Though Port B seems incorrect... – xQbert Oct 05 '18 at 18:17
  • That's it! You're a superstar! ...and correct, had a misconception, I thought Port B consisted of 100% inst_2 and not inst_1. I'm beyond impressed. – Joe Lager Oct 05 '18 at 18:25
  • awe shucks just here to help out :P on future questions include the math all the way though. It makes understanding the question much easier. Especially when we see the expected results and the math to get there. Understanding is the hardest part for us. There's some garbage in the query not needed such as the inst_Cnt and I might be able to tune it a little better given a better understanding now; but my rules are 1) DO Something, 2) laugh 3) do it right, 4) make it better. WE did 1-2-3 It's up to you if we need to look at 4. – xQbert Oct 05 '18 at 18:26
  • Lastly, if you want an explanation I'll write one up later and update the question explaining what was done in greater detail; but if it's not needed, I'll move on. I also noticed I have two badly named sets as curday and nextday should have been prevday, curday respectively to better match the problem. – xQbert Oct 05 '18 at 18:32
  • I am increadibly grateful. If you ever need any help with anything related to finance, .net, c++ or even vba, let me know. I will try to keep an eye out for you in the future. I've been sitting with this for 2 days. – Joe Lager Oct 05 '18 at 18:37
  • Been doing programming and sql for 20+ years. Nothing really new here – xQbert Oct 05 '18 at 20:00