4

I have a table:

q)data:([]dt:2017.01.05D19:45:00.238248239 2017.01.05D20:46:00.282382392 2017.01.05D21:47:00.232842342 2017.01.05D22:48:00.835838442 2017.01.05D20:49:00.282382392;sym:`AAPL`GOOG`AAPL`BBRY`GOOG;price:101.20 800.20 102.30 2.20 800.50;shares:500 100 500 900 100)
q)data
dt                            sym    price   shares
2017.01.05D19:45:00.238248239 AAPL   101.20  500
2017.01.05D20:46:00.282382392 GOOG   800.20  100
2017.01.05D21:47:00.232842342 AAPL   102.30  500
2017.01.05D22:48:00.835838442 BBRY     2.20  900
2017.01.05D20:49:00.282382392 GOOG   800.50  100

I need to create a column containing the sum of price*shares for the latest observation of each individual ticker.

To demonstrate using the above data, we're looking for:

data:
dt                            sym    price   shares   index
2017.01.05D19:45:00.238248239 AAPL   101.20  500      50,600
2017.01.05D20:46:00.282382392 GOOG   800.20  100      130,620
2017.01.05D21:47:00.232842342 AAPL   102.30  500      131,170
2017.01.05D22:48:00.835838442 BBRY     2.20  900      133,150
2017.01.05D20:49:00.282382392 GOOG   800.50  100      133,180

To further clarify, at row 1, only 1 symbol is included, at row 2, 2 symbols, then 2 again, then 3, then 3 again at row 5.

Answered in a different thread: Apply formula to current and previous rows only (Q/KDB)

Jonathon McMurray
  • 2,881
  • 1
  • 10
  • 22
Fomalhaut -C
  • 320
  • 2
  • 13

3 Answers3

4

Slight variation on Jonathon's solution, using a vector conditional:

q)delete dict from update index:?[all flip distinct[sym]in/: key'[dict]; {sum[x]*sum[y]%sum z} ./: flip each value each dict;0N] from update dict:@[;;:;]\[()!();sym;flip (price;shares;divisor)] from data
dt                            sym  price shares divisor index
----------------------------------------------------------------
2018.02.05D22:47:22.175914000 AAPL 101.2 500    2
2018.02.05D22:21:10.175914000 GOOG 800.2 500    1
2018.02.05D22:58:00.175914000 AAPL 102.3 500    2
2018.02.05D22:19:27.175914000 BBRY 2.2   500    1       339262.5
1
q)delete ind from update index:sum@'ind from (update ind:@\[()!();sym;:;shares*price%divisor] from data) where i>=max(first;i)fby sym
dt                            sym  price shares divisor index
--------------------------------------------------------------
2017.01.05D19:45:00.238248239 AAPL 101.2 500    2
2017.01.05D20:45:00.282382392 GOOG 800.2 500    1
2017.01.05D21:45:00.232842342 AAPL 102.3 500    2
2017.01.05D22:45:00.835838442 BBRY 2.2   500    1       426775

Slightly different answer to what you got, this is doing sum(shares*price%divisor) rather than summing each individually.

A slightly more messy and complicated version that gets the same answer as you seem to be expecting:

q)delete ind from update index:sum'[ind[;;0]]*sum'[ind[;;1]]%sum'[ind[;;2]] from (update ind:@\[()!();sym;:;shares,'price,'divisor] from data) where i>=max(first;i)fby sym
dt                            sym  price shares divisor index
----------------------------------------------------------------
2017.01.05D19:45:00.238248239 AAPL 101.2 500    2
2017.01.05D20:45:00.282382392 GOOG 800.2 500    1
2017.01.05D21:45:00.232842342 AAPL 102.3 500    2
2017.01.05D22:45:00.835838442 BBRY 2.2   500    1       339262.5
Jonathon McMurray
  • 2,881
  • 1
  • 10
  • 22
  • For neatness you can remove the lambda if you use `@` apply: `update ind:@\[()!();sym;:;shares*price%divisor] from data`. – Thomas Smyth - Treliant Feb 05 '18 at 20:14
  • 1
    Ah yes, thanks Thomas, meant to change that before posting – Jonathon McMurray Feb 05 '18 at 20:26
  • Something is amiss with this solution. I'm getting numbers 30-40% larger than anticipated and I cannot reproduce the results in Excel. To clarify, we need to get the most recent price for each stock observed thus far at any given tick, and multiply that by only the latest price%divisor. In other words, if only 3 symbols have been observed at t=n, but there are 5 in the list, only 3 should be included in the calculation, and only the latest prices, divisors, and share counts for each. I think this solution arrives at a different result due to the sums being used, but am not certain. – Fomalhaut -C Feb 06 '18 at 17:52
  • Does either method give you the expected result? – Jonathon McMurray Feb 06 '18 at 18:04
  • Unfortunately no. Every answer here reports the 30-40% larger than actual. However, when I export the data to excel and do (last price * (shares/divisor)) for each symbol and aggregate, I get the actual result, so at least the data is correct. – Fomalhaut -C Feb 06 '18 at 18:14
  • Can you provide a sample data set which exhibits this behaviour? As it stands I've replicated the original example exactly – Jonathon McMurray Feb 06 '18 at 18:17
  • Much as I'd like to, that would likely get me fired. How about this last one: a function that returns just the sum of the latest price for each ticker? So if there are 20 tickers, each row is the sum of the latest 20 prices. I tried this: `update rolling:sum ask from data where i>=max(first;i)fby sym` but it yields only a static sum on each row. – Fomalhaut -C Feb 07 '18 at 14:37
  • I'm not suggesting posting real market data, just a sufficient sample of mock data to actually demonstrate the problem - with the sample you already posted, my code works & gives the expected output – Jonathon McMurray Feb 07 '18 at 16:55
1

Given the question has changed considerably since the initial posting and my previous answer, here is an updated solution:

q)delete ind from update index:sum@'ind from (update ind:@\[()!();sym;:;shares*price] from data) where i>=max(first;i)fby sym
dt                            sym  price shares index
------------------------------------------------------
2017.01.05D19:45:00.238248239 AAPL 101.2 500
2017.01.05D20:46:00.282382392 GOOG 800.2 100
2017.01.05D21:47:00.232842342 AAPL 102.3 500
2017.01.05D22:48:00.835838442 BBRY 2.2   900    133150
2017.01.05D20:49:00.282382392 GOOG 800.5 100    133180

Or without the other initial condition that it should only be populated once all tickers have ticked:

q)delete ind from update index:sum@'ind from update ind:@\[()!();sym;:;shares*price] from data
dt                            sym  price shares index
------------------------------------------------------
2017.01.05D19:45:00.238248239 AAPL 101.2 500    50600
2017.01.05D20:46:00.282382392 GOOG 800.2 100    130620
2017.01.05D21:47:00.232842342 AAPL 102.3 500    131170
2017.01.05D22:48:00.835838442 BBRY 2.2   900    133150
2017.01.05D20:49:00.282382392 GOOG 800.5 100    133180

(Note these are only minor modifications to the solution I posted yesterday, updated for the changed requirements in the question)

Jonathon McMurray
  • 2,881
  • 1
  • 10
  • 22