2

I have this script which calculates profit and loss of trades. It works fine but I think it can be improved. It will be great to get rid of the for loops at least to make the code look compact. Can anyone please help me out ?

The logic to calculate profit/loss is first to match the sell trades with potential buy trades. A single sell trade can be matched with multiple buys. So the cost might be distributed to multiple buys.

Steps :

  • separate the trades into buy and sell in increasing dates.
  • calculate the average cost price
  • calculate profit/loss = (selling price - cost price)*matching vol

Thanks

Here is the sample data set

> structure(list(AsxCode = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "QAN", class = "factor"), Order.Type = structure(c(1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("Buy", "Sell"), class = "factor"), Trade.Date = structure(c(13L, 12L, 12L, 11L, 10L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), .Label = c("2014-03-28", "2014-05-22", "2014-11-07", "2014-11-18", "2014-12-04", "2015-03-02", "2015-03-24", "2015-03-27", "2015-05-11", "2015-05-15", "2015-08-21", "2016-04-15", "2016-04-18"), class = "factor"), Price = c(3.75, 4.05, 4.01, 3.55, 3.68, 3.38, 2.9, 2.98, 2.9, 2.05, 1.8, 1.65, 1.25, 1.07), Quantity = c(850L, 1350L, 150L, 1000L, 1500L, 1400L, 1091L, 2000L, 1750L, 600L, 366L, 375L, 500L, 500L), Consideration = c(3198.5, 5456.5, 590.5, 3561, 5531, 4743, 3152.9, 5949, 5086, 1241, 669.8, 629.75, 614, 546), match_status = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), match_vol = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), avg_price = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), profit_loss = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("AsxCode", "Order.Type", "Trade.Date", "Price", "Quantity", "Consideration", "match_status", "match_vol", "avg_price", "profit_loss"), row.names = c(NA, -14L), class = "data.frame")

   AsxCode Order.Type Trade.Date Price Quantity Consideration match_status match_vol avg_price profit_loss
1      QAN        Buy 2016-04-18  3.75      850       3198.50           NA         0         0           0
2      QAN       Sell 2016-04-15  4.05     1350       5456.50           NA         0         0           0
3      QAN       Sell 2016-04-15  4.01      150        590.50           NA         0         0           0
4      QAN        Buy 2015-08-21  3.55     1000       3561.00           NA         0         0           0
5      QAN        Buy 2015-05-15  3.68     1500       5531.00           NA         0         0           0
6      QAN        Buy 2015-05-11  3.38     1400       4743.00           NA         0         0           0
7      QAN       Sell 2015-03-27  2.90     1091       3152.90           NA         0         0           0
8      QAN       Sell 2015-03-24  2.98     2000       5949.00           NA         0         0           0
9      QAN        Buy 2015-03-02  2.90     1750       5086.00           NA         0         0           0
10     QAN        Buy 2014-12-04  2.05      600       1241.00           NA         0         0           0
11     QAN        Buy 2014-11-18  1.80      366        669.80           NA         0         0           0
12     QAN        Buy 2014-11-07  1.65      375        629.75           NA         0         0           0
13     QAN       Sell 2014-05-22  1.25      500        614.00           NA         0         0           0
14     QAN        Buy 2014-03-28  1.07      500        546.00           NA         0         0           0


calculate.profit <- function(trades){       
    trades$match_vol <- 0
    s <- trades[trades$Order.Type== 'Sell', ]
    sell.trades <- s[order(s$Trade.Date, decreasing=FALSE),]    

    b <- trades[trades$Order.Type== 'Buy', ]
    buy.trades <- b[order(b$Trade.Date, decreasing=FALSE),]     

    # Don't want to execute the for loop when there is no sell trades. In other words when there is no profit/loss unless you sell
    if(nrow(sell.trades)==0){
        return (buy.trades)
    }

    # for each sell find the associated buys
    for(i in 1:nrow(sell.trades))
    {           
        # calculate average price. The Consideration column contains total cost  
        s.price <- sell.trades[i, 'Consideration']/sell.trades[i,'Quantity']        

        for(j in 1:nrow(buy.trades))
        {   
            # this part matches sell with a buy trade
            # if sell volume and buy volume are same, the sell is fully matched otherwise it has to find the remaining sell units.      
            s.vol <- sell.trades[i,'Quantity'] - sell.trades[i,'match_vol']         
            b.vol <- buy.trades[j, 'Quantity'] - buy.trades[j, 'match_vol']

            if (b.vol != 0)         
            {               
                b.price <- buy.trades[j, 'Consideration']/buy.trades[j, 'Quantity']
                # contains the volume which is matched between buy and sell
                # trades
                match.vol <- min(s.vol, b.vol)              
                profit <- match.vol * (s.price - b.price)               

                buy.trades[j, 'match_vol'] <- match.vol + buy.trades[j, 'match_vol']

                sell.trades[i, 'profit_loss'] <- profit + sell.trades[i, 'profit_loss'] 
                sell.trades[i, 'match_vol'] <- match.vol + sell.trades[i, 'match_vol']              
            }

            # sell parcel fully processed           
            if (sell.trades[i ,'match_vol'] == sell.trades[i ,'Quantity'])
            {               
                j=1
                break;                   
            }                       
        }           
    }   
    return (rbind(buy.trades, sell.trades))
}
Android Beginner
  • 486
  • 1
  • 4
  • 18

1 Answers1

0

There is a number of improvements that could be made.

Preallocating object sizes

The most obvious thing would be to preallocate objet sizes. The received wisdom is that it is inefficient to expand objects in loops. Hence you would do:

# On example of a single column 
sell.trades.vec <- vector(mode = "numeric", length = nrow(buy.trades))

in order to avoid objects being expended within loops.

seq_along()

Broadly speaking, it is neat to use seq_along instead of 1:something, have a look:

>> a <- NULL
>> 1:length(a)
[1] 1 0
>> seq_along(a)
integer(0)
>> 

as well:

>> 1:0
[1] 1 0
>> seq_along(0)
[1] 1
>> 

I'm guessing that you will (most probably) always have some sensible nrow value but seq_along maybe worth reflecting on, in case there is a risk to get some odd data.

Community
  • 1
  • 1
Konrad
  • 17,740
  • 16
  • 106
  • 167