How to pick only efficient frontier points in a plot of portfolio performance?

Question

The name of this question does not do it justice. This is best explained by a numerical example. Let's say I have the following portfolio data, called data.

> data
    Stdev AvgReturn
1   1.92      0.35
2   1.53      0.34
3   1.39      0.31
4   1.74      0.31
5   1.16      0.30
6   1.27      0.29
7   1.78      0.28
8   1.59      0.27
9   1.05      0.27
10  1.17      0.26
11  1.62      0.25
12  1.33      0.25
13  0.96      0.24
14  1.47      0.24
15  1.09      0.24
16  1.20      0.24
17  1.49      0.23
18  1.01      0.23
19  0.88      0.22
20  1.21      0.22
21  1.37      0.22
22  1.09      0.22
23  0.95      0.21
24  0.81      0.21

I have already sorted the data data.frame by AvgReturn, which I believe makes this easier. My goal is essentially to eliminate all the points that do not make sense to choose, i.e., I would not want a portfolio that has a lower AvgReturn but a higher Stdev than another available portfolio (assuming Stdev is an appropriate measure of risk, which I am assuming for now).

Essentially, does anyone know of an efficient (in the code sense) way to choose the "rational" portfolio choices? I have manually added a third column to this data frame to show which portfolio choices should be kept. I would want to remove portfolio 4 because I would never choose it: portfolio 3 gives the same return with a lower Stdev. Similarly, I would never choose 8 because 5 gives a higher return with a lower Stdev.

> res
    Stdev AvgReturn  Keep
1   1.92      0.35  TRUE
2   1.53      0.34  TRUE
3   1.39      0.31  TRUE
4   1.74      0.31 FALSE
5   1.16      0.30  TRUE
6   1.27      0.29 FALSE
7   1.78      0.28 FALSE
8   1.59      0.27 FALSE
9   1.05      0.27  TRUE
10  1.17      0.26 FALSE
11  1.62      0.25 FALSE
12  1.33      0.25 FALSE
13  0.96      0.24  TRUE
14  1.47      0.24 FALSE
15  1.09      0.24 FALSE
16  1.20      0.24 FALSE
17  1.49      0.23 FALSE
18  1.01      0.23 FALSE
19  0.88      0.22  TRUE
20  1.21      0.22 FALSE
21  1.37      0.22 FALSE
22  1.09      0.22 FALSE
23  0.95      0.21 FALSE
24  0.81      0.21  TRUE

The only way I can think of to solve this is to loop through the rows and check each condition. This, however, will be relatively inefficient in R, my preferred language for this solution. I am having difficulty thinking of a vectorized solution. Any help is appreciated!

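EDIT: Here is what I believe is a solution:

domstrat <- function(data){
  # assumes data is already sorted by AvgReturn, highest first
  # cummin() tracks the lowest Stdev seen so far; diff() is nonzero
  # only where a row lowers that running minimum (first row is always kept)
  keep <- c(-1,sign(diff(cummin(data[[1]]))))
  data <- data[which(keep!=0),]

  return(data)
}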

   Stdev AvgReturn
1   1.92      0.35
2   1.53      0.34
3   1.39      0.31
5   1.16      0.30
9   1.05      0.27
13  0.96      0.24
19  0.88      0.22
24  0.81      0.21
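
One caveat with domstrat() as written: it relies on the rows arriving sorted by AvgReturn from highest to lowest, and when two rows share the same AvgReturn it also relies on the lower-Stdev row coming first. A minimal sketch that sorts explicitly before applying the same cummin() idea might look like this. The name domstrat_sorted and the explicit order() call are assumptions added for illustration, not part of the original edit.

```r
# The portfolio data from the question, rebuilt so the example is self-contained.
data <- data.frame(
  Stdev = c(1.92, 1.53, 1.39, 1.74, 1.16, 1.27, 1.78, 1.59, 1.05, 1.17, 1.62, 1.33,
            0.96, 1.47, 1.09, 1.20, 1.49, 1.01, 0.88, 1.21, 1.37, 1.09, 0.95, 0.81),
  AvgReturn = c(0.35, 0.34, 0.31, 0.31, 0.30, 0.29, 0.28, 0.27, 0.27, 0.26, 0.25, 0.25,
                0.24, 0.24, 0.24, 0.24, 0.23, 0.23, 0.22, 0.22, 0.22, 0.22, 0.21, 0.21)
)

# Hypothetical variant of domstrat(): the explicit sort is an assumption,
# added so the result does not depend on the incoming row order.
domstrat_sorted <- function(data) {
  # Highest return first; among equal returns, lowest Stdev first.
  data <- data[order(-data$AvgReturn, data$Stdev), ]
  running_min <- cummin(data$Stdev)
  # A row survives only if it strictly lowers the running minimum Stdev.
  keep <- c(TRUE, diff(running_min) < 0)
  data[keep, ]
}

rownames(domstrat_sorted(data))
```

With the data above this should select the same rows as the Keep column (1, 2, 3, 5, 9, 13, 19, 24), and it gives the same answer if the tied rows 3 and 4 are swapped in the input.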

I am not familiar with that concept in mathematics so I am not sure. But I know in portfolio theory, it is usually done with various portfolio weights. And you only pick the points above the "minimum variance" point. This link might be more helpful: http://www.uam.es/personal_pdi/economicas/bdeblas/teaching/ucd/ecn134/lectures/mv_review.pdf — road_to_quantdom, Mar 04 '15 at 07:12
If you just want the efficient frontier of points with "low Stdev and high AvgReturn" consider this question and the answers there: http://stackoverflow.com/questions/9106401/implementation-of-skyline-query-or-efficient-frontier/25135886#25135886 — Patrick Roocks, Mar 04 '15 at 07:20
Were you looking for an R coding solution, logic for choosing a portfolio, or both? My answer addresses the coding part. — Tim Biegeleisen, Mar 04 '15 at 08:14
I think both. I have a general idea of how i want to do it. it follows your structure a little bit, I will post it as an edit when it's almost done — road_to_quantdom, Mar 04 '15 at 09:06

IRTFM · Accepted Answer · 2015-03-04T16:14:55.070

1

After sorting by Stdev, this uses the function cummax to identify the qualifying points: a row is kept when its AvgReturn equals the running maximum of AvgReturn up to that row.

> data <- data[order(data$Stdev),]
> data[ which(data$AvgReturn == cummax(data$AvgReturn)) , ]
   Stdev AvgReturn
24  0.81      0.21
19  0.88      0.22
13  0.96      0.24
9   1.05      0.27
5   1.16      0.30
3   1.39      0.31
2   1.53      0.34
1   1.92      0.35


> plot(data)
> points( data[ which(data$AvgReturn == cummax(data$AvgReturn)) , ] , col="green")

It's not actually the convex hull but what might be called the "monotonically increasing hull".
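
One thing to watch with the == cummax() test: a later row that merely ties the running maximum still passes, even though an earlier row offers the same return at a lower Stdev. That does not happen in the data above, but a stricter variant can be sketched as follows. The function name frontier and the toy data frame are hypothetical, used only to illustrate the tie case.

```r
# Hypothetical stricter variant: keep a row only if it beats every
# return seen at a lower (or equal, earlier-sorted) Stdev.
frontier <- function(data) {
  # Lowest Stdev first; among equal Stdev, highest return first.
  data <- data[order(data$Stdev, -data$AvgReturn), ]
  # Best return among all strictly earlier rows (-Inf before the first row).
  best_before <- c(-Inf, head(cummax(data$AvgReturn), -1))
  data[data$AvgReturn > best_before, ]
}

# Toy data (an assumption for illustration): row 2 ties row 1 on return
# but has a higher Stdev, so it is dominated.
toy <- data.frame(Stdev = c(1, 2, 3), AvgReturn = c(0.1, 0.1, 0.2))

toy[which(toy$AvgReturn == cummax(toy$AvgReturn)), ]  # equality test keeps row 2
frontier(toy)                                         # strict test drops row 2
```

Whether tied points belong on the frontier is a judgment call; the strict version matches the question's rule that portfolio 4 is dropped in favour of portfolio 3.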

edited Mar 04 '15 at 16:14

answered Mar 04 '15 at 16:03

This is perfect. Very similar to my solution except you chose to sort by variance first. Mine sorts by return first. – road_to_quantdom Mar 05 '15 at 01:52

score 0 · Answer 2 · answered Mar 04 '15 at 07:00

You can define a custom R function which contains some logic to decide whether or not to keep a certain portfolio depending on the standard deviation and the average return:

portfolioKeep <- function(x){
  # x[1] contains the Stdev for the input row
  # x[2] contains the AvgReturn for the input row
  # make your decision based on these inputs here...
  # and remember to return a logical TRUE or FALSE
  TRUE  # placeholder so the function runs; keeps every row until real logic is added
}

Next we can use an apply function on your input data frame to come up with the Keep column you want:

# your 'input' data frame
input.mat <- data.matrix(input)
# apply custom function to rows
keep <- apply(input.mat, 1, portfolioKeep)
# bind keep vector to input data frame
input <- cbind(input, keep)

The above code first converts the input data frame into a numeric matrix so that we can use the apply function on it. The apply function runs portfolioKeep on each row, returning a logical TRUE or FALSE for each one. Finally, we bind the resulting keep vector onto the original data frame as a new column for convenience.

And now you can do your reporting easily with the data frame input with which you started.
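
To make the template above concrete, here is a minimal sketch of one way the decision logic could be filled in. It assumes the rule from the question (drop a portfolio when some other portfolio is no worse on both measures and strictly better on at least one). The extra pool argument is an assumption added for illustration, because a row cannot be judged dominated without seeing the other rows.

```r
# The portfolio data from the question, rebuilt so the example is self-contained.
input <- data.frame(
  Stdev = c(1.92, 1.53, 1.39, 1.74, 1.16, 1.27, 1.78, 1.59, 1.05, 1.17, 1.62, 1.33,
            0.96, 1.47, 1.09, 1.20, 1.49, 1.01, 0.88, 1.21, 1.37, 1.09, 0.95, 0.81),
  AvgReturn = c(0.35, 0.34, 0.31, 0.31, 0.30, 0.29, 0.28, 0.27, 0.27, 0.26, 0.25, 0.25,
                0.24, 0.24, 0.24, 0.24, 0.23, 0.23, 0.22, 0.22, 0.22, 0.22, 0.21, 0.21)
)

# Hypothetical completion of portfolioKeep(); 'pool' (an assumed extra
# argument) is the full numeric matrix that each row is compared against.
portfolioKeep <- function(x, pool) {
  # Rows that are at least as good as x on both measures...
  no_worse <- pool[, "Stdev"] <= x["Stdev"] & pool[, "AvgReturn"] >= x["AvgReturn"]
  # ...and strictly better on at least one of them.
  better <- pool[, "Stdev"] < x["Stdev"] | pool[, "AvgReturn"] > x["AvgReturn"]
  # Keep x only if no row dominates it.
  !any(no_worse & better)
}

input.mat <- data.matrix(input)
# Extra arguments after the function are passed through apply() to portfolioKeep().
input$Keep <- apply(input.mat, 1, portfolioKeep, pool = input.mat)

which(input$Keep)
```

This compares every row against every other row, so the work grows with the square of the number of portfolios. That is fine for a table of this size, but a sort followed by a cumulative minimum or maximum scales better.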
