
Possible Duplicate:
In R, how do I find the optimal variable to maximize or minimize correlation between several datasets

This can be done in Excel, but my dataset has gotten too large. In Excel, I would use Solver.

I have 5 variables and I want to create a weighted average of these 5 variables so that it has the lowest correlation to a 6th variable.

Columns A, B, C, D, E = random numbers

Column F = random number (which I want to minimise the correlation to)

Column G = A*wi1 + B*wi2 + C*wi3 + D*wi4 + E*wi5

where wi1 to wi5 are the coefficients returned by Solver. In a separate cell, I would have correl(F,G).

This is all achieved with the following constraints in mind:
1. The weights on A, B, C, D, E have to be between 0 and 1
2. The weights on A, B, C, D, E have to sum to 1

I'd like to print the results of this so that I can have an efficient frontier type chart. How can I do this in R? Thanks for the help.

pabs_17
  • If these two questions are in fact by the same person, you should know that using multiple accounts like this is frequently frowned upon. – joran Mar 18 '12 at 00:37
  • How is this an exact duplicate? It's a similar theme, but I want to minimise the correlation and create an efficient-frontier-style chart. I have searched the net and cannot find the answer. – pabs_17 Mar 18 '12 at 16:54

1 Answer


I looked at the other thread mentioned by Vincent and I think I have a better solution. I hope it is correct. As Vincent points out, your biggest problem is that the optimization tools for such non-linear problems do not offer a lot of flexibility for dealing with your constraints. Here, you have two types of constraints: 1) all your weights must be >= 0, and 2) they must sum to 1.

The optim function has a lower argument (used by method = "L-BFGS-B") that can take care of your first constraint. For the second constraint, you have to be a bit creative: you can force your weights to sum to one by scaling them inside the function to be minimized, i.e. rewrite your correlation function as function(w) cor(X %*% w / sum(w), Y).

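# create random data
set.seed(1)  # fix the seed so the example is reproducible
n.obs <- 100
n.var <- 6
X <- matrix(runif(n.obs * n.var), nrow = n.obs, ncol = n.var)
Y <- matrix(runif(n.obs), nrow = n.obs, ncol = 1)

# function to minimize: correlation between the weighted average and Y
# (weights are rescaled to sum to 1; as.numeric() returns a plain scalar)
correl <- function(w) as.numeric(cor(X %*% w / sum(w), Y))
# initial guess: equal weights
w0 <- rep(1 / n.var, n.var)
# optimize, keeping every weight >= 0
opt <- optim(par = w0, fn = correl, method = "L-BFGS-B", lower = 0)
# rescale the solution so the weights sum to 1
optim.w <- opt$par / sum(opt$par)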
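
As a quick sanity check, here is a rough sketch on the same kind of random data (not something I have run on your real dataset). It confirms that the rescaled weights satisfy both constraints and that the optimized correlation is no higher than the equal-weight starting point. The seed and the names check.w, start.cor and final.cor are my own choices for illustration.

```r
# regenerate example data so this block runs on its own
set.seed(1)  # arbitrary seed, only for reproducibility
n.obs <- 100
n.var <- 6
X <- matrix(runif(n.obs * n.var), nrow = n.obs, ncol = n.var)
Y <- matrix(runif(n.obs), nrow = n.obs, ncol = 1)

# same objective as above: correlation of the weighted average with Y
correl <- function(w) as.numeric(cor(X %*% w / sum(w), Y))
w0 <- rep(1 / n.var, n.var)
opt <- optim(par = w0, fn = correl, method = "L-BFGS-B", lower = 0)

# rescaled weights (hypothetical name, equivalent to optim.w)
check.w <- opt$par / sum(opt$par)

start.cor <- correl(w0)       # correlation with equal weights
final.cor <- correl(check.w)  # correlation with optimized weights

# constraint checks: non-negative weights that sum to 1
stopifnot(all(check.w >= 0))
stopifnot(abs(sum(check.w) - 1) < 1e-8)

print(round(check.w, 4))
print(c(start = start.cor, final = final.cor))
```

Note that this drives the correlation as far towards -1 as the constraints allow. If by "lowest" you mean closest to zero, one option (an assumption on my part about what you want) is to minimize the squared correlation instead, i.e. replace the objective with function(w) correl(w)^2.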
flodel