I am having trouble turning 2 uncorrelated variables into correlated variables without using a transformation method (like the Cholesky method).
I have 2 original variables, say Original1 and Original2. The number of data points is 50 and the correlation between the two variables is -0.95.
Then I fit the variables to empirical distributions and generate 10,000 random numbers for each variable, say Random1 and Random2. The correlation between the two is close to 0.
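To make the setup concrete, here is a minimal sketch of that sampling step. It assumes "empirical distribution" means drawing through the interpolated inverse empirical CDF, and it uses synthetic stand-ins for Original1 and Original2, since the real data is not shown. The helper name `sample_empirical` is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for Original1 and Original2 (50 points, strongly
# negatively correlated). These are NOT the real data.
original1 = rng.normal(size=50)
original2 = -original1 + 0.3 * rng.normal(size=50)

def sample_empirical(values, size, rng):
    """Draw from the empirical distribution of `values` by evaluating the
    linearly interpolated inverse ECDF at uniform random numbers."""
    u = rng.uniform(size=size)
    return np.quantile(values, u)

# The two variables are sampled independently of each other, so the
# dependence present in the original pairs is lost.
random1 = sample_empirical(original1, 10_000, rng)
random2 = sample_empirical(original2, 10_000, rng)

corr_original = np.corrcoef(original1, original2)[0, 1]  # strongly negative
corr_random = np.corrcoef(random1, random2)[0, 1]        # close to 0
```

Because each variable is drawn on its own, only the marginal distributions are preserved, which is why the correlation has to be put back afterwards.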
Then I use this algorithm by F. Jatpil in https://stats.stackexchange.com/questions/38856/how-to-generate-correlated-random-numbers-given-means-variances-and-degree-of
The algorithm in Python looks like this:
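import random
import numpy as np

# Random1 and Random2 are pandas Series (hence .iloc) of 10,000 values each.
while True:
    # Pick a random row and centre its two values.
    n1 = random.randrange(0, 10000 - 1)
    a = Random1.iloc[n1] - Random1_Average
    b = Random2.iloc[n1] - Random2_Average
    while True:
        # Pick a different row; retry until swapping would lower the correlation.
        n2 = random.choice(list(range(0, n1)) + list(range(n1 + 1, 9999)))
        c = Random1.iloc[n2] - Random1_Average
        d = Random2.iloc[n2] - Random2_Average
        if (a - c) * (b - d) > 0:
            break
        else:
            continue
    # Swap the two Random1 values, then recompute the correlation from scratch.
    Random1.iloc[n1], Random1.iloc[n2] = Random1.iloc[n2], Random1.iloc[n1]
    c_new = np.corrcoef(Random1, Random2)[0][1]
    if r_lowerlimit <= c_new and c_new <= r_upperlimit:
        break
    else:
        continue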
Here, the reason I check for (a - c) * (b - d) > 0 is that it is the condition needed for the correlation to decrease after swapping the Random1 values at n1 and n2. After swapping, I check whether the new correlation is between a lower limit and an upper limit (-0.90 and 1.00, in this case). If the new correlation falls in that range, Random1 is now correlated with Random2 at a level similar to the original data.
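The swap condition can be checked numerically. Swapping the Random1 values at two rows changes the sum of centred cross-products from a*b + c*d to c*b + a*d, a difference of -(a - c) * (b - d), while the means and standard deviations stay the same. The sketch below uses independent synthetic data and a hypothetical helper `swap_effect` to confirm that this predicts the new correlation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = rng.normal(size=1000)

def swap_effect(x, y, i, j):
    """Change in sum((x - mean_x) * (y - mean_y)) if x[i] and x[j] are swapped."""
    a, b = x[i] - x.mean(), y[i] - y.mean()
    c, d = x[j] - x.mean(), y[j] - y.mean()
    # before: a*b + c*d, after: c*b + a*d, difference: -(a - c) * (b - d)
    return -(a - c) * (b - d)

# Denominator of the Pearson correlation; a permutation of x leaves it unchanged.
denom = np.sqrt(((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum())

i, j = 3, 7
before = np.corrcoef(x, y)[0, 1]
predicted = before + swap_effect(x, y, i, j) / denom

# Perform the swap on a copy and recompute the correlation the slow way.
x_swapped = x.copy()
x_swapped[[i, j]] = x_swapped[[j, i]]
after = np.corrcoef(x_swapped, y)[0, 1]
```

So the correlation drops exactly when (a - c) * (b - d) > 0, and the size of the drop is known without calling `np.corrcoef` again.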
There are 2 problems with this code right now:
- As you can see, this brute-force method takes a long time.
- Most of the time it finishes in under 1 minute. Sometimes, however, as the correlation gets close to the limit, it seems never to leave the inner while loop because it has a hard time finding a suitable n2 to swap with.
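One possible direction for both problems, offered as a rough sketch rather than a verified fix: keep a running sum of cross-products so the correlation is updated per swap instead of recomputed, and cap the number of attempts so the search cannot hang. The function name `correlate_by_swaps`, the `max_attempts` parameter, and the target window of [-1.0, -0.90] are assumptions for illustration. The sketch works on NumPy arrays (for example from `Series.to_numpy()`) and assumes the starting correlation is above the window.

```python
import numpy as np

def correlate_by_swaps(x, y, lower, upper, max_attempts=1_000_000, seed=None):
    """Permute x until corr(x, y) lies in [lower, upper], or give up.

    Returns (permuted_x, correlation, reached_target).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).copy()
    y = np.asarray(y, dtype=float)
    n = len(x)

    dx = x - x.mean()                       # centred values of x
    dy = y - y.mean()                       # centred values of y
    denom = np.sqrt((dx @ dx) * (dy @ dy))  # unchanged by permuting x
    cross = dx @ dy                         # running sum of cross-products

    for _ in range(max_attempts):
        r = cross / denom
        if lower <= r <= upper:
            return x, r, True
        i, j = rng.integers(0, n, size=2)
        delta = -(dx[i] - dx[j]) * (dy[i] - dy[j])
        # Accept only swaps that lower the correlation without going below `lower`.
        if delta < 0 and (cross + delta) / denom >= lower:
            x[i], x[j] = x[j], x[i]
            dx[i], dx[j] = dx[j], dx[i]
            cross += delta

    # Attempt budget exhausted: report the best effort instead of looping forever.
    r = cross / denom
    return x, r, bool(lower <= r <= upper)

# Usage on synthetic, independent data (stand-ins for Random1 and Random2).
rng = np.random.default_rng(42)
random1 = rng.normal(size=2_000)
random2 = rng.normal(size=2_000)
new1, r, ok = correlate_by_swaps(random1, random2, lower=-1.0, upper=-0.90, seed=0)
```

A rejected pair costs only a few arithmetic operations here, and when no suitable pair is found within the budget the function returns with `reached_target` set to False, so the caller can decide what to do next.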
What would be a good method to fix this problem? Thanks.