Having read carefully the previous question Random numbers that add to 100: Matlab
I am struggling to solve a similar but slightly more complex problem.
I would like to create an array of n elements that sums to 1, however I want an added constraint that the minimum increment (or if you like number of significant figures) for each element is fixed.
For example if I want 10 numbers that sum to 1 without any constraint the following works perfectly:
num_stocks=10;
num_simulations=100000;
temp = [zeros(num_simulations,1),sort(rand(num_simulations,num_stocks-1),2),ones(num_simulations,1)];
weights = diff(temp,[],2);
I foolishly thought that by scaling this I could add the constraint as follows
num_stocks=10;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp2 = [zeros(num_simulations,1),sort(round(rand(num_simulations,num_stocks-1)*scaling)/scaling,2),ones(num_simulations,1)];
weights2 = diff(temp2,[],2);
However though this works for small values of n & small values of increment, if for example n=1,000 & the increment is 0.1% then over a large number of trials the first and last numbers have a mean which is consistently below 0.1%.
I am sure there is a logical explanation/solution to this but I have been tearing my hair out to try & find it & wondered anybody would be so kind as to point me in the right direction. To put the problem into context create random stock portfolios (hence the sum to 1).
Thanks in advance
Thank you for the responses so far, just to clarify (as I think my initial question was perhaps badly phrased), it is the weights that have a fixed increment of 0.1% so 0%, 0.1%, 0.2% etc.
I did try using integers initially
num_stocks=1000;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp = [zeros(num_simulations,1),sort(randi([0 scaling],num_simulations,num_stocks-1),2),ones(num_simulations,1)*scaling];
weights = (diff(temp,[],2)/scaling);
test=mean(weights);
but this was worse, the mean for the 1st & last weights is well below 0.1%.....
Edit to reflect excellent answer by Floris & clarify
The original code I was using to solve this problem (before finding this forum) was
function x = monkey_weights_original(simulations,stocks)
stockmatrix=1:stocks;
base_weight=1/stocks;
r=randi(stocks,stocks,simulations);
x=histc(r,stockmatrix)*base_weight;
end
This runs very fast, which was important considering I want to run a total of 10,000,000 simulations, 10,000 simulations on 1,000 stocks takes just over 2 seconds with a single core & I am running the whole code on an 8 core machine using the parallel toolbox.
It also gives exactly the distribution I was looking for in terms of means, and I think that it is just as likely to get a portfolio that is 100% in 1 stock as it is to geta portfolio that is 0.1% in every stock (though I'm happy to be corrected).
My issue issue is that although it works for 1,000 stocks & an increment of 0.1% and I guess it works for 100 stocks & an increment of 1%, as the number of stocks decreases then each pick becomes a very large percentage (in the extreme with 2 stocks you will always get a 50/50 portfolio).
In effect I think this solution is like the binomial solution Floris suggests (but more limited)
However my question has arrisen because I would like to make my approach more flexible & have the possibility of say 3 stocks & an increment of 1% which my current code will not handle correctly, hence how I stumbled accross the original question on stackoverflow
Floris's recursive approach will get to the right answer, but the speed will be a major issue considering the scale of the problem.
An example of the original research is here
http://www.huffingtonpost.com/2013/04/05/monkeys-stocks-study_n_3021285.html
I am currently working on extending it with more flexibility on portfolio weights & numbers of stock in the index, but it appears my programming & probability theory ability are a limiting factor.......