Finding all combinations of multiple variables summing to 1

Question

I am trying to solve the equation

x1 + x2 + x3 + .... + xn = 1

where the values of all xi are restricted to [0, 0.1, 0.2, ..., 0.9, 1].

Currently, I am solving the problem by first generating an n-dimensional array mat, where at each element location the value is the sum of the axis values, which vary in axisValues = 0:0.1:1:

mat(i,j,k,...,q) = axisValues(i) + axisValues(j) + ... + axisValues(q).

Then I search for all entries of the resulting array that are equal to one. The code (shown below for further clarification) is working fine and has been tested for up to 5 dimensions. The problem is, that the run time increases exponentially and I need the script to work for more than a few dimensions.

clear all
dim = 2; % The dimension of the matrix is defined here. The script has been tested for dim ≤ 5
fractiles(:,1) = [0:0.1:1]; % Produces a vector containing the initial axis elements, which will be used to calculate the matrix elements
fractiles = repmat(fractiles,1,dim); % multiplies the vector to supply dim rows with the axis elements 0:0.1:1. These elements will be changed later, but the symmetry will remain the same.
dim_len = repmat(size(fractiles,1),1,size(fractiles,2)); % Here, the length of the dimensions is checked, which his needed to initialize the matrix mat, which will be filled with the axis element sums
mat = zeros(dim_len); % Here the matrix mat is initialized
Sub=cell(1,dim);
mat_size = size(mat);
% The following for loop fills each matrix elements of the dim dimensional matrix mat with the sum of the corresponding dim axis elements.
for ind=1:numel(mat)
    [Sub{:}]=ind2sub(mat_size,ind);
    SubMatrix=cell2mat(Sub);
    sum_indices = 0;
    for j = 1:dim
        sum_indices = sum_indices+fractiles(SubMatrix(j),j);
    end
    mat(ind) = sum_indices;
end
Ind_ones = find(mat==1); % Finally, the matrix elements equal to one are found.

I have the feeling that the following idea using the symmetry of the problem might help to significantly reduce calculation time:

For a 2D matrix, all entries that fulfill the condition above lie on the diagonal from mat(1,11) to mat(11,1), i.e. from the maximal value of x1 to the maximal value of x2.

For a 3D Matrix, all entries fulfill the condition that lie on a diagonal plane through mat(1,1,11), mat(1,11,1), mat(11,1,1), i.e. from the maximal value of x1 and x2 to the maximal value of x3.

The same is true for higher dimensions: All matrix elements of interest lie on an n-1 dimensional hyper-plane fixed on the highest axis value in each dimension.

The question is: Is there a way to directly determine the indices of the elements on these n-1 dimensional hyper-plane? If so, the whole problem could be solved in one step and without needing to calculate all entries of the n-dimensional matrix and then searching for the entries of interest.

If you try to generate this humongous matrix it will naturally increase exponentially. You could speed this up a bit, but if it is worth it really depends on if there is no way around this matrix. What is the application? I still don't quite get it. What problem do you want solved? What problem does the code you posted solve? — knedlsepp, Feb 13 '15 at 19:37
I imagine this should have something to do with the convex hull of the vectors `e1` to `en`... What would be your next step if you had all those linear indices? — knedlsepp, Feb 13 '15 at 19:45
These linear indices represent all possible mixtures of n components (e.g. hydrogen + oxygen + nitrogen for n=3). After this script has run through, the indeces will be fed into another script, which approximates each possible composition of the component data sets to a reference (e.g. water) and finds the best approximation (in this case 0.1119 H + 0.8881 O + 0 Ti). This script at hand is supposed to find all possible combinations of n components and pass it to the follow up script for the optimization process. — Andreas Schönfeld, Feb 13 '15 at 20:54
(above: nigtrogen = titanium) The problem is mostly memory and run time. Of course, run time will remain exponentially, but I like to reduce the problem to the matrix elements that are actually of interest and avoid calculating unnecessary values. Hence the (hyper-) plane, which includes all values of interest. This (hyper-) plane would indeed be the convex hull, between the maxima of e1 to en, I believe. — Andreas Schönfeld, Feb 13 '15 at 21:00
So basically you don't really care about hyperplanes, but want to compute all possible combinations of percentages of mixtures of `n` different components, where each component can have one of: 0,10,20,...,90,100%? — knedlsepp, Feb 14 '15 at 01:21
Exactly! :-) The idea with the planes only stems from the geometry of the problem. In 2D all possible combinations lie on a diagonal of a matrix, in 3D on a plane and so on. — Andreas Schönfeld, Feb 14 '15 at 08:20

score 3 · Accepted Answer · edited May 23 '17 at 12:06

Math:

Instead of going the hypercube-way, we solve the equation

x(1) + x(2) + ... + x(n) = 1

where each x(i) can vary in [0, 1/k, 2/k, ... (k-1)/k, 1] instead. In your case k will be 10, as this will then result in the percentages [0, 10, 20, ... 90, 100]. Multiplied by k this corresponds to the diophantine equation

x(1) + x(2) + ... + x(n) = k,

where all the x(i) vary in [0, 1, 2, ... k-1, k].

We can build a bijection between this and the combinatorial concept of combinations with repetition.

The wikipedia article even implicitly mentions the underlying bijection by the statement:

The number of multisubsets of size k is then the number of nonnegative integer solutions of the Diophantine equation x1 + x2 + ... + xn = k.

For a smaller example, say we are going with k=3 and percentages in [0, 33, 66, 100] instead. Given all k-multicombinations of the set {1,2,3}:

RepCombs = 
     1     1     1
     1     1     2
     1     1     3
     1     2     2
     1     2     3
     1     3     3
     2     2     2
     2     2     3
     2     3     3
     3     3     3

Then we map these to your percentages using the following rule: For every row i, if the entry is j, then add 1/3 of the percentage to the corresponding Matrix entry M(i,j). The first row will correspond to [1/3 + 1/3 + 1/3, 0, 0] = [1,0,0]. The overall matrix generated by this process will look like this:

M =
    1.0000         0         0
    0.6667    0.3333         0
    0.6667         0    0.3333
    0.3333    0.6667         0
    0.3333    0.3333    0.3333
    0.3333         0    0.6667
         0    1.0000         0
         0    0.6667    0.3333
         0    0.3333    0.6667
         0         0    1.0000

Code:

And now for the MATLAB code that generates all this: I use the function nmultichoosek from this answer and accumarray to accomplish our goal:

function M = possibleMixturesOfNSubstances(N, percentageSteps)
RepCombs = nmultichoosek(1:N, percentageSteps);
numCombs = size(RepCombs,1);
M = accumarray([repmat((1:numCombs).', percentageSteps, 1), RepCombs(:)], 1/percentageSteps, [numCombs, N]);

If you want percentages in [0, 10, ... 90, 100] and have 4 substances, call this function using possibleMixturesOfNSubstances(4,10)

Finding all combinations of multiple variables summing to 1

1 Answers1

Math:

Code:

Linked