1

Let d and p be two integers. I need to generate a large matrix A of integers, having d columns and N=nchoosek(d+p,p) rows. Note that nchoosek(d+p,p) increases quickly with d and p, so it's very important that I can generate A quickly. The rows of A are all the multi-indices with components from 0 to p, such that the sum of the components is less than or equal to p. This means that, if d=3 and p=3, then A is an [N=nchoosek(3+3,3)=20x3] matrix with the following structure:

A=[0 0 0;
   1 0 0;
   0 1 0;
   0 0 1;
   2 0 0;
   1 1 0;
   1 0 1;
   0 2 0;
   0 1 1;
   0 0 2;
   3 0 0;
   2 1 0;
   2 0 1;
   1 2 0;
   1 1 1;       
   1 0 2;   
   0 3 0;      
   0 2 1;
   0 1 2;
   0 0 3]       

It is not indispensable to follow exactly the row ordering I used, although it would make my life easier (for those interested, it's called graded lexicographical ordering and it's described here: http://en.wikipedia.org/wiki/Monomial_order).

In case you are curious about the origin of this weird matrix, let me know!

Daniel
  • 36,610
  • 3
  • 36
  • 69
DeltaIV
  • 4,773
  • 12
  • 39
  • 86
  • Maybe [generate all possibilities (Cartesian product)](http://stackoverflow.com/q/21895335/2586922) and then keep only those that meet the sum criterion? – Luis Mendo Feb 02 '15 at 11:16
  • As you say: The size of the matrix will grow pretty fast. Are you sure you need all the values simultaneously? – knedlsepp Feb 02 '15 at 12:09
  • What are the actual sizes that you need to work with? – Dennis Jaheruddin Feb 02 '15 at 13:44
  • possible duplicate of [combinations totaling to sum](http://stackoverflow.com/questions/21500539/combinations-totaling-to-sum) – knedlsepp Feb 02 '15 at 14:04
  • The possible duplicate provides a very clever approach that is way more efficient than all current solutions. It is not exactly the same, but you can get your matrix by concatenating multiple outputs of it. – knedlsepp Feb 02 '15 at 14:27
  • @LuisMendo the number of rows for all possibilities grow exponentially with d, while nchoosek(d+p,p) grows polynomially, so creating the former and pruning is definitely not acceptable. – DeltaIV Feb 02 '15 at 15:27
  • @knedlsepp, if I understand correctly your suggestion, I should use Mark Dickinson's answer in the other thread, for each i between 1 and p, and then concatenate the p matrices by row (together with the initial [0 0 0...0] row). On paper, this looks great because it doesn't involve generating even bigger matrices and then pruning. I'll try it out in MATLAB and let you know how it works. – DeltaIV Feb 02 '15 at 15:37
  • @knedlsepp, Mark Dickinson's answer gives the rows in the opposite order to what I need. As I don't understand the code enough to change the ordering, I used `flipud` to solve, but if there's a way to get directly the ordering I need, without unnecessary flipping, please let me know. Also, how should I proceed now? I cannot choose Mark's as the answer to my question, since it's in another thread. – DeltaIV Feb 02 '15 at 15:45
  • @DeltaIV: I flagged this post as a duplicate, so if the community considers it to be a duplicate this will be closed anyway. Otherwise I did reformulate my answer using *Mark*'s post, so you could also accept my answer if it solved your problem. – knedlsepp Feb 02 '15 at 22:53
  • @knedlsepp it did solve my problem indeed! Thanks to you and all the others who helped. – DeltaIV Feb 03 '15 at 10:44

3 Answers3

2

Solution using nchoosek and diff

The following solution is based on this clever answer by Mark Dickinson.

function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
    degrees = (0:maxDegree).';
    return;
end
degrees = cell(maxDegree+1,1);
k = numVars;
for n = 0:maxDegree
    dividers = flipud(nchoosek(1:(n+k-1), k-1));
    degrees{n+1} = [dividers(:,1), diff(dividers,1,2), (n+k)-dividers(:,end)]-1;
end
degrees = cell2mat(degrees);

You can get your matrix by calling monomialDegrees(d,p).

Solution using nchoosek and accumarray/histc

This approach is based on the following idea: There is a bijection between all k-multicombinations and the matrix we are looking for. The multicombinations give the positions, where the entries should be added. For example the multicombination [1,1,1,1,3] will be mapped to [4,0,1], as there are four 1s, and one 3. This can be either converted using accumarray or histc. Here is the accumarray-approach:

function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
    degrees = (0:maxDegree).';
    return;
end
degrees = cell(maxDegree+1,1);
degrees{1} = zeros(1,numVars);
for n = 1:maxDegree
    pos = nmultichoosek(1:numVars, n);
    degrees{n+1} = accumarray([reshape((1:size(pos,1)).'*ones(1,n),[],1),pos(:)],1);
end
degrees = cell2mat(degrees);

And here the alternative using histc:

function degrees = monomialDegrees(numVars, maxDegree)
if numVars==1
    degrees = (0:maxDegree).';
    return;
end
degrees = cell(maxDegree+1,1);
degrees(1:2) = {zeros(1,numVars); eye(numVars);};
for n = 2:maxDegree
    pos = nmultichoosek(1:numVars, n);
    degrees{n+1} = histc(pos.',1:numVars).';
end
degrees = cell2mat(degrees(1:maxDegree+1));

Both use the following function to generate multicombinations:

function combs = nmultichoosek(values, k)
if numel(values)==1
    n = values;
    combs = nchoosek(n+k-1,k);
else
    n = numel(values);
    combs = bsxfun(@minus, nchoosek(1:n+k-1,k), 0:k-1);
    combs = reshape(values(combs),[],k);
end

Benchmarking:

Benchmarking the above codes yields that the diff-solution is faster if your numVars is low and maxDegree high. If numVars is higher than maxDegree, then the histc solution will be faster.

Old approach:

This is an alternative to Dennis' approach of dec2base, which has a limit on the maximum base. It is still a lot slower than the above solutions.

function degrees = monomialDegrees(numVars, maxDegree)
Cs = cell(1,numVars);
[Cs{:}] = ndgrid(0:maxDegree);
degrees = reshape(cat(maxDegree+1, Cs{:}),(maxDegree+1)^numVars,[]);
degrees = degrees(sum(degrees,2)<=maxDegree,:);
Community
  • 1
  • 1
knedlsepp
  • 6,065
  • 3
  • 20
  • 41
  • Wow I didn't see your successive edits, where you compared the `diff` solution to the other ones. It's nice to know that there are other solutions which may be faster in some cases, but the `diff` solution is already so much faster than any pruning-based solution, that I'm more than satisfied with using it in all cases. – DeltaIV Mar 09 '15 at 18:33
1

I would solve it this way:

ncols=d;
colsum=p;
base=(0:colsum)';
v=@(dm)permute(base,[dm:-1:1]);

M=bsxfun(@plus,base,v(2));
for idx=3:ncols
    M=bsxfun(@plus,M,v(idx));
end
L=M<=colsum;
A=cell(1,ncols);
[A{:}]=ind2sub(size(L),find(L));
a=cell2mat(A);
%subtract 1 because 1 based indexing but base starts at 0
a=a-1+min(base);

It builds up a p-dimensional matrix which contains the sum. The efficiency of this code depends on sum(L(:))/numel(L), this quotient tells you how much of the created matrix is actually used for solutions. If this gets low for your intput, there probably exits a better solution.

Daniel
  • 36,610
  • 3
  • 36
  • 69
1

Here is a very easy way to do it:

L = dec2base(0:4^3-1,4);
idx=sum(num2str(L)-'0',2)<=3;
L(idx,:)

I think the first line can be very time efficient for creating a list of candidates, but unfortunately I don't know how to reduce the list in an efficient way after that.

So the second line works, but could use improvement performance wise.

Dennis Jaheruddin
  • 21,208
  • 8
  • 66
  • 122