3

I work with vectors of natural numbers of length N and sum-of-entries S, for instance, with (N,S)=(4,7) one example vector E=[1,2,1,3] where all entries in the vector are assumed > 0.

I want to list all vectors with the same configuration (N,S)=(4,7), but rotations should be filtered out.

Question: what is the best algorithm?
(my practical implementation is in Pari/GP which provides a nested for-loop using a vector of bounds for lower and upper index value)

I've tried first a "brute force" solution, in that I generate a list using the nested for-loop, but storing the vector concatenated twofold concat(E,E) say in the list EE[], and then, for each new vector E=[e_1,e_2,e_3,e_4] checking whether this vector occurs in the concatenated vectors in the already checked list EE[], and if not then append it as new valid entry
EE[k]=E||E = [e_1,e_2,e_3,e_4,e_1,e_2,e_3,e_4]. The comparision here is like string comparision - a match is always found if it begins at positions 1 or up to N.

This method works, but looks to me a bit like brute force and due to its combinatorical structure explodes with increasing N and S.

Does a better method exist?

Note: target language is Pari/GP
Note: a pseudoalgorithm would suffice - but perhaps the tools in Pari/GP allow some more compact solutions/notation.


Example, (N,S)=(4,7)
The following is a very simple example.

Assume by a nested loop I get the vectors E in the following way:

[1,1,1,4] --- first vector: store as [1,1,1,4;1,1,1,4] in EE
[1,1,2,3] --- 2nd vector: not in EE so far, append as [1,1,2,3;1,1,2,3] to EE
[1,2,1,3] ...
[2,1,1,3] ...
[1,1,3,2] --- ignore!! This is a rotation of an earlier entry (is in EE)
[1,2,2,2] 
...

This builds successively the list of vectors EE:

[1,1,1,4;1,1,1,4]
[1,1,2,3;1,1,2,3]
[1,2,1,3;1,2,1,3]
[2,1,1,3;2,1,1,3]
[1,2,2,2;1,2,2,2]

and this list, with the first N columns only, is the list of vectors I want to work with later.

Additional sanity check: for (N,S)=(4,16) I get for the unfiltered list length_unfiltered = 455 , and length_filtered=116 after rotations are deleted.

  • Only idea would be to keep a separate vector of the rotations (your `seen` vector), where you add a rotation rotated with the lowest value first. Then for each new set encountered, rotate it with lowest first and compare against the `seen` vector of rotations. You would have to check whether that is beneficial over the brute-force method. It should be if you have a large number of sets containing as the `seen` vector size would always be less than the number of sets eliminating your worst case brute-force iteration to over each to find a rotation at the end. – David C. Rankin Oct 17 '21 at 08:24
  • @David - yes, I thought about this sorting; if I could realize a good sorting-criteria the access into the `seen`-list ***EE*** could be dramatically reduced. At the moment such a sorting seems to be only partially meaningful - if I sort in the current vector ***E*** the significance of the order of the entries is ignored... Perhaps something like a code, generated by the entries, being checked against the codes in the ***EE*** list, could be sortable and accessible by a binary search... – Gottfried Helms Oct 17 '21 at 09:03
  • Well, you can get the minimum value and its index and then rotate the elements left that number of times. That would put the lowest value element in the first position while preserving the order of elements. If you stored your collection of `seen` in a normal `type**` (where you allocate pointers, and allocate storage for each set), your "insert" of a new set in sort order could be done by just shuffling pointers down by 1 index at the point of insertion. If you allocated for a "pointer-to-array" you would need to realloc and memcpy to insert. – David C. Rankin Oct 17 '21 at 09:36

2 Answers2

5

There is a well known algorithm for generating strings with rotations deleted (usually called necklaces in combinatorics). This algorithm works in constant amortized time meaning that rotated equivalents can be removed in constant time.

Frank Rusky calls this algorithm the FKM algorithm. It is described in https://people.engr.ncsu.edu/savage/AVAILABLE_FOR_MAILING/necklace_fkm.pdf. (Also several other papers and also chapter 7.2 of Rusky's book 'Combinatorial Generation').

The implementation is straight forward (but it would take me a couple of hours to code in PARI, so for now I am leaving). The additional requirement of a given sum can be incorporated into the algorithm without difficulty.

A less efficient alternative would be to generate all the (N, S) words and then filter out those that are not necklaces. For the generation part, there are built in PARI functions forpart and forperm. The filtering can be done using a simplified adaption of the FKM algorithm. Since only a test is required, backtracking and recursion can be avoided in this test.

Some PARI code follows: The following PARI can be used to generate all vectors of length n and sum s. This method avoids recursion and calls act for each solution.

forsumset(n, s, act)={
  my(w=vector(n), p=1);
  while(p > 0,
    if(s>n-p, w[p]++; s--; if(p==n-1, w[n]=s;act(w), p++), s+=w[p]; w[p]=0; p--);
   );
}

Example use:

local(total=0); forsumset(4, 16, w->total++); total

Gives 455 as expected.

The following function is a test for the necklace property using the FKM algortithm:

isnecklace(w)={
  my(d=1); 
  for(p=2, #w, my(e=w[p]-w[p-d]); if(e<0, return(0)); if(e>0, d=p));
  #w%d==0
}

Example use:

local(total=0); forsumset(4,16,w->if(isnecklace(w), total++)); total

Gives 116 as expected.

Update
The following combines the two ideas into a single algorithm that computes each new term in amortized constant time. Since the number of results is exponentially dependent on s, the overall time is still exponential. If it is only required to count the total number of entries then there are faster methods.

forneck(n, s, act)={
  my(w=vector(n), ds=vector(n), p=1, d=1);
  while(p > 0,
    if(w[p], 
      w[p]++; s--; d=p,
      my(e=if(p==1,1,w[p-d])); w[p]=e; s-=e);
    if(s>=n-p, 
       if(p==n, if(s, w[n]+=s; s=0; d=p); if(p%d==0, act(w)), p++; ds[p]=d), 
       d=ds[p]; s+=w[p]; w[p]=0; p--);
  );
}

Example use:

local(total=0); forneck(4,16,w->total++); total

Gives 116 as expected.

Andrew
  • 873
  • 4
  • 10
  • Thanks for the link to the article - at a first glance it might answer my question; unfortunately I must make myself familiar with the specific terms & jargon there, so this might need a certain time. I'll return to your answer soon! – Gottfried Helms Oct 17 '21 at 17:25
  • Wow, this is too much... Using ***(N,S)=(5,20)*** my routine needed 32 sec, and your not a single one... Uff, I'll have to analyze your solution indepth to understand how this can happen... I ran your routine in GP 13.0 in a dos-window, while I still use GP 2.4 to have it working with my selfmade GUI-interface - but that cannot have such an impact... – Gottfried Helms Oct 17 '21 at 19:24
  • Andrew - you might look at the self-answer, where I've applied your procedure to generate a triangle of results for small consecutive ***(N,S)*** – Gottfried Helms Oct 17 '21 at 20:02
  • While I celebrated the timing of your algorithm when counting the number of filtered tuples (necklaces), the motivation of my question is of course the generating of the list of the necklaces. The test of the number was originally just to have a preliminary sanity check of any proposed implementation. – Gottfried Helms Oct 18 '21 at 07:33
0

This is only a comment to Andrew's answer, not an "answer", but is too long for the comment-box.

Andrew's routine allowed to quickly produce a statistic for the filtered versions for small (N,S)=(2,2) to (16,16) It gives a combinatorical triangle of numbers which I've not seen so far. (Possibly the first column should be filled with ones)
Perhaps worth an entry in OEIS... - well, it is in OEIS :-)

I got:

       1  2   3    4    5    6    7    8    9   10   11   12  13 14 15 16 -> N
 --+----------------------------------------------------------------------
 1 |   0  .   .    .    .    .    .    .    .    .    .    .   .  .  .  .
 2 |   0  1   .    .    .    .    .    .    .    .    .    .   .  .  .  .
 3 |   0  1   1    .    .    .    .    .    .    .    .    .   .  .  .  .
 4 |   0  2   1    1    .    .    .    .    .    .    .    .   .  .  .  .
 5 |   0  2   2    1    1    .    .    .    .    .    .    .   .  .  .  .
 6 |   0  3   4    3    1    1    .    .    .    .    .    .   .  .  .  .
 7 |   0  3   5    5    3    1    1    .    .    .    .    .   .  .  .  .
 8 |   0  4   7   10    7    4    1    1    .    .    .    .   .  .  .  .
 9 |   0  4  10   14   14   10    4    1    1    .    .    .   .  .  .  .
10 |   0  5  12   22   26   22   12    5    1    1    .    .   .  .  .  .
11 |   0  5  15   30   42   42   30   15    5    1    1    .   .  .  .  .
12 |   0  6  19   43   66   80   66   43   19    6    1    1   .  .  .  .
13 |   0  6  22   55   99  132  132   99   55   22    6    1   1  .  .  .
14 |   0  7  26   73  143  217  246  217  143   73   26    7   1  1  .  .
15 |   0  7  31   91  201  335  429  429  335  201   91   31   7  1  1  .
16 |   0  8  35  116  273  504  715  810  715  504  273  116  35  8  1  1
 |
 |
 v
 S