Generate unique combinations of size N from an array of repeating characters

Question

I have an array containing repeating characters. Take ['A','B','C','C'] as example. I am writing a C++ program to output all the combinations of size N.
When N = 2, program should output AB, AC, BC and CC.
When N = 3, program should output ABC, ACC, BCC.

My approach was to treat each character of the array as unique. I generated the all combinations and saved it to a vector. I then iterated through the vector to remove duplicate combinations.
For N = 2, assuming all characters are unique gives 4C2 = 6 possibilities : AB, AC, AC, BC, BC, CC.
1 AC and 1 BC are removed because they occur twice instead of once.

Is there a more efficient way to do this?

Note:

The characters in the array are not necessarily in ascending order.
In my initial approach, using find() for vector was insufficient to locate all duplicates. For array ['T','O','M','O','R','R','O','W'], the duplicates are permuted. For N = 4, two of the duplicates are TOMO and TMOO.

See [all-combinations-of-k-elements-out-of-n](https://stackoverflow.com/questions/5095407/all-combinations-of-k-elements-out-of-n). — Jarod42, Aug 26 '21 at 08:12
The programs in given link doesn't seem to deal with duplicate elements in the array/vector. The problem of duplicated combinations will still occur. — zxc, Aug 26 '21 at 08:50
Related but python-centric question: [python: Permutations with unique values](https://stackoverflow.com/questions/6284396/permutations-with-unique-values). Also [Python combinations without repetitions](https://stackoverflow.com/a/46623112/3080723) — Stef, Aug 26 '21 at 09:13
See also [Source code for `sympy.utilities.iterables.multiset_permutations`](https://github.com/sympy/sympy/blob/46e00feeef5204d896a2fbec65390bd4145c3902/sympy/utilities/iterables.py#L1375-L1421). — Stef, Aug 26 '21 at 09:20
@Stef Last link leads to permutations, but that file contains `multiset_combinations` earlier — MBo, Aug 26 '21 at 09:28
No duplicates with code from above links: [Demo](https://coliru.stacked-crooked.com/a/1326ef3a40959666) — Jarod42, Aug 26 '21 at 09:51
In python modules, in addition to `sympy`'s `multiset_permutations`, there is also `more_itertools.distinct_permutations`: [documentation](https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.distinct_permutations), [source code](https://more-itertools.readthedocs.io/en/stable/_modules/more_itertools/more.html#distinct_permutations) — Stef, Aug 26 '21 at 10:25
Other resources: a library specifically for multiset permutations, for several languages including C++: [multipermute](https://github.com/ekg/multipermute). This library appears to be based on a research paper and explains the algorithm clearly. — Stef, Aug 26 '21 at 10:41
Similar/duplicate questions: [How to generate all the permutations of a multiset](https://stackoverflow.com/questions/19676109/how-to-generate-all-the-permutations-of-a-multiset); [C++: Efficiently process each unique permutation of a vector when number of unique elements in vector is much smaller than vector size](https://stackoverflow.com/questions/57339783/efficiently-process-each-unique-permutation-of-a-vector-when-number-of-unique-el); — Stef, Aug 26 '21 at 10:43
Similar/duplicate questions: [How can I improve my algorithm for generating combinations of a multiset](https://stackoverflow.com/questions/16514559/how-can-i-improve-my-algorithm-for-generating-combinations-of-a-multiset); — Stef, Aug 26 '21 at 10:44

score 1 · Accepted Answer · answered Aug 26 '21 at 09:19

You can sort source array and collect similar items in groups. In the code below I added index of the next group beginning to provide proper jump to the next group if next similar item is not used.

Python code and output:

a = ['T','O','M','O','R','R','O','W']
a.sort()
print(a)
n = len(a)
b = [[a[-1], n]]
next = n
for i in range(n-2,-1, -1):
    if a[i] != a[i+1]:
        next = i + 1
    b.insert(0, [a[i], next])
print(b)

def combine(lst, idx, n, k, res):
    if len(res) == k:
        print(res)
        return
    if idx >= n:
        return

    #use current item
    combine(lst, idx + 1, n, k, res + lst[idx][0])
    #omit current item and go to the next group   
    combine(lst, lst[idx][1], n, k, res)

combine(b, 0, n, 3, "")

['M', 'O', 'O', 'O', 'R', 'R', 'T', 'W']
[['M', 1], ['O', 4], ['O', 4], ['O', 4], ['R', 6], ['R', 6], ['T', 7], ['W', 8]]

MOO     MOR    MOT    MOW    MRR    MRT    MRW    MTW
OOO    OOR    OOT    OOW    ORR    ORT    ORW    OTW
RRT    RRW    RTW

Generate unique combinations of size N from an array of repeating characters

1 Answers1

Linked