4

I had a combinatorics assignment that involved getting every word with length less than or equal to 6 from a specific combination of strings.

In this case, it was S = { 'a', 'ab', 'ba' }. The professor just started listing them off, but I thought it would be easier solved with a program. The only problem is that I can't get a good algorithm that would actually compute every possible option.

If anyone could help, I'd appreciate it. I usually program in Python but really I just need help with the algorithm.

Ren
  • 1,111
  • 5
  • 15
  • 24

4 Answers4

10

Assuming you DO mean combinations (no repetitions, order does not matter):

import itertools

S = [ 'a', 'ab', 'ba' ]

for i in range(len(S)+1):
  for c in itertools.combinations(S, i):
    cc = ''.join(c)
    if len(cc) <= 6:
      print c

emits all the possibilities:

()
('a',)
('ab',)
('ba',)
('a', 'ab')
('a', 'ba')
('ab', 'ba')
('a', 'ab', 'ba')

If you mean something different than "combinations", it's just an issue of using the right iterator or generator in the for (e.g., itertools.permutations, or something else of your own devising).

Edit: if for example you mean "repetitions and order ARE important",

def reps(seq, n):
  return itertools.product(*[seq]*n)

for i in range(7):
  for c in reps(S, i):
    cc = ''.join(c)
    if len(cc) <= 6:
      print c

will give you the required 85 lines of output.

Edit again: I had the wrong loop limit (and therefore wrong output length) -- tx to the commenter who pointed that out. Also, this approach can produce a string > 1 times, if the ''.join's of different tuples are considered equivalent; e.g., it produces ('a', 'ba') as distinct from ('ab', 'a') although their ''.join is the same (same "word" from different so-called "combinations", I guess -- terminology in use not being entirely clear).

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • That doesn't create 6 letter combinations like 'a' 'ab' 'ba' 'a' – Unknown Sep 22 '09 at 02:24
  • That's NOT a combination, since 'a' is repeated. The term **combinations** has an extremely clear and unambiguous meaning in maths (and that's exactly what itertools.combination implements) and in a "combinatorics assignment" it beggars belief that it could be used sloppily, imprecisely, or flat out WRONG. – Alex Martelli Sep 22 '09 at 03:03
  • The question is strings from a combination; not combinations from a set. – Anthony Towns Sep 22 '09 at 03:12
  • "combination" is NOT an ambiguous term in combinatorics. If the OP wants repetitions, order-dependence, or other aspects that are NOT part of the precise, clear definition of "combination", I expect _the OP_ to edit (a) the title "get every combination", (b) the question, and (c) ideally (to give me a heads-up;-) a comment to this answer;-). The latter's clearly optional, the first two, not really;-). – Alex Martelli Sep 22 '09 at 03:16
  • 1
    The question is, why would you need to specify a length limit of 6? I think its reasonable to think that he wants something other than the mathematical "combination" because in a **word**, the order matters whereas in a **combination** it does not matter. – Unknown Sep 22 '09 at 06:17
8

You can iteratively generate all the strings made from one part, two parts, three parts and so on, until all the strings generated in a step are longer than six characters. Further steps would only generate even longer strings, so all possible short strings have already been generated. If you collect these short strings in each step you end up with a set of all possible generated short strings.

In Python:

S = set(['a', 'ab', 'ba'])

collect = set()
step = set([''])
while step:
   step = set(a+b for a in step for b in S if len(a+b) <= 6)
   collect |= step

print sorted(collect)
sth
  • 222,467
  • 53
  • 283
  • 367
4
def combos(S,n):
    if (n <= 0): return
    for s in S:
        if len(s) <= n: yield s
        for t in combos(S,n-len(s)): yield s+t

for x in combos(["a","ab","ba"],6): print x

Prints output:

a
aa
aaa
aaaa
aaaaa
aaaaaa
aaaaab
aaaaba
aaaab
aaaaba
aaaba
aaabaa
aaab
aaaba
aaabaa
aaabab
aaabba
aaba
aabaa
aabaaa
aabaab
aababa
aab
aaba
aabaa
aabaaa
aabaab
aababa
aabab
aababa
aabba
aabbaa
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
ab
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
abab
ababa
ababaa
ababab
ababba
abba
abbaa
abbaaa
abbaab
abbaba
ba
baa
baaa
baaaa
baaaaa
baaaab
baaaba
baaab
baaaba
baaba
baabaa
baab
baaba
baabaa
baabab
baabba
baba
babaa
babaaa
babaab
bababa
I. J. Kennedy
  • 24,725
  • 16
  • 62
  • 87
1

Doing it recursively is one way:

cache = {}
def words_of_length(s, n=6):
    # memoise results
    if n in cache: return cache[n]

    # base cases
    if n < 0: return []
    if n == 0: return [""]

    # inductive case
    res = set()
    for x in s:
        res |= set( (x+y for y in words_of_length(s, n-len(x))) )

    # sort and memoise result
    cache[n] = sorted(res)

    # sort results
    return cache[n]

def words_no_longer_than(s, n=6):
    return sum( [words_of_length(s, i) for i in range(n+1)], [] )
Anthony Towns
  • 2,864
  • 1
  • 19
  • 23