23

I've seen many questions on getting all the possible substrings (i.e., adjacent sets of characters), but none on generating all possible strings including the combinations of its substrings.

For example, let:

x = 'abc'

I would like the output to be something like:

['abc', 'ab', 'ac', 'bc', 'a', 'b', 'c']

The main point is that we can remove multiple characters that are not adjacent in the original string (as well as the adjacent ones).

Here is what I have tried so far:

def return_substrings(input_string):
    length = len(input_string)
    return [input_string[i:j + 1] for i in range(length) for j in range(i, length)]

print(return_substrings('abc'))

However, this only removes sets of adjacent strings from the original string, and will not return the element 'ac' from the example above.

Another example is if we use the string 'abcde', the output list should contain the elements 'ace', 'bd' etc.

martineau
  • 119,623
  • 25
  • 170
  • 301
aL_eX
  • 1,453
  • 2
  • 15
  • 30
  • 2
    Do you _really_ want to do this the hard way? It's very easy to do it with the standard `itertools.combinations` function, and it will run faster, too. – PM 2Ring Jul 26 '18 at 11:57
  • 4
    This is basically a string version of `whats a good way to find power_set`. https://stackoverflow.com/questions/1482308/whats-a-good-way-to-combinate-through-a-set – Marcus.Aurelianus Jul 26 '18 at 12:14
  • Those are combinations, not "iterations". A combination is basically a subset where order does not matter. If order matters, it's a permutation. – Mike Housky Jul 26 '18 at 12:17
  • @MikeHousky Thanks for the correction, I'll change the title. – aL_eX Jul 26 '18 at 12:17
  • 1
    If you ever want to take care of strings like `"AAAA"` and not have unnecessary duplicates, look at my answer. – scharette Jul 26 '18 at 13:01

5 Answers5

27

You can do this easily using itertools.combinations

>>> from itertools import combinations
>>> x = 'abc'
>>> [''.join(l) for i in range(len(x)) for l in combinations(x, i+1)]
['a', 'b', 'c', 'ab', 'ac', 'bc', 'abc']

If you want it in the reversed order, you can make the range function return its sequence in reversed order

>>> [''.join(l) for i in range(len(x),0,-1) for l in combinations(x, i)]
['abc', 'ab', 'ac', 'bc', 'a', 'b', 'c']
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182
Sunitha
  • 11,777
  • 2
  • 20
  • 23
8

This is a fun exercise. I think other answers may use itertools.product or itertools.combinations. But just for fun, you can also do this recursively with something like

def subs(string, ret=['']):
    if len(string) == 0:
        return ret
    head, tail = string[0], string[1:]
    ret = ret + list(map(lambda x: x+head, ret))
    return subs(tail, ret)

subs('abc')
# returns ['', 'a', 'b', 'ab', 'c', 'ac', 'bc', 'abc']
davidlowryduda
  • 2,404
  • 1
  • 25
  • 29
  • Although Python comes with "batteries included", it's certainly good for people to see how to do this task recursively. Now see if you can get the output in the order shown in the question without using sorting. :evil grin: – PM 2Ring Jul 26 '18 at 12:25
  • @PM2Ring Maybe a merge between my answer and his could do it ;) – scharette Jul 26 '18 at 13:19
2

@Sunitha answer provided the right tool to use. I will just go and suggest an improved way while using your return_substrings method. Basically, my solution will take care of duplicates.


I will use "ABCA" in order to prove validity of my solution. Note that it would include a duplicate 'A' in the returned list of the accepted answer.

Python 3.7+ solution,

x= "ABCA"
def return_substrings(x):
    all_combnations = [''.join(l) for i in range(len(x)) for l in combinations(x, i+1)]
    return list(reversed(list(dict.fromkeys(all_combnations))))
    # return list(dict.fromkeys(all_combnations)) for none-reversed ordering

print(return_substrings(x))
>>>>['ABCA', 'BCA', 'ACA', 'ABA', 'ABC', 'CA', 'BA', 'BC', 'AA', 'AC', 'AB', 'C', 'B', 'A']

Python 2.7 solution,

You'll have to use OrderedDict instead of a normal dict. Therefore,

 return list(reversed(list(dict.fromkeys(all_combnations))))

becomes

return list(reversed(list(OrderedDict.fromkeys(all_combnations))))

Order is irrelevant for you ?

You can reduce code complexity if order is not relevant,

x= "ABCA"
def return_substrings(x):
    all_combnations = [''.join(l) for i in range(len(x)) for l in combinations(x, i+1)]
    return list(set(all_combnations))
scharette
  • 9,437
  • 8
  • 33
  • 67
0
def return_substrings(s):
    all_sub = set()
    recent = {s}

    while recent:
        tmp = set()
        for word in recent:
            for i in range(len(word)):
                tmp.add(word[:i] + word[i + 1:])
        all_sub.update(recent)
        recent = tmp

    return all_sub
0

For an overkill / different version of the accepted answer (expressing combinations using https://docs.python.org/3/library/itertools.html#itertools.product ):

["".join(["abc"[y[0]] for y in x if y[1]]) for x in map(enumerate, itertools.product((False, True), repeat=3))]

For a more visual interpretation, consider all substrings as a mapping of all bitstrings of length n.

bessbd
  • 570
  • 4
  • 17