2

I would like to find all subsets of a sorted string, disregarding order and which characters are next to each other. I think the best way for this to be explained is though an example. The results should also be from longest to shortest.

These are the results for bell.

bell
bel
bll
ell
be
bl
el
ll
b
e
l

I have thought of ways to do this, but none for any length of input. Thank you!

jpp
  • 159,742
  • 34
  • 281
  • 339
nedla2004
  • 1,115
  • 3
  • 14
  • 29
  • Just to clarify your question, since "bell" has to "l"s, would you want "bl" to appear twice or should duplicates be removed ? You have duplicates removed but I wanted to confirm it is intentional. – steveb Mar 18 '18 at 01:40
  • 1
    Also where is `ll`? – miradulo Mar 18 '18 at 01:40
  • @steveb This is intentional, duplicated should be removed. – nedla2004 Mar 18 '18 at 01:42
  • @miradulo This was unintentional, I fixed it. – nedla2004 Mar 18 '18 at 01:43
  • @miradulo, I think there's a subtle difference to dup you've marked. I believe user wants original order of string maintained. It's trivially done, but just thought I'd point it out (as in my answer). – jpp Mar 18 '18 at 01:52
  • 1
    @jpp That's fair actually, agreed, changes the question a bit. – miradulo Mar 18 '18 at 01:53

3 Answers3

5

There are generally two ways to approach such things: generate "everything" and weed out duplicates later, or create custom algorithms to avoid generating duplicates to begin with. The former is almost always easier, so that's what I'll show here:

def gensubsets(s):
    import itertools
    for n in reversed(range(1, len(s)+1)):
        seen = set()
        for x in itertools.combinations(s, n):
            if x not in seen:
                seen.add(x)
                yield "".join(x)

for x in gensubsets("bell"):
    print(x)

That prints precisely what you said you wanted, and how it does so should be more-than-less obvious.

Tim Peters
  • 67,464
  • 13
  • 126
  • 132
2

Here is one way using itertools.combinations.

If the order for strings of same length is important, see @TimPeters' answer.

from itertools import combinations

mystr = 'bell'

res = sorted({''.join(sorted(x, key=lambda j: mystr.index(j)))
             for i in range(1, len(mystr)+1) for x in combinations(mystr, i)},
             key=lambda k: -len(k))

# ['bell', 'ell', 'bel', 'bll', 'be', 'll', 'bl', 'el', 'l', 'e', 'b']

Explanation

  • Find all combinations of length in range(1, len(mystr)+1).
  • Sort by original string via key argument of sorted. This step may be omitted if not required.
  • Use set of ''.join on elements for unique strings.
  • Outer sorted call to go from largest to smallest.
jpp
  • 159,742
  • 34
  • 281
  • 339
-1

You can try in one line:

import itertools

data='bell'

print(set(["".join(i) for t in range(len(data)) for i in itertools.combinations(data,r=t) if "".join(i)!='']))

output:

{'bel', 'bll', 'ell', 'el', 'be', 'bl', 'e', 'b', 'l', 'll'}
Aaditya Ura
  • 12,007
  • 7
  • 50
  • 88