I have a list called animals,

animals = ["B_FOX", "A_CAT", "A_DOG", "A_MOUSE", 
         "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"]

and would like the following outputs:

 A = ["A_CAT", "A_DOG", "A_MOUSE"]
 B = ["B_DOG", "B_MOUSE", "B_FOX"]
 C = ["C_DUCK", "C_FOX", "C_BIRD"]

So far I can only extract either the animal names or the letters on their own, like this:

  [species.split("_",1)[1] for species in animals]
  ['FOX', 'CAT', 'DOG', 'MOUSE', 'DOG', 'MOUSE', 'DUCK', 'FOX', 'BIRD']

  [letters.split("_",1)[0] for letters in animals]
  ['B', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C']

Not sure if I've worded the question correctly. Any help in solving this tricky problem would be greatly appreciated!

MichaelRSF

3 Answers

You could build a separate list for each initial letter, but that gets unwieldy when there are many letters. You can use a defaultdict instead:

from collections import defaultdict

d = defaultdict(list)
animals = ["B_FOX", "A_CAT", "A_DOG", "A_MOUSE", 
     "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"]

for animal in animals:
    d[animal[0]].append(animal)
print(dict(d))

Output:

{'A': ['A_CAT', 'A_DOG', 'A_MOUSE'], 'C': ['C_DUCK', 'C_FOX', 'C_BIRD'], 'B': ['B_FOX', 'B_DOG', 'B_MOUSE']}
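
If the prefixes could be longer than one character, the same idea can key on everything before the first underscore instead of `animal[0]`. This is only a sketch of that variant; the helper name `group_by_prefix` and its `sep` parameter are made up for illustration. An entry without an underscore would simply end up in a group named after itself.

```python
from collections import defaultdict


def group_by_prefix(items, sep="_"):
    """Group strings by the text before the first separator (hypothetical helper)."""
    groups = defaultdict(list)
    for item in items:
        # split at most once, so "A_SEA_LION" still yields the prefix "A"
        prefix = item.split(sep, 1)[0]
        groups[prefix].append(item)
    return dict(groups)


animals = ["B_FOX", "A_CAT", "A_DOG", "A_MOUSE",
           "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"]

groups = group_by_prefix(animals)
print(groups["B"])  # ['B_FOX', 'B_DOG', 'B_MOUSE']
```

Items keep their original order within each group, which is why `B_FOX` comes first here.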
Ajax1234

Try itertools.groupby, keyed on the first letter. The list has to be sorted first, because groupby only groups consecutive items:

import operator as op
import itertools as it


animals = [
    "B_FOX", "A_CAT", "A_DOG", "A_MOUSE", 
    "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"
]

A, B, C = [list(g) for _, g in it.groupby(sorted(animals), key=op.itemgetter(0))]

Outputs:

A
# ['A_CAT', 'A_DOG', 'A_MOUSE']

B
# ['B_DOG', 'B_FOX', 'B_MOUSE']

C
# ['C_BIRD', 'C_DUCK', 'C_FOX']

Here is a post on how groupby works.
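
To make the sorting requirement concrete, here is a small sketch (not part of the original answer) that runs `groupby` on the unsorted list and then on a sorted copy. Collecting the result in a dict, rather than unpacking into `A, B, C`, also avoids having to know the number of groups in advance. The variable names are illustrative.

```python
from itertools import groupby
from operator import itemgetter

animals = ["B_FOX", "A_CAT", "A_DOG", "A_MOUSE",
           "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"]

first = itemgetter(0)

# Without sorting, "B" shows up twice because B_FOX is not next to the other B items.
unsorted_keys = [key for key, _ in groupby(animals, key=first)]
print(unsorted_keys)  # ['B', 'A', 'B', 'C']

# Sorting by the same key first yields exactly one group per letter.
by_letter = {key: list(group)
             for key, group in groupby(sorted(animals, key=first), key=first)}
print(by_letter["B"])  # ['B_FOX', 'B_DOG', 'B_MOUSE']
```

Because `sorted` is stable and only the first letter is used as the sort key here, items keep their original relative order inside each group, unlike the plain `sorted(animals)` call above.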

pylang
  • That's very neat, I wasn't aware of the operator module til now. Thanks pylang! – MichaelRSF Aug 29 '17 at 02:55
  • The other key option is a regular function, such as `def f(x): return x[0]` or `lambda x: x[0]` – pylang Aug 29 '17 at 02:57
  • This would be last on readability and ease of understanding but it definitely wins for the shortest solution ;-) – Hubert Grzeskowiak Aug 29 '17 at 03:06
  • @Hubert Grzeskowiak, I might agree on understanding, which is how `groupby` is. It is one of the tougher and trickier itertools to grok. But, I think it reads fine. I'll add a link for clarity. Thanks. – pylang Aug 29 '17 at 03:10
  • Thanks for your solution as well, Hubert! Much appreciated for all the help :) – MichaelRSF Aug 29 '17 at 03:11

You can unpack both the prefix and the name from a single call to split:

groups = {}
for animal in animals:
    prefix, name = animal.split("_", 1)  # split once, so names containing "_" still unpack
    if prefix not in groups:
        groups[prefix] = []
    groups[prefix].append(animal)

print(groups)

Output (key order may differ depending on the Python version):

{'A': ['A_CAT', 'A_DOG', 'A_MOUSE'], 'C': ['C_DUCK', 'C_FOX', 'C_BIRD'], 'B': ['B_FOX', 'B_DOG', 'B_MOUSE']}

If required, you can later still unpack the dict into single variables:

A = groups["A"]
B = groups["B"]
C = groups["C"]

If you want to get rid of the prefixes:

groups = {}
for animal in animals:
    prefix, name = animal.split("_", 1)  # split once, so names containing "_" still unpack
    if prefix not in groups:
        groups[prefix] = []
    groups[prefix].append(name)
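
The `if prefix not in groups` check can also be folded into `dict.setdefault`, which returns the existing list for a key or stores and returns a new empty one. A minimal sketch of the prefix-stripping loop written that way, using the list from the question:

```python
animals = ["B_FOX", "A_CAT", "A_DOG", "A_MOUSE",
           "B_DOG", "B_MOUSE", "C_DUCK", "C_FOX", "C_BIRD"]

groups = {}
for animal in animals:
    # split once so that a name containing "_" would stay in one piece
    prefix, name = animal.split("_", 1)
    # setdefault creates the list on first sight of a prefix, then reuses it
    groups.setdefault(prefix, []).append(name)

print(groups["A"])  # ['CAT', 'DOG', 'MOUSE']
```

This behaves the same as the explicit check; which one reads better is largely a matter of taste.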
Hubert Grzeskowiak