Split list of strings into multiple sublists if they are the same

Question

I was wondering if it is possible in Python to split a list of strings into multiple sublists, one for each distinct string. For example:

Input:

['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

Output:

['Red','Red']

['Green','Green']

['Yellow','Yellow']

['Blue','Blue']

['Purple']

I need it to be able to do this with different values each time.

I can only think of comparing each string to each other and appending it to different lists, but if there are more than 5 different values then I don't think that would work.

Hope someone can help

Does this answer your question? [How do I find the duplicates in a list and create another list with them?](https://stackoverflow.com/questions/9835762/how-do-i-find-the-duplicates-in-a-list-and-create-another-list-with-them) — Aero Blue, Jun 22 '20 at 20:16

score 1 · Accepted Answer · edited Jun 23 '20 at 07:15


You can use itertools.groupby() to do it easily:

from itertools import groupby    
values = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

result = [list(v) for k,v in groupby(sorted(values))]    
print(result) 
# [['Blue', 'Blue'], ['Green', 'Green'], ['Purple'], ['Red', 'Red'], ['Yellow', 'Yellow']]
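The sorted() call matters here: groupby() only groups runs of consecutive equal items, so on unsorted input the same value can end up in several groups. A minimal sketch of the difference, reusing the list from the question (the variable names are only illustrative):

```python
from itertools import groupby

values = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

# Without sorting, only adjacent duplicates are merged, so 'Red', 'Green'
# and 'Yellow' each show up in two separate groups.
unsorted_groups = [list(group) for _, group in groupby(values)]

# Sorting first places equal strings next to each other,
# which gives exactly one group per distinct value.
sorted_groups = [list(group) for _, group in groupby(sorted(values))]

print(len(unsorted_groups))
print(len(sorted_groups))
```

The price of sorting is that the sublists come out in alphabetical order rather than in the order the values first appear.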


answered Jun 22 '20 at 20:15

azro


Thanks very much for the help – TheAmazingHAzza Jun 22 '20 at 20:51

abc · Answer 2 · 2020-06-22T20:17:52.987

1

You can use a Counter to count the elements and then build the result from the counts. This avoids having to sort the input list.

>>> from collections import Counter
>>> l = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
>>> c = Counter(l)
>>> res = [[k]*v for k,v in c.items()]
>>> res
[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]

edited Jun 22 '20 at 20:17

answered Jun 22 '20 at 20:16

abc


score 1 · Answer 3 · answered Jun 22 '20 at 20:17

Try this, using a Counter object:

from collections import Counter

lst = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
counter = Counter(lst)
[[color] * num for color, num in counter.items()]
=> [['Blue', 'Blue'], ['Purple'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Red', 'Red']]

The answer will be a list of lists, where each color is repeated as many times as it was in the original input list.
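The order of the sublists follows the order in which the Counter yields its keys: insertion order on Python 3.7+, arbitrary on older versions. If a predictable order is wanted regardless of version, one option (a sketch, assuming alphabetical order is acceptable) is to iterate over the sorted keys:

```python
from collections import Counter

lst = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']
counter = Counter(lst)

# sorted(counter) yields the distinct colors alphabetically, so the
# result no longer depends on how the Counter orders its keys.
result = [[color] * counter[color] for color in sorted(counter)]

print(result)
```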

score 1 · Answer 4 · answered Jun 22 '20 at 20:17

You can use Counter() from the collections module:

from collections import Counter

lst = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

c = Counter(lst)

lsts = [[l]*c[l] for l in c]

print(lsts)

Output:

[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]

Abhishek Bhagate · Answer 5 · 2020-06-22T20:29:23.207

Although the other answers are correct, here's another approach that needs no imports and is simple to follow:

lst= ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

dictionary = {}
# Store the frequency of each element that occurs in list
for i in lst:
    if i not in dictionary:
        dictionary[i] = 1
    else:
        dictionary[i] += 1

ans = []
# Build the final answer: one sublist per element, repeated according to its frequency
for k in dictionary:
    ans.append([k] * dictionary[k])
    
print(ans)

Output :

[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]

Or, if you don't need the nested list, you can print each sublist directly, with every element repeated as many times as it occurs:

lst= ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
dictionary = {}

# Same counting as in the previous example, written more idiomatically with dict.get()
for i in lst :
    dictionary[i]=dictionary.get(i,0)+1

for k in dictionary.keys():
    elements = [k]*dictionary[k]
    print(elements)

Output :

['Red', 'Red']
['Green', 'Green']
['Yellow', 'Yellow']
['Blue', 'Blue']
['Purple']

This gives exactly the output asked for in the question. It is a good option if you want to accomplish the task without importing any modules.
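The counting loops above rebuild each sublist by repeating the key. A variation, sketched here with dict.setdefault() and still without imports, collects the items themselves as they are encountered. It assumes Python 3.7+ for the first-seen ordering of the result, and the name groups is only illustrative:

```python
lst = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

groups = {}
for item in lst:
    # setdefault() returns the sublist already stored for this item,
    # or stores a new empty list and returns that one.
    groups.setdefault(item, []).append(item)

# One sublist per distinct value, in order of first appearance.
result = list(groups.values())

print(result)
```

This does the grouping in a single pass and works for any hashable items, not only strings.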

score 0 · Answer 6 · answered Jun 23 '20 at 06:51

As of Python 3.7 Counter inherits the capability of dict to remember insertion order, so I finally have a use for the elements() method of Counter:

from collections import Counter
from itertools import islice

array = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

counter = Counter(array)

elements = counter.elements()

print([[*islice(elements, count)] for count in counter.values()])

OUTPUT

> python3 test.py
[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]
>
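The comprehension works because elements() returns one iterator that yields each key repeated by its count, and every islice() call resumes where the previous one stopped. A small sketch of that behaviour on a shorter, made-up input (Python 3.7+ assumed for the ordering):

```python
from collections import Counter
from itertools import islice

# Illustrative input, not taken from the question.
counter = Counter(['a', 'b', 'a', 'c', 'b', 'a'])

# A single shared iterator over: a, a, a, b, b, c
elements = counter.elements()

first = list(islice(elements, 3))   # consumes the three 'a' items
second = list(islice(elements, 2))  # continues with the two 'b' items
third = list(islice(elements, 1))   # takes the remaining 'c'

print(first, second, third)
```

Because the slices share one iterator, the slice sizes have to be taken in the same order as counter.values(), which is what the original one-liner does.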
