I was wondering whether it is possible in Python to split a list of strings into multiple sublists, one for each distinct string. For example:

Input:

['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

Output:

['Red','Red']

['Green','Green']

['Yellow','Yellow']

['Blue','Blue']

['Purple']

I need it to be able to do this with different values each time.

I can only think of comparing each string to every other string and appending it to a different list, but I don't think that would work if there are more than the 5 different values.

Hope someone can help

TheAmazingHAzza
  • Does this answer your question? [How do I find the duplicates in a list and create another list with them?](https://stackoverflow.com/questions/9835762/how-do-i-find-the-duplicates-in-a-list-and-create-another-list-with-them) – Aero Blue Jun 22 '20 at 20:16

6 Answers

You can use itertools.groupby() to do this easily:

from itertools import groupby    
values = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

result = [list(v) for k,v in groupby(sorted(values))]    
print(result) 
# [['Blue', 'Blue'], ['Green', 'Green'], ['Purple'], ['Red', 'Red'], ['Yellow', 'Yellow']]
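Note that groupby() only groups consecutive equal items, which is why the list is sorted first. A minimal sketch of the difference (the names unsorted_groups and sorted_groups are just for illustration):

```python
from itertools import groupby

values = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

# Without sorting, only adjacent duplicates end up in the same group.
unsorted_groups = [list(group) for _, group in groupby(values)]
print(unsorted_groups)
# [['Red'], ['Green'], ['Yellow'], ['Blue', 'Blue'], ['Green'], ['Red'], ['Yellow'], ['Purple']]

# Sorting first puts equal strings next to each other, so each value forms one group.
sorted_groups = [list(group) for _, group in groupby(sorted(values))]
print(sorted_groups)
# [['Blue', 'Blue'], ['Green', 'Green'], ['Purple'], ['Red', 'Red'], ['Yellow', 'Yellow']]
```

The trade-off is that the groups come out in sorted order rather than in the order the values first appear.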
azro

You can use a Counter to group elements and then build the output result. This would avoid the need of sorting the input list.

>>> from collections import Counter
>>> l = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
>>> c = Counter(l)
>>> res = [[k]*v for k,v in c.items()]
>>> res
[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]
abc

Try this, using a Counter object:

from collections import Counter

lst = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
counter = Counter(lst)
result = [[color] * num for color, num in counter.items()]
print(result)
# On Python 3.7+ (older versions may order the groups differently):
# [['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]

The answer will be a list of lists, where each color is repeated as many times as it was in the original input list.
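Since the question mentions different values each time, the same idea could be wrapped in a small helper. This is only a sketch, and the name split_into_groups is made up for illustration:

```python
from collections import Counter

def split_into_groups(items):
    """Return one sublist per distinct value, repeated as often as it occurs."""
    return [[value] * count for value, count in Counter(items).items()]

print(split_into_groups(['a', 'b', 'a', 'c', 'c', 'c']))
# On Python 3.7+: [['a', 'a'], ['b'], ['c', 'c', 'c']]

print(split_into_groups([]))
# []
```

Nothing here depends on how many distinct values there are, so it is not limited to five colors.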

Óscar López

You can use Counter() from the collections module:

from collections import Counter

lst = ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

c = Counter(lst)

lsts = [[l]*c[l] for l in c]

print(lsts)

Output:

[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]
Red

Although the other answers are correct, here's another approach that needs no imports and is written in a simple, easy-to-follow way:

lst= ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']

dictionary = {}
# Store the frequency of each element that occurs in the list
for i in lst:
    if i not in dictionary:
        dictionary[i] = 1
    else:
        dictionary[i] += 1

ans=[]
# Generate the final answer by making list with each element occurring according to their frequency
for k in dictionary.keys():
    tmp = [k]*dictionary[k]
    ans.append(tmp)
    
print(ans)

Output :

[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]

Or, if you don't want to build a 2-D list, you can print one list per element directly, with each element repeated as many times as it occurs:

lst= ['Red','Green','Yellow','Blue','Blue','Green','Red','Yellow','Purple']
dictionary = {}

# Same as in last example... just a more pythonic way of doing it
for i in lst :
    dictionary[i]=dictionary.get(i,0)+1

for k in dictionary.keys():
    elements = [k]*dictionary[k]
    print(elements)
    

Output :

['Red', 'Red']
['Green', 'Green']
['Yellow', 'Yellow']
['Blue', 'Blue']
['Purple']

This gives exactly the output asked for in the question. It is a reasonable choice if you want to accomplish the task without importing anything.
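A variation on the same no-import idea, offered only as a sketch: instead of counting and then rebuilding each list, dict.setdefault() can collect the original items directly. The names groups and result are illustrative.

```python
lst = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

groups = {}
for item in lst:
    # Create an empty list the first time a value is seen, then append to it.
    groups.setdefault(item, []).append(item)

result = list(groups.values())
print(result)
# On Python 3.7+: [['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]
```

This keeps the actual list items rather than copies of the key, which may matter if the items are later grouped by something other than their own value.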

Abhishek Bhagate

As of Python 3.7, Counter inherits dict's ability to remember insertion order, so I finally have a use for the elements() method of Counter:

from collections import Counter
from itertools import islice

array = ['Red', 'Green', 'Yellow', 'Blue', 'Blue', 'Green', 'Red', 'Yellow', 'Purple']

counter = Counter(array)

elements = counter.elements()

print([[*islice(elements, count)] for count in counter.values()])

OUTPUT

> python3 test.py
[['Red', 'Red'], ['Green', 'Green'], ['Yellow', 'Yellow'], ['Blue', 'Blue'], ['Purple']]
>
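To make the islice() step above easier to follow, here is a small sketch on a shorter, made-up list. elements() yields each key repeated by its count, and because it returns a single iterator, every islice() call continues where the previous one stopped:

```python
from collections import Counter
from itertools import islice

counter = Counter(['Red', 'Green', 'Red', 'Blue'])

# elements() repeats each key as many times as its count.
print(list(counter.elements()))
# On Python 3.7+: ['Red', 'Red', 'Green', 'Blue']

# A shared iterator is consumed piece by piece by successive islice() calls.
elements = counter.elements()
first = list(islice(elements, 2))   # takes the two 'Red' items
second = list(islice(elements, 1))  # continues with 'Green'
print(first, second)
# ['Red', 'Red'] ['Green']
```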
cdlane