Create lists of anagrams from a list of words

Question

I want to find create lists of anagrams from a list of words. Should I use another loop in my code or recursion?

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

new_list = [some_list[0]]
i = 0
while i+1 < len(some_list):
    if (''.join(sorted(some_list[0]))) == (''.join(sorted(some_list[i+1]))):
        new_list.append(some_list[i+1])
        i = i+1
    else:
        i = i+1

print(new_list)

My output is ['bad', 'dab', 'bda', 'dba']. But I also want more lists of other anagrams from some_list.

I want the output to be: - ['app', 'ppa'] - ['bad', 'dab', 'bda', 'dba'] - ['sad', 'das']

Remember to accept an answer! http://i.stack.imgur.com/uqJeW.png — Bhargav Rao, May 15 '15 at 09:22

Stefan Pochmann · Answer 1 · 2015-05-14T22:44:23.420

6

I recommend you write Python, not Java or whatever other language you're emulating there. Here's your core code in Python, with normal looping and without all the unnecessary stuff:

new_list = [some_list[0]]
for word in some_list[1:]:
    if sorted(some_list[0]) == sorted(word):
        new_list.append(word)

I don't see use for recursion, but yes, you could wrap an outer loop around this to find the other anagram groups.

Though this is how I'd do it, using the helpful itertools.groupby:

for _, group in groupby(sorted(some_list, key=sorted), sorted):
    group = list(group)
    if len(group) > 1:
        print(group)

That prints:

['bad', 'dab', 'bda', 'dba']
['sad', 'das']
['app', 'ppa']

Alternative solution for the changed question with sorting the groups:

groups = (list(group) for _, group in groupby(sorted(some_list, key=sorted), sorted))
print([group for group in sorted(groups) if len(group) > 1])

Output:

[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

edited May 14 '15 at 22:44

answered May 14 '15 at 20:23

Stefan Pochmann

27,593
8
44
107

When it prints, it does not print the lists in an alphabetical order, i.e; ['app', 'ppa'] ['bad', 'dab', 'bda', 'dba'] ['sad', 'das'] – pk. May 14 '15 at 21:40
Yes. So? Your *"I want the output to be:"* part in your question doesn't have it sorted. – Stefan Pochmann May 14 '15 at 21:43
In general, not a good thing. You're invalidating the valid already posted answers that way. And are you sure you have the final version now? You have 'dab' before 'bda' in the second group. – Stefan Pochmann May 14 '15 at 21:59
Yes. The lists only need to be printed in alphabetical order. – pk. May 14 '15 at 22:06
Ok, I've added a solution for that. – Stefan Pochmann May 14 '15 at 22:45
I am writing the output to a file, but it still is not writing the results in alphabetical order. ['abolitionism', 'mobilisation'] ['accouterment', 'accoutrement'] ['alterability', 'bilaterality'] ['amphitheater', 'amphitheatre'] ['behaviourism', 'misbehaviour'] ['conservation', 'conversation'] ['commissioned', 'decommission'] ['discreetness', 'discreteness'] ['impressively', 'permissively'] ['microcephaly', 'pyrochemical'] ['paradisaical', 'paradisiacal'] ['restrengthen', 'strengthener'] ['unimpressive', 'unpermissive'] ['inactivation', 'vaticination'] – pk. May 15 '15 at 08:07
@azhar You must have used some wrong code. The correct one (the two-liner on the bottom of my answer) can't produce that order. I suspect you replaced my second line with your own and forgot the `sorted(...)`. – Stefan Pochmann May 15 '15 at 13:01

Mazdak · Answer 2 · 2015-05-14T20:39:43.017

Your problem is that you loop one time over your list ,since you need to loop based on all of the words.

But i suggest another way for this task,you can use itertools.groupby and sorted function using operator.itemgetter :

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

from operator import itemgetter
from itertools import groupby 
s=sorted([(i,''.join(sorted(j))) for i,j in enumerate(some_list)],key=itemgetter(1))
inds= [zip(*g)[0] for _,g in groupby(s,itemgetter(1))]
print [itemgetter(*i)(some_list) for i in inds]

Result :

[('bad', 'dab', 'bda', 'dba'), 'mad', ('sad', 'das'), ('app', 'ppa'), 'pge']

All that i have done here is creating a list of sorted words with those index using sorted and enumerate :

sorted([(i,''.join(sorted(j))) for i,j in enumerate(some_list)],key=itemgetter(1))
[(0, 'abd'), (4, 'abd'), (6, 'abd'), (9, 'abd'), (3, 'adm'), (2, 'ads'), (8, 'ads'), (1, 'app'), (7, 'app'), (5, 'egp')]

then we need to grouping this pairs based on the second element and get the first element (indices) so we will have the following list of tuples :

[(0, 4, 6, 9), (3,), (2, 8), (1, 7), (5,)]

that each tuple is contain the indices of the words that those sorted representations are same.

and at last all you need is picking up the elements of the main list based on the preceding indices :

[itemgetter(*i)(some_list) for i in inds]
[('bad', 'dab', 'bda', 'dba'), 'mad', ('sad', 'das'), ('app', 'ppa'), 'pge']

@StefanPochmann No its not, as you can understand from explanation the logic is simple! — Mazdak, May 14 '15 at 20:36
I mean complicated in the sense that you're doing a lot of stuff that you don't need to. Going to indexes and back for no reason. — Stefan Pochmann, May 14 '15 at 20:40
@StefanPochmann Yeah i dont need to sort and then grub the indices i could simply used `sorted` as the key,i see you code the part `key=sorted` is what i missed and you got my vote for that! — Mazdak, May 14 '15 at 20:44

score 0 · Answer 3 · answered May 14 '15 at 20:01

0

1) Create a function anagrams(word) that is returning a list of anagrams for a single word as your code does.
2) map the function over your list of words.

answered May 14 '15 at 20:01

Eugene Sh.

17,802
8
40
61

What do you mean by 'map'? – pk. May 14 '15 at 20:04

dermen · Answer 4 · 2015-05-14T20:35:37.570

Here is a solution:

from itertools import groupby
some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']
some_list_ordered = map( lambda x : "".join( sorted( x) ), some_list )
some_lists = sorted(zip( some_list_ordered, some_list ) )
anagrams = filter( lambda x : len( x ) > 1, [ zip(*v)[1]  for k,v in groupby( some_lists, lambda x : x[0] ) ] )    

for a in anagrams:
    print a

#('bad', 'bda', 'dab', 'dba')
#('das', 'sad')
#('app', 'ppa')

xnx · Answer 5 · 2015-05-14T20:57:55.310

0

The natural way to do this if you can afford the memory overhead of an additional dictionary seems to me to be:

words = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

anagrams = {}
for word in words:
    sword = ''.join(sorted(word))
    try:
        anagrams[sword].append(word)
    except KeyError:
        anagrams[sword] = [word]

anagrams_list = [v for v in anagrams.values() if len(v) > 1]
print anagrams_list

Output:

[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

EDIT: As mentioned in the comment below you can replace the try...except block with the dict method setdefault if the syntax doesn't bother you:

words = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']

anagrams = {}
for word in words:
    sword = ''.join(sorted(word))
    anagrams.setdefault(sword, []).append(word)

anagrams_list = [v for v in anagrams.values() if len(v) > 1]
print anagrams_list

edited May 14 '15 at 20:57

answered May 14 '15 at 20:30

xnx

24,509
11
70
109

Do you know `list.setdefault`? That could replace your whole try/except part. – Stefan Pochmann May 14 '15 at 20:45
@StefanPochmann I find the syntax for this a bit upsetting because even though it's called `setdefault` you're actually _getting_ something, but I've given an alternative. – xnx May 14 '15 at 21:00
Yeah, originally I also found it a bit confusing, but I got used to it :-). Though I still mostly use defaultdict instead, unless I have a good reason not to. – Stefan Pochmann May 14 '15 at 21:08

Padraic Cunningham · Answer 6 · 2015-05-14T21:17:04.497

You can group the words in a dict, using the sorted word as the key, filtering out the words that values that don't have at least two elements, using an OrderedDict to keep order:

some_list = ['bad', 'app', 'sad', 'mad', 'dab','pge', 'bda', 'ppa', 'das', 'dba']


from collections import OrderedDict

od = OrderedDict()
for ele in some_list:
    srt = "".join(sorted(ele))
    od.setdefault(srt,[]).append(ele)

print(filter(lambda x: len(x) > 1, od.values()))


[['bad', 'dab', 'bda', 'dba'], ['app', 'ppa'], ['sad', 'das']]

Or using a loop and appending to a list, using a temp list to gather common words:

new_list = []
from collections import OrderedDict
for ele in OrderedDict.fromkeys("".join(sorted(ele)) for ele in some_list):
    temp = []
    for s in some_list:
        if ele == ''.join(sorted(s)):
            temp.append(s)
    if len(temp) > 1:
        new_list.append(temp)

If order does not matter a defaultdict would be more efficient:

from collections import defaultdict

d = defaultdict(list)
for ele in some_list:
    d[''.join(sorted(ele))].append(ele)

print(filter(lambda x: len(x) > 1, d.values()))
[['app', 'ppa'], ['bad', 'dab', 'bda', 'dba'], ['sad', 'das']]

score -1 · Answer 7 · answered Nov 04 '20 at 16:40

import java.util.*;

public class GroupAnagramsTogether {
    public static void main(String[] args)
 {
        String [] input = new String [] {"bad", "app", "sad", "mad", "dab","pge", "bda", "ppa", "das", "dba"};
        System.out.println("Input: " + Arrays.toString(input));

        List<List<String>> result = groupAnagram(input);
        System.out.println(result);
    }

    private static List<List<String>> groupAnagram(String[] input) 
{
        List<List<String>> list = new ArrayList<List<String>>();
        
        HashMap<String, List<String>> mp = new HashMap<String, List<String>>();
        
        for(String s : input)
        {
            char[] ch = s.toCharArray();
            Arrays.sort(ch);
            
            String key = new String(ch);
            if(mp.containsKey(key))
            {
                mp.get(key).add(s);
            }
            else
            {
                List<String> strList = new ArrayList<String>();
                strList.add(s);
                mp.put(key,strList);
            }
        }
        list.addAll(mp.values());
        return list;
    }

}

The question is using python, while this is java. – Boyan Hristov Nov 04 '20 at 17:00 — Boyan Hristov, Nov 04 '20 at 17:00

Create lists of anagrams from a list of words

7 Answers7

Linked