4

How to sort an array of strings with anagrams next to each other?

Eg:

input {god, dog, abc, cab, man}
output {abc, cab, dog, god, man}

My approach: Sort the array ( without considering the anagrams case) in O(nlogn). Next, pick up the first string & create a histogram for the string, and compare the histogram with the remaining strings histograms in the array and place the matching strings at appropriate position of the array ..repeat until it reaches array size.. this algo takes worst case of O(n^3) (if we assume that in worst case, each string is also of size n) & extra space for the histogram representation

Histogram approach taken from the ref: finding if two words are anagrams of each other

Can we do better than this?

Community
  • 1
  • 1
Jitendar M
  • 633
  • 1
  • 10
  • 26

9 Answers9

13

You can certainly do better as follows:

  1. Loop through the array of strings
  2. For each string, first sort its characters, using the sorted string as key and original string as value, put into a hash table. you will end up with a hash table that with keys as sorted strings, and values being all anagrams, meanwhile, those values are ordered. You may use map<string, set<string> > to serve for this purpose.
  3. iterate over the hash-table and output all anagrams together for a given key, they should be next to each other

Assume the length of strings are M and size of array is N the time complexity is then: O(NMlogM), M is usually smaller than N in average. So this is much more efficient than what you have said.

taocp
  • 23,276
  • 10
  • 49
  • 62
  • I think compacting the word by counting number of every chars is enough for computing the key of the map. – Mingliang Liu Nov 09 '13 at 07:19
  • @liuml07 Could you please elaborate more? – Hengameh Aug 25 '15 at 09:56
  • 1
    @Hengameh let us do this with an example: aabbcc, you will have a2b2c2 as the key (abcabc has a2b2c2 as well in this case, so they are anagram of each other). – taocp Aug 25 '15 at 16:31
  • 2
    @taocp, please, inserting in map cost logN, so the correct time complexity is O( N (MlogM + logN) ), map is not exactly a hash table, it is implemented as binary tree. your solution is not complete, since iterate through the map at step 3, will not guarantee having word with no anagram in increasing order. Remember that the key of the map is the sorted word not the original – bonpiedlaroute Apr 24 '16 at 09:43
3
#include <vector>
#include <unordered_map>
#include <string>
#include <set>
#include <algorithm>
#include <iostream>

using namespace std;

vector<string> sort_string_anagram(vector<string> array)
{
    unordered_map<string, set<string>> anagram;

    for(string word : array)
    {
      string sorted_word(word);

      sort(sorted_word.begin(),sorted_word.end());

      anagram[sorted_word].insert(word);
    }

    sort(array.begin(), array.end());

    vector<string> result;

    for(string word : array)
    {
        unordered_map<string,set<string>>::iterator iter;

        string sorted_word(word);

        sort(sorted_word.begin(), sorted_word.end());

        if( (iter = anagram.find(sorted_word)) != anagram.end() )
        {
           for(set<string>::iterator it = (iter->second).begin(); it!= (iter->second).end();++it)
           {
              result.push_back(*it);
           }
           anagram.erase(iter);
        }
    }
    return result;
}

@Jitendard, @taocp, a complete solution with time complexity: O( N(MlogM) + NlogN + N(MlogM + A) ). N is array size, M is word size, A is the maximum number of anagram that exists for a word

bonpiedlaroute
  • 173
  • 1
  • 9
2

In python this can be done by:

sorted(word_list, key=lambda word: ''.join(sorted(word.replace(" ", "").lower())))

where the key is the sorted alphabetical order of character. The key will be same for anagrams, thus keeping them together

Alexander O'Mara
  • 58,688
  • 18
  • 163
  • 171
aditya841
  • 21
  • 1
1

@Song Wang : Even I was thinking of doing it that way. But how do you know the order in which to put strings once you take them out of the hashmap ? Suppose you extract
K1 = "abc", V1 = "cab"
K2 = "abc", V2 = "abc"
How would you know which one to put first in the list 1 or 2 ?
Maybe you'll sort them again. But , then that'll be bad for the complexity.

user1925405
  • 332
  • 1
  • 5
  • 13
  • 1
    I should have made this clear, for hashmap, I did not really mean that it is hashmap used in JAVA, since hashmap has no order. Here will need kind of mapping that has order. You can use map> in C++ to serve the purpose of a hashtable. – taocp Mar 20 '13 at 13:19
  • The question does not ask to sort anagrams. It is asked to put them next together. So, there is no need to sort them. Am I right? – Hengameh Aug 26 '15 at 00:27
  • But if we really want them in order, can't we use 'TreeMap' in @taocp's solution, instead of "HashMap"? (obviously, I am talking about Java) – Hengameh Aug 26 '15 at 00:28
0

Why sort in the first place? Cant you just divide the array into subsets based on anagrams. Sort the subsets and finally merge based on the first word in each subset.

smk
  • 5,340
  • 5
  • 27
  • 41
0

putting this in a 'real' java programming context (i.e. we use some existing and basic jdk util classes, i think the following approach may give another interesting aspect to this topic (i.e. "how to sort an array of strings with anagrams next to each other"):

(a) we define a comparator to tell if two strings are anagrams; (b) we use Arrays.sort(array, comparator) to sort the array;

below is the code and result (the idea can be seen in chapter 9, "cracking the coding interview" by Gayle Laakmann, for example)

import java.util.Arrays;
import java.util.Comparator;

public class SolutionForSortArraysByAnagrams {

    public static void main(String[] args){

        String[] strArray = new String[]{"abets","mates","baste","meats", "betas","beast", "steam", "tames", "beats", "teams"}; 

        sortArraysByAnagrams(strArray);

        for(String str : strArray){
            System.out.println(str);
        }

    }

    private static void sortArraysByAnagrams(String[] strArray) {

        Arrays.sort(strArray, new AnagramComparator());

    }

}


class AnagramComparator implements Comparator<String> {

    @Override
    public int compare(String s1, String s2) {

        //check edge conditions and length
        if( s1 == null || s2 == null)
            return -1;
        if( s1.length() <  s2.length())
            return -1;
        else if ( s1.length() >  s2.length())
            return 1;

        //sort s1 and s2 to compare:
        //System.out.println(s1 + " vs  " + s2);        
        return sort(s1).compareTo(sort(s2));

    }

    private String sort(String s1) {
        char[] cArray = s1.toCharArray();
        Arrays.sort(cArray);        
        //System.out.println(" sorted:  " + new String(cArray));
        return new String(cArray);
    }


}

input: {"abets","mates","baste","meats", "betas","beast", "steam", "tames", "beats", "teams"};

output: abets baste betas beast beats mates meats steam tames teams

Pang
  • 9,564
  • 146
  • 81
  • 122
user3044236
  • 241
  • 3
  • 7
0

found the solution from the internet:

public static void sortStringWithAnagrams(String[] stringArray) {
    Arrays.sort(stringArray, new AnagramComparator());
}

public static class AnagramComparator implements Comparator<String> {

    public String getSortedString(String s) {
        char[] content = s.toCharArray();
        Arrays.sort(content);
        return new String(content);
    }

    public int compare(String s1, String s2) {
        return getSortedString(s1).compareTo(getSortedString(s2));
    }

}
shanwu
  • 1,493
  • 6
  • 35
  • 45
0
import java.util.Arrays;
import java.util.Comparator;

/**
 * Sort an array of strings so that all anagrams are next to each other
 * @author asharda
 *
 */
public class Anagram implements Comparator<String> {


  public static String  anagram(String input)
  {
    char []s=input.toCharArray();
    Arrays.sort(s);
    return new String(s);
  }
  public static void main(String[] args) {
    // TODO Auto-generated method stub

    String arr[]= {"abc","god","cab","dog"};
    Arrays.sort(arr, new Anagram());
    for(String s:arr)
    System.out.println(s);

  }
  @Override
  public int compare(String arg0, String arg1) {
    return arg0.compareTo(arg1);
  }


}

//Credit to Cracking Coding Interview by Gayle Laakmann
user7258708
  • 251
  • 2
  • 7
0
 private static String[] getSortedAnagram(String[] array) {
    Map<String, ArrayList<String>> sortedMap = new HashMap<>();
    for (String a : array) {
        String sortedString = a.chars().sorted().
                collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append).toString();
        sortedMap.computeIfAbsent(sortedString, s->new ArrayList<>()).add(a);
    }
   String[] output = new String[array.length];
   List<String> list = sortedMap.values().stream().flatMap(List::stream).collect(Collectors.toList());
   return list.toArray(output);
}