27

I am looking for a method to find if two strings are anagrams of one another.

Ex: string1 - abcde
string2 - abced
Ans = true
Ex: string1 - abcde
string2 - abcfed
Ans = false

the solution i came up with so for is to sort both the strings and compare each character from both strings till the end of either strings.It would be O(logn).I am looking for some other efficient method which doesn't change the 2 strings being compared

Preet Sangha
  • 64,563
  • 18
  • 145
  • 216
user514946
  • 1,165
  • 3
  • 12
  • 12
  • 3
    posssible duplicate of [What is an easy way to tell if a list of words are anagrams of each other?](http://stackoverflow.com/questions/522112/what-is-an-easy-way-to-tell-if-a-list-of-words-are-anagrams-of-each-other) – kennytm Nov 21 '10 at 07:53
  • Note that using a several-unique sort is O(n) for a fixed character set with O(1) (not depending on character set) stack memory use. Of course, this will still modify the two strings. – Kaganar Jul 02 '13 at 21:43

23 Answers23

55

Count the frequency of each character in the two strings. Check if the two histograms match. O(n) time, O(1) space (assuming ASCII) (Of course it is still O(1) space for Unicode but the table will become very large).

kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
  • 17
    +1 simple yet effective. Space can easily be halved by incrementing for one and decrementing for the other string and checking for 0. For unicode the Hashmap approach seems appropriate. – Eiko Nov 21 '10 at 07:52
  • 8
    Having an initial string length check would help eliminate a lot of candidate strings. – Srujan Kumar Gulla Dec 07 '12 at 23:03
  • as @Eiko mentioned I felt its right, but remember even If you think hashmap initial get created with some default value like 16 in JAVA as soon as table get increased it does rehashing which is overhead. Now for avoiding this you can create hashmap with initial capacity which is like creating array of that capacity, since underneath hashmap is array. Even setting initial capacity I would feel direct indexing based on character is better than calculating hash while putting char everytime in hashmap ....Let me know if I am wrong. – MrSham May 27 '13 at 16:22
  • 2
    Hi can you explain a bit more about "counting the frequency" and "histogram" in terms of code? – nyus2006 Mar 25 '14 at 19:56
  • @S.H. I added the code to check the anagrams by 'counting the frequency' below. – Dany Apr 29 '14 at 23:01
  • Wouldn't time be O(2n) as the hashtable has unavoidingly be traversed twice? First time to populate it, second time to verify all positions are zero. @Eiko: I am not sure about that what you state halves the hashtable size, it will store the same amount of integers anyway, don't you think so? – Francisco Carriedo Scher Dec 24 '15 at 15:35
  • @FranciscoCarriedoScher Yes, but O(2n) = O(n). – kennytm Dec 24 '15 at 16:47
  • @FranciscoCarriedoScher My point was that you don't need one histogram/table/hashtable for each string, but only one in total. – Eiko Dec 25 '15 at 17:14
  • @kennytm why is space O(1), not O(n)? – aerin Apr 13 '17 at 16:47
  • 1
    @ToussaintLouverture Because there are only 95 ASCII characters. – kennytm Apr 13 '17 at 16:48
33

Get table of prime numbers, enough to map each prime to every character. So start from 1, going through line, multiply the number by the prime representing current character. Number you'll get is only depend on characters in string but not on their order, and every unique set of characters correspond to unique number, as any number may be factored in only one way. So you can just compare two numbers to say if a strings are anagrams of each other.

Unfortunately you have to use multiple precision (arbitrary-precision) integer arithmetic to do this, or you will get overflow or rounding exceptions when using this method.
For this you may use libraries like BigInteger, GMP, MPIR or IntX.

Pseudocode:

prime[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101}

primehash(string)
    Y = 1;
    foreach character in string
        Y = Y * prime[character-'a']

    return Y

isanagram(str1, str2)
    return primehash(str1)==primehash(str2)
silkfire
  • 24,585
  • 15
  • 82
  • 105
Vovanium
  • 3,798
  • 17
  • 23
  • I don't think this will work. For instance 2 + 5 = 7 so if a = 2, b = 3, c = 5 and d = 7 if one string had a and c and the other had d then there would be a collision in your "sum". Perhaps this would hold true if you first checked that the length of each string was identical? – runamok May 16 '13 at 18:13
  • 3
    Primes multiplied, not added. primehash("ac") = 2*5 = 10, primehash("d") = 7, so they're not equal. There will be NO collisions because of one-to-one relation between number and its prime factors. – Vovanium Jun 01 '13 at 12:47
  • 3
    Lovely - Fundamental Theorem of Arithmetic! https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic – StackG Jul 29 '15 at 12:48
  • Great algorithm, this helped me a lot! – Tudor Oct 28 '15 at 21:07
  • 4
    The only issue here is that you might endup with a huge integer. Maybe bigger than the max int of the system. E.g. primehash("zzzzzzzzzz") is 101^10 that is greater than 2^64! – viebel Nov 03 '15 at 19:33
  • 1
    This solution really provides the time O(n) and there is no need to create an additional datastructure, as the most voted solution requires. – Francisco Carriedo Scher Dec 24 '15 at 15:42
24
  1. Create a Hashmap where key - letter and value - frequencey of letter,
  2. for first string populate the hashmap (O(n))
  3. for second string decrement count and remove element from hashmap O(n)
  4. if hashmap is empty, the string is anagram otherwise not.
Eternal Noob
  • 2,717
  • 5
  • 27
  • 41
  • nice idea to increment in the first run and decrement in the second. However, a hashmap is not needed here as the keys are known in advance and can be used as array indices. This will result in a faster implementation. – Philipp Nov 21 '10 at 07:50
  • 3
    @Philipp That might be a very large array if we are dealing with a unicode string. – user229044 Nov 21 '10 at 07:58
  • 1
    Isn't this rather wasteful if you're generally comparing small strings? If the strings are on average small (like most words are) then simply turning the words into a character arrays, sorting the arrays, then comparing them for equality will likely be faster than the overhead of creating a hashmap, doing the increments & decrements & then examining the hashmap for being empty. – Adam Parkin May 24 '13 at 05:01
  • 1
    @AdamParkin it's only "wasteful" memory-wise. Nowadays, memory is very cheap. The most important part (to most people) is runtime. Your way would take `O(n log n)` time due to sorting, the provided answer takes just `O(n)`. Besides creating two character arrays as you say will also take `O(n)` space. – ugo Jul 13 '14 at 17:07
  • Yes, in terms of ```O()``` notation, I will grant you it's more efficient. However, that's discarding the constant factors, and oftentimes with algorithms that have better big-O runtimes have poor performance when n is small. All I was alluding to is that in this case n is probably small, so the overhead of allocating expensive collections like ```HashMap```'s might be more expensive than sorting a small char array. – Adam Parkin Jul 14 '14 at 17:52
8

The steps are:

  1. check the length of of both the words/strings if they are equal then only proceed to check for anagram else do nothing
  2. sort both the words/strings and then compare

JAVA CODE TO THE SAME:

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package anagram;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;

/**
 *
 * @author Sunshine
 */
public class Anagram {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException {
        // TODO code application logic here
        System.out.println("Enter the first string");
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        String s1 = br.readLine().toLowerCase();
        System.out.println("Enter the Second string");
        BufferedReader br2 = new BufferedReader(new InputStreamReader(System.in));
        String s2 = br2.readLine().toLowerCase();
        char c1[] = null;
        char c2[] = null;
        if (s1.length() == s2.length()) {


            c1 = s1.toCharArray();
            c2 = s2.toCharArray();

            Arrays.sort(c1);
            Arrays.sort(c2);

            if (Arrays.equals(c1, c2)) {
                System.out.println("Both strings are equal and hence they have anagram");
            } else {
                System.out.println("Sorry No anagram in the strings entred");
            }

        } else {
            System.out.println("Sorry the string do not have anagram");
        }
    }
}
Bart
  • 19,692
  • 7
  • 68
  • 77
fiddle
  • 1,095
  • 5
  • 18
  • 33
3

C#

public static bool AreAnagrams(string s1, string s2)
{
  if (s1 == null) throw new ArgumentNullException("s1");
  if (s2 == null) throw new ArgumentNullException("s2");

  var chars = new Dictionary<char, int>();
  foreach (char c in s1)
  {
      if (!chars.ContainsKey(c))
          chars[c] = 0;
      chars[c]++;
  }
  foreach (char c in s2)
  {
      if (!chars.ContainsKey(c))
          return false;
      chars[c]--;
  }

  return chars.Values.All(i => i == 0);
}

Some tests:

[TestMethod]
public void TestAnagrams()
{
  Assert.IsTrue(StringUtil.AreAnagrams("anagramm", "nagaramm"));
  Assert.IsTrue(StringUtil.AreAnagrams("anzagramm", "nagarzamm"));
  Assert.IsTrue(StringUtil.AreAnagrams("anz121agramm", "nag12arz1amm"));
  Assert.IsFalse(StringUtil.AreAnagrams("anagram", "nagaramm"));
  Assert.IsFalse(StringUtil.AreAnagrams("nzagramm", "nagarzamm"));
  Assert.IsFalse(StringUtil.AreAnagrams("anzagramm", "nag12arz1amm"));
}
Ohad Schneider
  • 36,600
  • 15
  • 168
  • 198
2

Code to find whether two words are anagrams:

Logic explained already in few answers and few asking for the code. This solution produce the result in O(n) time.

This approach counts the no of occurrences of each character and store it in the respective ASCII location for each string. And then compare the two array counts. If it is not equal the given strings are not anagrams.

public boolean isAnagram(String str1, String str2)
{
    //To get the no of occurrences of each character and store it in their ASCII location
    int[] strCountArr1=getASCIICountArr(str1);
    int[] strCountArr2=getASCIICountArr(str2);

    //To Test whether the two arrays have the same count of characters. Array size 256 since ASCII 256 unique values
    for(int i=0;i<256;i++)
    {
        if(strCountArr1[i]!=strCountArr2[i])
            return false;
    }
    return true;
}

public int[] getASCIICountArr(String str)
{
    char c;
    //Array size 256 for ASCII
    int[] strCountArr=new int[256];
    for(int i=0;i<str.length();i++)
    {
        c=str.charAt(i); 
        c=Character.toUpperCase(c);// If both the cases are considered to be the same
        strCountArr[(int)c]++; //To increment the count in the character's ASCII location
    }
    return strCountArr;
}
Dany
  • 2,692
  • 7
  • 44
  • 67
  • OH!!! This is what "histogram" means in this particular case. Thank you for that. But one question tho, this solution is O(N) space complexity how come the top comment says it's O(1)??? – nyus2006 Apr 30 '14 at 23:18
  • @S.H.- Array size is 256 for all strings. If you give one million characters in a string, still it will store in a array of size 256 which is a constant. So the complexity is O(1) space. – Dany May 01 '14 at 00:25
  • yes you are correct, O(256) is asymptotically O(1). Thank you! – nyus2006 May 02 '14 at 01:37
2

Using an ASCII hash-map that allows O(1) look-up for each char.

The java example listed above is converting to lower-case that seems incomplete. I have an example in C that simply initializes a hash-map array for ASCII values to '-1'

If string2 is different in length than string 1, no anagrams

Else, we update the appropriate hash-map values to 0 for each char in string1 and string2

Then for each char in string1, we update the count in hash-map. Similarily, we decrement the value of the count for each char in string2.

The result should have values set to 0 for each char if they are anagrams. if not, some positive value set by string1 remains

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAYMAX 128

#define True        1
#define False       0

int isAnagram(const char *string1, 
            const char *string2) {

    int str1len = strlen(string1);
    int str2len = strlen(string2);

    if (str1len != str2len) /* Simple string length test */
        return False;

    int * ascii_hashtbl = (int * ) malloc((sizeof(int) * ARRAYMAX));
    if (ascii_hashtbl == NULL) {
        fprintf(stderr, "Memory allocation failed\n");
        return -1;
    }
    memset((void *)ascii_hashtbl, -1, sizeof(int) * ARRAYMAX);
    int index = 0;
    while (index < str1len) { /* Populate hash_table for each ASCII value 
                                in string1*/
        ascii_hashtbl[(int)string1[index]] = 0;
        ascii_hashtbl[(int)string2[index]] = 0;
        index++;
    }
    index = index - 1;
    while (index >= 0) {
        ascii_hashtbl[(int)string1[index]]++; /* Increment something */
        ascii_hashtbl[(int)string2[index]]--; /* Decrement something */
        index--;
    }
    /* Use hash_table to compare string2 */
    index = 0;
    while (index < str1len) {
        if (ascii_hashtbl[(int)string1[index]] != 0) {
            /* some char is missing in string2 from string1 */
            free(ascii_hashtbl);
            ascii_hashtbl = NULL;
            return False;
        }
        index++;
    }
    free(ascii_hashtbl);
    ascii_hashtbl = NULL;
    return True;
}

int main () {
    char array1[ARRAYMAX], array2[ARRAYMAX];
    int flag;

    printf("Enter the string\n");
    fgets(array1, ARRAYMAX, stdin);
    printf("Enter another string\n");
    fgets(array2, ARRAYMAX, stdin);

    array1[strcspn(array1, "\r\n")] = 0;
    array2[strcspn(array2, "\r\n")] = 0;
    flag = isAnagram(array1, array2);
    if (flag == 1)
        printf("%s and %s are anagrams.\n", array1, array2);
    else if (flag == 0)
        printf("%s and %s are not anagrams.\n", array1, array2);

    return 0;
}
Rock
  • 177
  • 3
  • 11
2

let's take a question: Given two strings s and t, write a function to determine if t is an anagram of s.

For example, s = "anagram", t = "nagaram", return true. s = "rat", t = "car", return false.

Method 1(Using HashMap ):

public class Method1 {

    public static void main(String[] args) {
        String a = "protijayi";
        String b = "jayiproti";
        System.out.println(isAnagram(a, b ));// output => true

    }

    private static boolean isAnagram(String a, String b) {
        Map<Character ,Integer> map = new HashMap<>();
        for( char c : a.toCharArray()) {
            map.put(c,    map.getOrDefault(c, 0 ) + 1 );
        }
        for(char c : b.toCharArray()) {
            int count = map.getOrDefault(c, 0);
            if(count  == 0 ) {return false ; }
            else {map.put(c, count - 1 ) ; }
        }
        
        return true;
    }

}

Method 2 :

public class Method2 {
public static void main(String[] args) {
    String a = "protijayi";
    String b = "jayiproti";

    
    System.out.println(isAnagram(a, b));// output=> true
}

private static boolean isAnagram(String a, String b) {
   
    
    int[] alphabet = new int[26];
    for(int i = 0 ; i < a.length() ;i++) {
         alphabet[a.charAt(i) - 'a']++ ;
    }
    for (int i = 0; i < b.length(); i++) {
         alphabet[b.charAt(i) - 'a']-- ;
    }
    
    for(  int w :  alphabet ) {
         if(w != 0 ) {return false;}
    }
    return true;
    
}
}

Method 3 :

public class Method3 {
public static void main(String[] args) {
    String a = "protijayi";
    String b = "jayiproti";
    
    
    System.out.println(isAnagram(a, b ));// output => true
}

private static boolean isAnagram(String a, String b) {
    char[] ca = a.toCharArray() ;
    char[] cb = b.toCharArray();
    Arrays.sort(   ca     );
    
    Arrays.sort(   cb        );
    return Arrays.equals(ca , cb );
}
}

Method 4 :

public class AnagramsOrNot {
    public static void main(String[] args) {
        String a = "Protijayi";
        String b = "jayiProti";
        isAnagram(a, b);
    }

    private static void isAnagram(String a, String b) {
        Map<Integer, Integer> map = new LinkedHashMap<>();

        a.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) + 1));
        System.out.println(map);
        b.codePoints().forEach(code -> map.put(code, map.getOrDefault(code, 0) - 1));
        System.out.println(map);
        if (map.values().contains(0)) {
            System.out.println("Anagrams");
        } else {
            System.out.println("Not Anagrams");
        }
    }
}

In Python:

def areAnagram(a, b):
    if len(a) != len(b): return False
    count1 = [0] * 256
    count2 = [0] * 256
    for i in a:count1[ord(i)] += 1
    for i in b:count2[ord(i)] += 1

    for i in range(256):
        if(count1[i] != count2[i]):return False    

    return True


str1 = "Giniiii"
str2 = "Protijayi"
print(areAnagram(str1, str2))

Let's take another famous Interview Question: Group the Anagrams from a given String:

public class GroupAnagrams {
    public static void main(String[] args) {
        String a = "Gini Gina Protijayi iGin aGin jayiProti Soudipta";
        Map<String, List<String>> map = Arrays.stream(a.split(" ")).collect(Collectors.groupingBy(GroupAnagrams::sortedString));
        System.out.println("MAP => " + map);
        map.forEach((k,v) -> System.out.println(k +" and the anagrams are =>" + v ));
        /*
         Look at the Map output:
        MAP => {Giin=[Gini, iGin], Paiijorty=[Protijayi, jayiProti], Sadioptu=[Soudipta], Gain=[Gina, aGin]}
        As we can see, there are multiple Lists. Hence, we have to use a flatMap(List::stream)
        Now, Look at the output:
        Paiijorty and the anagrams are =>[Protijayi, jayiProti]
       
        Now, look at this output:
        Sadioptu and the anagrams are =>[Soudipta]
        List contains only word. No anagrams.
        That means we have to work with map.values(). List contains all the anagrams.
        
                     
        */
        String stringFromMapHavingListofLists = map.values().stream().flatMap(List::stream).collect(Collectors.joining(" "));
        System.out.println(stringFromMapHavingListofLists);
    }

    public static String sortedString(String a) {
        String sortedString = a.chars().sorted()
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append).toString();

        return sortedString;

    }
    
    /*
     * The output : Gini iGin Protijayi jayiProti Soudipta Gina aGin
     * All the anagrams are side by side.
     */
}

Now to Group Anagrams in Python is again easy.We have to : Sort the lists. Then, Create a dictionary. Now dictionary will tell us where are those anagrams are( Indices of Dictionary). Then values of the dictionary is the actual indices of the anagrams.

def groupAnagrams(words):
 
    # sort each word in the list
    A = [''.join(sorted(word)) for word in words]
    dict = {}
    for indexofsamewords, names in enumerate(A):
     dict.setdefault(names, []).append(indexofsamewords)
    print(dict)
    #{'AOOPR': [0, 2, 5, 11, 13], 'ABTU': [1, 3, 4], 'Sorry': [6], 'adnopr': [7], 'Sadioptu': [8, 16], ' KPaaehiklry': [9], 'Taeggllnouy': [10], 'Leov': [12], 'Paiijorty': [14, 18], 'Paaaikpr': [15], 'Saaaabhmryz': [17], ' CNaachlortttu': [19], 'Saaaaborvz': [20]}
 
    for index in dict.values():
     print([words[i] for i in index])
 

if __name__ == '__main__':
 
    # list of words
    words = ["ROOPA","TABU","OOPAR","BUTA","BUAT" , "PAROO","Soudipta",
        "Kheyali Park", "Tollygaunge", "AROOP","Love","AOORP", "Protijayi","Paikpara","dipSouta","Shyambazaar",
        "jayiProti", "North Calcutta", "Sovabazaar"]
 
    groupAnagrams(words)
 

The Output :

['ROOPA', 'OOPAR', 'PAROO', 'AROOP', 'AOORP']
['TABU', 'BUTA', 'BUAT']
['Soudipta', 'dipSouta']
['Kheyali Park']
['Tollygaunge']
['Love']
['Protijayi', 'jayiProti']
['Paikpara']
['Shyambazaar']
['North Calcutta']
['Sovabazaar']

Another Important Anagram Question : Find the Anagram occuring Max. number of times. In the Example, ROOPA is the word which has occured maximum number of times. Hence, ['ROOPA' 'OOPAR' 'PAROO' 'AROOP' 'AOORP'] will be the final output.

from sqlite3 import collections
from statistics import mode, mean

import numpy as np


# list of words
words = ["ROOPA","TABU","OOPAR","BUTA","BUAT" , "PAROO","Soudipta",
        "Kheyali Park", "Tollygaunge", "AROOP","Love","AOORP",
         "Protijayi","Paikpara","dipSouta","Shyambazaar",
        "jayiProti", "North Calcutta", "Sovabazaar"]

print(".....Method 1....... ")

sortedwords = [''.join(sorted(word)) for word in words]
print(sortedwords)
print("...........")
LongestAnagram = np.array(words)[np.array(sortedwords) == mode(sortedwords)]
# Longest anagram 
print("Longest anagram by Method 1:")
print(LongestAnagram)

print(".....................................................")  

print(".....Method 2....... ")  

A = [''.join(sorted(word)) for word in words]

dict = {}

for indexofsamewords,samewords in  enumerate(A):
    dict.setdefault(samewords,[]).append(samewords)
#print(dict)  
#{'AOOPR': ['AOOPR', 'AOOPR', 'AOOPR', 'AOOPR', 'AOOPR'], 'ABTU': ['ABTU', 'ABTU', 'ABTU'], 'Sadioptu': ['Sadioptu', 'Sadioptu'], ' KPaaehiklry': [' KPaaehiklry'], 'Taeggllnouy': ['Taeggllnouy'], 'Leov': ['Leov'], 'Paiijorty': ['Paiijorty', 'Paiijorty'], 'Paaaikpr': ['Paaaikpr'], 'Saaaabhmryz': ['Saaaabhmryz'], ' CNaachlortttu': [' CNaachlortttu'], 'Saaaaborvz': ['Saaaaborvz']}
     
aa =  max(dict.items() , key = lambda x : len(x[1]))
print("aa => " , aa)    
word, anagrams = aa 
print("Longest anagram by Method 2:")
print(" ".join(anagrams))    

The Output :

.....Method 1....... 
['AOOPR', 'ABTU', 'AOOPR', 'ABTU', 'ABTU', 'AOOPR', 'Sadioptu', ' KPaaehiklry', 'Taeggllnouy', 'AOOPR', 'Leov', 'AOOPR', 'Paiijorty', 'Paaaikpr', 'Sadioptu', 'Saaaabhmryz', 'Paiijorty', ' CNaachlortttu', 'Saaaaborvz']
...........
Longest anagram by Method 1:
['ROOPA' 'OOPAR' 'PAROO' 'AROOP' 'AOORP']
.....................................................
.....Method 2....... 
aa =>  ('AOOPR', ['AOOPR', 'AOOPR', 'AOOPR', 'AOOPR', 'AOOPR'])
Longest anagram by Method 2:
AOOPR AOOPR AOOPR AOOPR AOOPR






















     
Soudipta Dutta
  • 1,353
  • 1
  • 12
  • 7
1

How about Xor'ing both the strings??? This will definitely be of O(n)

char* arr1="ab cde";
int n1=strlen(arr1);
char* arr2="edcb a";
int n2=strlen(arr2);
// to check for anagram;
int c=0;
int i=0, j=0;   
if(n1!=n2) 
  printf("\nNot anagram");
else {
   while(i<n1 || j<n2)
   {
       c^= ((int)arr1[i] ^ (int)arr2[j]);
       i++;
       j++;
   }
}

if(c==0) {
    printf("\nAnagram");
}
else printf("\nNot anagram");

}

pshankar87
  • 27
  • 1
1

For known (and small) sets of valid letters (e.g. ASCII) use a table with counts associated with each valid letter. First string increments counts, second string decrements counts. Finally iterate through the table to see if all counts are zero (strings are anagrams) or there are non-zero values (strings are not anagrams). Make sure to convert all characters to uppercase (or lowercase, all the same) and to ignore white space.

For a large set of valid letters, such as Unicode, do not use table but rather use a hash table. It has O(1) time to add, query and remove and O(n) space. Letters from first string increment count, letters from second string decrement count. Count that becomes zero is removed form the hash table. Strings are anagrams if at the end hash table is empty. Alternatively, search terminates with negative result as soon as any count becomes negative.

Here is the detailed explanation and implementation in C#: Testing If Two Strings are Anagrams

Zoran Horvat
  • 10,924
  • 3
  • 31
  • 43
1
static bool IsAnagram(string s1, string s2)
        {

            if (s1.Length != s2.Length)
                return false;
            else
            {
                int sum1 = 0;
                for (int i = 0; i < s1.Length; i++)
                sum1 += (int)s1[i]-(int)s2[i];
                if (sum1 == 0)
                    return true;
                else
                    return false;
            }
        }
StarsSky
  • 6,721
  • 6
  • 38
  • 63
  • Best Solution to check whether 2 strings are Anagram or not. :) – user2605539 Mar 16 '14 at 17:44
  • 3
    Lets `str1 = "az"` and `str2 = "by"`. This approach will then fail. – Sinstein Oct 04 '14 at 07:11
  • Downvoted because algorithm doesn't work for all inputs. Please make sure you properly check all solutions before posting as answers. – SamAko Feb 11 '15 at 14:10
  • @user2605539 Old post... but this could be best solution if we make something like s1 = s1.toUpperCase() and same for s2 this way we make sure it work in all cases. – Ajit kohir May 27 '16 at 18:45
1

Well you can probably improve the best case and average case substantially just by checking the length first, then a quick checksum on the digits (not something complex, as that will probably be worse order than the sort, just a summation of ordinal values), then sort, then compare.

If the strings are very short the checksum expense will be not greatly dissimilar to the sort in many languages.

Orbling
  • 20,413
  • 3
  • 53
  • 64
1

If strings have only ASCII characters:

  1. create an array of 256 length
  2. traverse the first string and increment counter in the array at index = ascii value of the character. also keep counting characters to find length when you reach end of string
  3. traverse the second string and decrement counter in the array at index = ascii value of the character. If the value is ever 0 before decrementing, return false since the strings are not anagrams. also, keep track of the length of this second string.
  4. at the end of the string traversal, if lengths of the two are equal, return true, else, return false.

If string can have unicode characters, then use a hash map instead of an array to keep track of the frequency. Rest of the algorithm remains same.

Notes:

  1. calculating length while adding characters to array ensures that we traverse each string only once.
  2. Using array in case of an ASCII only string optimizes space based on the requirement.
cypher
  • 101
  • 1
  • 3
1

How about this?

a = "lai d"
b = "di al"
sorteda = []
sortedb = []
for i in a:
    if i != " ":
        sorteda.append(i)
        if c == len(b):
            for x in b:
                c -= 1
                if x != " ":
                    sortedb.append(x)
sorteda.sort(key = str.lower)
sortedb.sort(key = str.lower)

print sortedb
print sorteda

print sortedb == sorteda
AAF
  • 13
  • 1
  • 6
  • Your choices in strings might earn you enough offensive flags that your answer is removed. Please consider revising. I do, however understand _why_ they are convenient for your example. – Tim Post Apr 30 '11 at 06:01
  • ok I changed my anagram just for the sake of it deeming to be offensive! – AAF Apr 30 '11 at 09:40
0
/* Program to find the strings are anagram or not*/
/* Author Senthilkumar M*/

Eg. 
    Anagram:
    str1 = stackoverflow
    str2 = overflowstack

    Not anagram:`enter code here`
    str1 = stackforflow
    str2 = stacknotflow

int is_anagram(char *str1, char *str2)
{
        int l1 = strlen(str1);
        int l2 = strlen(str2);
        int s1 = 0, s2 = 0;
        int i = 0;

        /* if both the string are not equal it is not anagram*/
        if(l1 != l2) {
                return 0;
        }
        /* sum up the character in the strings 
           if the total sum of the two strings is not equal
           it is not anagram */
        for( i = 0; i < l1; i++) {
                s1 += str1[i];
                s2 += str2[i];
        }
        if(s1 != s2) {
                return 0;
        }
        return 1;
}
0

If both strings are of equal length proceed, if not then the strings are not anagrams.

Iterate each string while summing the ordinals of each character. If the sums are equal then the strings are anagrams.

Example:

    public Boolean AreAnagrams(String inOne, String inTwo) {

        bool result = false;

        if(inOne.Length == inTwo.Length) {

            int sumOne = 0;
            int sumTwo = 0;

            for(int i = 0; i < inOne.Length; i++) {

                sumOne += (int)inOne[i];
                sumTwo += (int)inTwo[i];
            }

            result = sumOne == sumTwo;
        }

        return result;
    }
jim
  • 8,670
  • 15
  • 78
  • 149
  • Generally, white space is ignored in anagrams. This means that length is not the criterion. One famous anagram: "William Shakespeare" is anagram with "I am a weakish speller". I suppose that spaces should be allowed if problem statement doesn't say the opposite. – Zoran Horvat Dec 18 '13 at 21:47
0

I guess your sorting algorithm is not really O(log n), is it?

The best you can get is O(n) for your algorithm, because you have to check every character.

You might use two tables to store the counts of each letter in every word, fill it with O(n) and compare it with O(1).

Philipp
  • 11,549
  • 8
  • 66
  • 126
0

implementation in Swift 3:

func areAnagrams(_ str1: String, _ str2: String) -> Bool {
    return dictionaryMap(forString: str1) == dictionaryMap(forString: str2)
}

func dictionaryMap(forString str: String) -> [String : Int] {
    var dict : [String : Int] = [:]
    for var i in 0..<str.characters.count {
        if let count = dict[str[i]] {
            dict[str[i]] = count + 1
        }else {
            dict[str[i]] = 1
        }
    }        
    return dict
}
//To easily subscript characters
extension String {
    subscript(i: Int) -> String {
        return String(self[index(startIndex, offsetBy: i)])
    }
}
Charlton Provatas
  • 2,184
  • 25
  • 18
0
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

/**
 * --------------------------------------------------------------------------
 * Finding Anagrams in the given dictionary. Anagrams are words that can be
 * formed from other words Ex :The word "words" can be formed using the word
 * "sword"
 * --------------------------------------------------------------------------
 * Input : if choose option 2 first enter no of word want to compare second
 * enter word ex:
 * 
 * Enter choice : 1:To use Test Cases 2: To give input 2 Enter the number of
 * words in dictionary 
 * 6
 * viq 
 * khan
 * zee 
 * khan
 * am
 *    
 * Dictionary : [ viq khan zee khan am]
 * Anagrams 1:[khan, khan]
 * 
 */
public class Anagrams {

    public static void main(String args[]) {
        // User Input or just use the testCases
        int choice;
        @SuppressWarnings("resource")
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter choice : \n1:To use Test Cases 2: To give input");
        choice = scan.nextInt();
        switch (choice) {
        case 1:
            testCaseRunner();
            break;
        case 2:
            userInput();
        default:
            break;
        }
    }

    private static void userInput() {
        @SuppressWarnings("resource")
        Scanner scan = new Scanner(System.in);
        System.out.println("Enter the number of words in dictionary");
        int number = scan.nextInt();
        String dictionary[] = new String[number];
        //
        for (int i = 0; i < number; i++) {
            dictionary[i] = scan.nextLine();
        }
        printAnagramsIn(dictionary);

    }

    /**
     * provides a some number of dictionary of words
     */
    private static void testCaseRunner() {

        String dictionary[][] = { { "abc", "cde", "asfs", "cba", "edcs", "name" },
                { "name", "mane", "string", "trings", "embe" } };
        for (int i = 0; i < dictionary.length; i++) {
            printAnagramsIn(dictionary[i]);
        }
    }

    /**
     * Prints the set of anagrams found the give dictionary
     * 
     * logic is sorting the characters in the given word and hashing them to the
     * word. Data Structure: Hash[sortedChars] = word
     */
    private static void printAnagramsIn(String[] dictionary) {
        System.out.print("Dictionary : [");// + dictionary);
        for (String each : dictionary) {
            System.out.print(each + " ");
        }
        System.out.println("]");
        //

        Map<String, ArrayList<String>> map = new LinkedHashMap<String, ArrayList<String>>();
        // review comment: naming convention: dictionary contains 'word' not
        // 'each'
        for (String each : dictionary) {
            char[] sortedWord = each.toCharArray();
            // sort dic value
            Arrays.sort(sortedWord);
            //input word
            String sortedString = new String(sortedWord);
            //
            ArrayList<String> list = new ArrayList<String>();
            if (map.keySet().contains(sortedString)) {
                list = map.get(sortedString);
            }
            list.add(each);
            map.put(sortedString, list);
        }
        // print anagram
        int i = 1;
        for (String each : map.keySet()) {
            if (map.get(each).size() != 1) {
                System.out.println("Anagrams " + i + ":" + map.get(each));
                i++;
            }
        }
    }
}
vaquar khan
  • 10,864
  • 5
  • 72
  • 96
0

It seems that the following implementation works too, can you check?

int histogram[256] = {0};
for (int i = 0; i < strlen(str1); ++i) {
   /* Just inc and dec every char count and
    * check the histogram against 0 in the 2nd loop */
   ++histo[str1[i]];
   --histo[str2[i]];
}

for (int i = 0; i < 256; ++i) {
   if (histo[i] != 0)
     return 0; /* not an anagram */
}

return 1; /* an anagram */
0

I just had an interview and 'SolutionA' was basically my solution.

Seems to hold.

It might also work to sum all characters, or the hashCodes of each character, but it would still be at least O(n).

/**
 * Using HashMap
 * 
 * O(a + b + b + b) = O(a + 3*b) = O( 4n ) if a and b are equal. Meaning O(n) in total.
 */
public static final class SolutionA {
    //   
    private static boolean isAnagram(String a, String b) {
        if ( a.length() != b.length() ) return false; 
        
        HashMap<Character, Integer> aa = toHistogram(a);
        HashMap<Character, Integer> bb = toHistogram(b);
    
        return isHistogramsEqual(aa, bb);
    }

    private static HashMap<Character, Integer> toHistogram(String characters) {
        HashMap<Character, Integer> histogram = new HashMap<>();
        int i = -1; while ( ++i < characters.length() ) {
            histogram.compute(characters.charAt(i), (k, v) -> {
                if ( v == null ) v = 0;
                return v+1;
            });    
        }
        
        return histogram;
    }

    private static boolean isHistogramsEqual(HashMap<Character, Integer> a, HashMap<Character, Integer> b) {
        for ( Map.Entry<Character, Integer> entry : b.entrySet() ) {
            Integer aa = a.get(entry.getKey());
            Integer bb = entry.getValue();
            
            if ( !Objects.equals(aa, bb) ) {
                return false;
            }
        }
        
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("abc", "cba"));
        System.out.println(isAnagram("abc", "cbaa"));
        System.out.println(isAnagram("abcc", "cba"));
        System.out.println(isAnagram("abcd", "cba"));
        System.out.println(isAnagram("twelve plus one", "eleven plus two"));
    }
}

I've provided a hashCode() based implementation as well. Seems to hold as well.

/**
 * Using hashCode()
 * 
 * O(a + b) minimum + character.hashCode() calculation, the latter might be cheap though. Native implementation. 
 * 
 * Risk for collision albeit small. 
 */
public static final class SolutionB {
    
    public static void main(String[] args) {
        System.out.println(isAnagram("abc", "cba"));
        System.out.println(isAnagram("abc", "cbaa"));
        System.out.println(isAnagram("abcc", "cba"));
        System.out.println(isAnagram("abcd", "cba"));
        System.out.println(isAnagram("twelve plus one", "eleven plus two"));
    }
    
    private static boolean isAnagram(String a, String b) {
        if ( a.length() != b.length() ) return false;
        
        return toHashcode(a) == toHashcode(b);
    }

    private static long toHashcode(String str) {
        long sum = 0; int i = -1; while ( ++i < str.length() ) {
            sum += Objects.hashCode( str.charAt(i) );    
        }
        
        return sum;
    }
}
mjs
  • 21,431
  • 31
  • 118
  • 200
-1

in java we can also do it like this and its very simple logic

import java.util.*;

class Anagram
{
 public static void main(String args[]) throws Exception
 {
  Boolean FLAG=true;

  Scanner sc= new Scanner(System.in);

  System.out.println("Enter 1st string");

  String s1=sc.nextLine();

  System.out.println("Enter 2nd string");

  String s2=sc.nextLine();

  int i,j;
  i=s1.length();
  j=s2.length();

  if(i==j)
  {
   for(int k=0;k<i;k++)
   {
    for(int l=0;l<i;l++)
    {
     if(s1.charAt(k)==s2.charAt(l))
     {
      FLAG=true;
      break;
     }
     else
     FLAG=false;
    }
   }
  }
  else
  FLAG=false;
  if(FLAG)
  System.out.println("Given Strings are anagrams");
  else
  System.out.println("Given Strings are not anagrams");
 }
}
  • 1
    This code has a bug. It says that 12343 and 12344 are anagrams, but they are not. The inner loop should account for the fact a specific character has already been counted in comparison against the first string. – lreeder Jul 04 '13 at 15:50
-1

How about converting into the int value of the character and sum up :

If the value of sum are equals then they are anagram to each other.

def are_anagram1(s1, s2):
    return [False, True][sum([ord(x) for x in s1]) == sum([ord(x) for x in s2])]

s1 = 'james'
s2 = 'amesj'
print are_anagram1(s1,s2)

This solution works only for 'A' to 'Z' and 'a' to 'z'.

James Sapam
  • 16,036
  • 12
  • 50
  • 73