4

You can find if 2 strings are anagrams after sorting both strings in O(nlogn) time, however is it possible to find it in o(n) time and O(1) space.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
shreyasva
  • 13,126
  • 25
  • 78
  • 101

12 Answers12

6

There are couple of ways to solve it.

Method 1 - Using custom hash code function
We can have hashCode function like:

static int[] primes = {3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103};
static String alphabet = "abcdefghijklmnopqrstuvwxyz";


public static int hashCode(String s){
    int sum = 0;

    for(char c: s.toCharArray()){
      sum += primes[c-97];
    }
    return sum;
}

Generate the hash of both strings, and if the hashCode are equal strings are anagrams. This method is similar to solution mentioned by Jin, as it is in some way generating hashCode for string.
Time complexity - O(n)

Method 2 - Use hashmap of Character and Integer
Consider 2 strings as 2 arrays of character. Traverse first array, add the character to hashmap of char and count, increment the count when you find the character. Likewise traverse through second array, decrement the counter in the hashmap, or if you dont find the character, they are not anagrams. Finally, when map has all the characters and count as 0, then again 2 strings are anagrams.

Method 3 - Use a count array(my favourite)


boolean are_anagrams(string1, string2){
 
    let counts = new int[26];
 
    for each char c in lower_case(string1)
        counts[(int)c]++
 
    for each char c in lower_case(string2)
        counts[(int)c]--
 
    for each int count in counts
        if count != 0
            return false
 
    return true
}

You can get all the codes here.
kinshuk4
  • 3,241
  • 3
  • 33
  • 41
  • I wonder how you can generate same hash code for "cat" and "tac" using Method 1. That doesn't sound like an option. – videoguy Mar 18 '17 at 00:41
  • @videoguy : I totally agree with you. I have corrected from hash code to custom hashcode :) Thanks for the help :) – kinshuk4 Mar 20 '17 at 12:25
5

generate a prime number array[26] each prime number represent a character, then when you traverse the string, multiple each character's prime number, if equal, it is anagrams, otherwise not. it takes O(n) and constant space

Jin
  • 59
  • 1
  • 1
5

Absolutely no expert here...

But why not go through each string and simply count how many times each letter turns up.

Given appropriate implementation, this shouldn't take more than O(n) time.

NPSF3000
  • 2,421
  • 15
  • 20
  • 14
    well, if it uses only letters, then it takes O(1) space, because there are 26 letters and 26 = O(1). ;) – Petar Ivanov Jul 11 '11 at 05:36
  • 3
    +1 To do it quickly, I would store 2 int[256] arrays, or if you're not restricted to Extended ASCII, int[#] where # is the size of your input character set. Step through with a single unsigned int and increment the value of the int at the character's code point (cast it to an int). Then run over the two arrays and make sure every value is equal. Return NO straight away if one isn't, return YES if you reach the end. –  Jul 11 '11 at 05:37
  • 1
    @Petar, you raise a good point, what if we have an alien-language with many letter types.. we're lost! – hayesgm Jul 11 '11 at 05:39
  • @shreyasva: The character set is constant. Fair enough, it's not a tiny array, but 256 ints are going to be constant in terms of your input size, which is the length of the strings you are comparing. So it's still O(1) space because an int always takes up the same amount of space even if you use a huge input string. –  Jul 11 '11 at 05:40
  • @ghayes As long as the alien language has a fixed number of letters/symbols/???, it is still O(1). – 101100 Jul 11 '11 at 05:41
  • Agreed, .. a language with an infinite set of letters, now that's an interesting proposition.. and totally irrelevant. I think we've solved this one. – hayesgm Jul 11 '11 at 05:46
  • Good to know I wasn't taking complete nonsense :) – NPSF3000 Jul 11 '11 at 05:47
  • yes a language with constant character set is O(1) space, you are not allowed more than 1 or 2 variables. That's what i meant by O(1), sorry should have made it more clearer – shreyasva Jul 11 '11 at 05:55
  • 1
    @darvids0n Technically, no - your integers are fixed-length, so a sufficiently long input string would require more storage to store the counts, making it O(n log m). But I'm personally of the opinion that that sort of distinction is so anal as to not be worth making. ;) – Nick Johnson Aug 26 '12 at 22:46
4

Yes, use a hash and count occurences. If at the end, we have a non-zero figure, then the strings are not anagrams.

let h => hash which maps letters to occurence_count (initialized to 0)

for each letter l in string a
  h[l] = h[l] + 1
end

for each letter l in string b
  h[l] = h[l] - 1
end

for each key l in h 
  return false if h[l] != 0
end

return true

This will run in O(n) + O(n) + c = O(n). Our hash contains 26-letter spots, each with an integer associated with it. The space is therefore O(26) = O(1)

[[Edit]], same as above, but with time-analysis annotations:

let h => hash which maps letters to occurence_count (initialized to 0)

#this loop runs n times
for each letter l in string a
  #hash lookups / writes are constant time
  h[l] = h[l] + 1
end
#above function ran O(n) time

for each letter l in string b
  h[l] = h[l] - 1
end

#runs in O(alphabet) = O(c) = constant-time
for each key l in h 
  return false if h[l] != 0
end

return true
hayesgm
  • 8,678
  • 1
  • 38
  • 40
1

Run in : O(n) + O(n) = O(n)

Fix Used Space : O(256) = O(1)

Here is code in Java

private static boolean isAnagramWithOneArray(String strFirst, String strSecond) {
    int[] charsCount = new int[256];

    if (strFirst != null && strSecond != null) {
        if (strFirst.length() != strSecond.length()) {
            return false;
        }
        for (int i = 0; i < strFirst.length(); i++) {
            charsCount[strFirst.charAt(i)]++;
            charsCount[strSecond.charAt(i)]--;
        }
        for (int i = 0; i < charsCount.length; i++) {
            if (charsCount[i] != 0) {
                return false;
            }
        }
        return true;
    } else {
        return (strFirst == null && strSecond == null);
    }
}
j0k
  • 22,600
  • 28
  • 79
  • 90
1
unsigned char CharacterCheck(char item)
{

    if ((item >= 'A') && (item <= 'Z'))
        return (item - 'A');

    if ((item >= 'a') && (item <= 'z'))
        return ((item - ('a' - 'A')) - 'A');

    return -1;

}

unsigned char AnagramCheck6 (char * item1, char * item2)
{
    char *test                      = item1;
    char *test2                     = item2;
    int count                       = 0;
    unsigned char rc                = 0;
    unsigned char rslt              = 0;

    while (*test && *test2)
    {
        rslt = CharacterCheck(*test++);

        if (rslt != 0xff)
            count += rslt;

        rslt = CharacterCheck(*test2++);

        if (rslt != 0xff)
            count -= rslt;
    }

    if (*test)
    {
        while (*test)
        {
            rslt = CharacterCheck(*test++);

            if (rslt != 0xff)
                count += rslt;
        }
    }

    if (*test2)
    {
        while (*test2)
        {
            rslt = CharacterCheck(*test2++);

            if (rslt != 0xff)
                count -= rslt;
        }
    }

    if (count)
        rc = 1;

    return rc;

}

The following snippet checks for the correct character and converts case if needed. The second and third checks takes into account if the strings are different lengths

Davey
  • 11
  • 1
0

the above code would fail to work in all condition

rateher we can go for a quick sort and compare the array while elimating spaces

0

All suggestions here tend to use the same approach of sorting the input strings and then comparing the results. Being mostly interested in regular ascii letters this can be optimized by count sorting which seems to be most answerers approach. Count sort can do sorting of a limited alphabet of numbers / integers in O(n) so technically it is correct answers. If we have to account for the time to traverse the count array afterwards it will include the time for the alphabet, making O(m+n) a somewhat more correct upper bound in cases where the alphabet is UTF-32.

I tend to think the most generally correct approach would require O(n lg n) since a quicksort might prove faster in real time in case the alphabet cannot be limited sufficiently.

faester
  • 14,886
  • 5
  • 45
  • 56
0

i would do it something as below:

//is s an anagram of t?
#include <string>

bool is_anagram(const string& s, const string& t)
    {
    bool ret = false;

    //if s and t are anagrams, they must of same size
    if( s.length() != t.length() )
        {
        return ret;
        }

        //assume that s and t have valid ascii characters only
    char letters[ 256 ] = {0};
    int i;

    // find occurence of each character in string s
    for( i = 0; i < s.length(); i++ )
        {
        (letters[ s[i] ])++;
        }

    // now, find occurence of each character in string t, but decrement 
    // correspnding character
    for( i = 0; i < t.length(); i++ )
        {
        //if the count is <0 means the given character is not present in s
        if( --(letters[ t[i] ]) < 0 ) 
            {
            return ret;
            }
        }

    //out of for loop means success
    ret = true;
    return ret;
    }
Viren
  • 2,161
  • 22
  • 27
0

If you transform a words characters in sorted order and hash the String. Every String which has the same hash after sorting will be an anagram(very probable, there is always a chance of collisions) of the other.

fyr
  • 20,227
  • 7
  • 37
  • 53
0

Something like this using python?

def anagram(a,b):
    s1=list(a)
    s2=list(b)
    for i in s1:
        if i in s2:
            s2.remove(i)
    print(s2)
    return len(s2)==0

This will return false if the strings are not anagram of each other.

To avoid using built ins like remove etc. following approach can be used?

def hana(a,b):
    s1=list(a.lower())
    s2=list(b.lower())
    print(s1,s2)
    h={'a': 0, 'c': 0, 'b': 0, 'e': 0, 'd': 0, 'g': 0, 'f': 0, 'i': 0, 'h': 0, 'k': 0,'j': 0, 'm': 0, 'l': 0, 'o': 0, 'n': 0, 'q': 0, 'p': 0, 's': 0, 'r': 0, 'u': 0, 't': 0, 'w': 0, 'v': 0, 'y': 0, 'x': 0, 'z': 0}
    for i in s1:
        h[i]+=1
    for i in s2:
        h[i]-=1
    for key, val in h.items():
        if val != 0:
            return False
    return(True)

If the strings are anagram of each other then the final hash should have value 0 for all the keys. If its anything other than 0 then we have either repeated characters or characters that are not present in the other string, which means its not anagram.

ayesha ayub
  • 121
  • 1
  • 4
-1

Maybe something like:

    String str1 = "test";
    String str2 = "tzet";
    int count = 0;
    for (int i = 0; i < str1.length(); i++)
    {
        count = count + str1.charAt(i) - str2.charAt(i);
    }
    System.out.println(count);

Subtract every character from string 2 and add every character from string 1 to count (assuming ASCII characters). If they are anagrams, count will be equal to zero.

This doesn't account for anagrams that have inserted spaces, though.

SirensOfTitan
  • 799
  • 1
  • 7
  • 19