-1

I just begin to do the problems in cracking the coding interview. this is the first question: implement an algorithm to determine if a string has all unique characters. What if you cannot use additional data structures? Here is the solution: Using a boolean array to track the occurrence of each possible character in ASCII set. Initially every value in that array is false, until the corresponding character value appears. If that character value appears again, then if the corresponding value in array is already true, return false. Time: O(n), n is the length of string. Space: O(n), the array named mask. Source Code:

public class Interview {

    // Assume every character in string is in ASCII.
    public boolean uniqueChars(String s) {
        boolean[] mask = new boolean[256];
        for (int i = 0; i < s.length(); i++) {
            if (mask[s.charAt(i)])
                return false;
            mask[s.charAt(i)] = true;
        }
        return true;
    }
} 

I cannot understand this: mask[s.charAt(i)]. I think s.charAt(i) is a char but not a number, so I am don't know how this works. I know this is a simply question, but I cannot find a direct answer. Hope you guys can help me.Thanks a lot.

Madhawa Priyashantha
  • 9,633
  • 7
  • 33
  • 60
  • Relevant: http://stackoverflow.com/questions/19484406/detecting-if-a-string-has-unique-characters-comparing-my-solution-to-cracking – Sotirios Delimanolis Dec 17 '14 at 22:53
  • 1
    This will crash so spectacularly if you pass it anything in, say, Chinese! – Sergey Kalinichenko Dec 17 '14 at 22:54
  • Everything in a computer turns into bytes of 1s and 0s. Everything is an number, anything else is an illusion. http://www.asciitable.com/ So `char` is a number `boolean` is turned into 0 = false, 1 = true. Even a reference to an object is a number. – Peter Lawrey Dec 17 '14 at 23:02

1 Answers1

2

mask[s.charAt(i)] gets the character from s at index i, converts it to an int (implicitly), and then looks up the value in mask at that index in the mask array. The int value of the character is not ASCII, it's UTF-16 (because that's how Java's characters are defined). The code makes a massive assumption (mentioned in a comment) that the string only has ASCII characters, which it apparently defines as characters whose value is < 256; that's "extended" ASCII, ASCII itself only defines characters < 128. For values < 128, ASCII and UTF-16 are the same.

Your overall function is doing this: Starting with an array with all false entries (because that's the default state of a boolean[]), it loops through the string s and, for each character, looks up whether it's seen that character before. If it has, return true out of the function (we've seen at least two occurrences of the character). If it hasn't seen a character before, it sets that entry in the array to true. If it gets all the way to the end of the string without returning true, it returns false — the string didn't have the same character repeated twice.

The code will fail with an ArrayIndexOutOfBoundsException if it sees a character whose int value is >= 256, which is entirely likely.

T.J. Crowder
  • 1,031,962
  • 187
  • 1,923
  • 1,875