Cracking the Coding Interview, 4th ed, 1.1

Question

I came across the following question in Cracking the Coding Interview, 1.1:

Implement an algorithm to determine if a string has all unique characters. What if you can not use additional data structures?

Here is the book's solution:

Here is my solution:

public boolean allUnique(String input) {

HashSet<Character> set = new Hashset<Character>();
char c;

 for (int i = 0; i < input.length(); i++) {
  c = input.charAt(i);
if (set.contains(c)) return false;
set.add(c);
}
return true;

Does my solution work, and how efficient is it? I was also wondering if someone could explain ASCII and how it is relevant to this problem, since it was briefly mentioned in the book's solution. Is this why we are able to type-cast each char in the String to an integer?

Thank you!

What about the part `What if you can not use additional data structures?` — Tim Biegeleisen, Aug 04 '16 at 01:51
Your question seems to cover many possible answers, and your code looks like it should run. Maybe this belongs on [Code Review](http://codereview.stackexchange.com). — Tim Biegeleisen, Aug 04 '16 at 01:53
ASCII is just a character set. Just like Unicode is a different character set. ASCII is usually used in the 0-127 range while the 128-256 is not usually used. Also, since you can have an associative array, would it not be better to use that and just increment each location? Then if the array entry is greater than one(1) you have a duplicate. Just a thought. (Edit): Ah! A boolean array is just as good. :-) — Mark Manning, Aug 04 '16 at 01:54
You can also not use an array and just use the indexOf() function. A simple two usage method will find the first and then any other instance of a given character. See http://www.w3schools.com/jsref/jsref_indexof.asp. (Edit) I thought about it and you really only need one indexOf() function and use the optional START information. If you find a second character then there are duplicates. Again, just a thought. — Mark Manning, Aug 04 '16 at 01:59
There is also this: http://stackoverflow.com/questions/881085/count-the-number-of-occurences-of-a-character-in-a-string-in-javascript — Mark Manning, Aug 04 '16 at 02:12
Your solution is not space efficient ,as you are using Character object to store the presence of a character, you only need 16 bits to store and check if string has duplicate chars. The second solution from book is more space efficient. — ravthiru, Aug 04 '16 at 02:12
@ravthiru: That may be but the original question was NOT how to be as space efficient as possible - it was to find duplicate characters. — Mark Manning, Aug 04 '16 at 02:13

score 1 · Accepted Answer · answered Aug 04 '16 at 02:07

1

To show how not to use an array (unknown if this is really fast or not) you can:

function boolean allUnique(String input){
 for( int i=0; i<input.length(); i++ ){
  if( input.indexOf(input.charAt(i),i) > -1 ){ return false; }
  }

 return true;
 }

The above is not tested, but should work. The ",i)" may need to be ",i+1)" to put it past the current character. But I think just ",i)" should work.

answered Aug 04 '16 at 02:07

Mark Manning

1,427
12
14

1

`String#indexOf` implements a linear search, so your code will do a linear search for each character in `input`. Your solution is a nested for loop where the inner for loop is `for(int j=0; j – Jonny Henly Aug 04 '16 at 02:15
Correct, but it only tests each character once and stops if it finds a duplicate. I actually think the Stack Overflow answer I quoted would be even better because it uses a single command (match) with a regexp. It too is linear although there might be a good regexp that will return duplicates that I do not know of. – Mark Manning Aug 04 '16 at 02:19
If you're only dealing with ASCII then I think the lookup table approach would be the best solution, since you would only need to iterate over the string once. – Jonny Henly Aug 04 '16 at 02:22
Also it does not "only test each character once", *on average* it will compare the first and second character once, the third character twice, the fourth character three times, ... , the nth character n-1 times. – Jonny Henly Aug 04 '16 at 02:25
@JonnyHenly: Yes, you are right. These are just other thoughts on what the poster did originally and what the code he posted from the book did. Someone posted "What if you can not use additional data structures?" thus my answer also goes towards that because an array is a data structure. :-) – Mark Manning Aug 04 '16 at 02:25
@JonnyHenly: Correct again - but I never said it was efficient. Just that it is another way to do what he was asking about - and - as I have already said - it doesn't use any additional data structures. – Mark Manning Aug 04 '16 at 02:27
@JonnyHenly: I think it would interesting if a sorting algorithm for the incoming string were used and if you got "A == B" to just pop out of the sorting algorithm. But this would probably REALLY make it slow. :-) – Mark Manning Aug 04 '16 at 02:31
I agree, like an implementation of insert sort. – Jonny Henly Aug 04 '16 at 02:40
@JonnyHenly: What about if you could eliminate some things. If ONLY the ASCII table is being used then if the string is longer than 256 bytes it has to contain a duplicate. If it was less than that, maybe if the string length were divided by four(4) you'd only have to go through a maximum of 64 tests on each string section. Hmmmmmmm... – Mark Manning Aug 04 '16 at 02:53
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/120064/discussion-between-jonny-henly-and-mark-manning). – Jonny Henly Aug 04 '16 at 02:54
1

@JonnyHenly: I would love to but I have to leave. I will think about this some more and if I do come up with something I'll post it there. :-) – Mark Manning Aug 04 '16 at 02:56

Cracking the Coding Interview, 4th ed, 1.1

1 Answers1