
I need to write an algorithm that checks for uniqueness of a character within a string. I've written the following algorithm:

function unique(s) {
    for (var i = 0; i < s.length - 1; i++) {
        for (var j = i + 1; j < s.length; j++) {
            if (s[i] === s[j]) {
                return false;
            }
        }
    }
    return true;
}

When calculating the number of operations for the worst case scenario, the formula is

n/2 * (n - 1) = (n^2 - n)/2

and the complexity O(n)=n^2.

Now, suppose I had instead written the following, less optimized algorithm:

function unique(s) {
    for (var i = 0; i < s.length; i++) {
        for (var j = 0; j < s.length; j++) {
            if ((s[i] === s[j]) && (i !== j)) {
                return false;
            }
        }
    }
    return true;
}

but it gives the same O(n) = n^2. So it seems that I can't tell which algorithm is better based on the O notation - is that correct? Then why use O notation at all? And what metric can show that the first algorithm is better than the second one?

Max Koretskyi
  • You're looking for a hash table. – SLaks Oct 23 '16 at 18:08
  • `So why use O(n)` I don't think that's a JS question, it's more of a programming question. At any rate, the reason is that you're not measuring _performance_ but _growth_. Imagine that you have an algorithm that has a complexity of `O(n^2)` but it takes an extra 1 ms for `n`, while you have one which is `O(n)` but it takes 1000 ms (1 s) for `n`. Growth will show how they perform with more input, but it's entirely possible that the former is better in some circumstances. – VLAZ Oct 23 '16 at 18:10
  • @SLaks points out a hash table will solve this quickly, but will use more memory. This is known as a space-time tradeoff: https://simple.wikipedia.org/wiki/Space-time_tradeoff – Leon Oct 23 '16 at 18:10
  • That should be O(runtime(n)) = n² - while anything in O(n) is in O(n²), the reverse is not true. – greybeard Oct 23 '16 at 18:13
  • "*what metric can show that the first algorithm is better than the second one?*" - you're looking for runtime, not complexity. Don't do analysis, perform an actual benchmark. Use a set of input strings, choose an environment (cpu, os, compiler), and determine the average time spent on a function call. Your first implementation should be about twice as fast as your second. – Bergi Oct 23 '16 at 18:16
  • Both algorithms are O(n) if s is a string. The outer loop can never execute past index 65535 -- otherwise a duplicate would have already been found, since strings contain 16-bit values. – Paul Hankin Oct 23 '16 at 19:44

3 Answers


Your algorithms are O(n), not O(n^2), despite appearances.

s is a string, and its characters are 16-bit code units. That means the outer loop will execute at most 65536 times, and that maximum can only be reached if the string is exactly 65536 characters long and contains each 16-bit value exactly once.

With this observation, you can create an O(1) solution. Simply add an initial check: if the string is longer than 65536 characters, then by the pigeonhole principle it must contain a duplicate, so immediately return false. Otherwise, use either of your algorithms to check the (bounded-length) string for duplicates.
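A minimal sketch of this idea (assuming `s` is an ordinary JavaScript string, i.e. a sequence of 16-bit UTF-16 code units, so there are only 65536 distinct values and a longer string must repeat one):

```javascript
function unique(s) {
    // Pigeonhole: more than 65536 code units guarantees a duplicate.
    if (s.length > 65536) {
        return false;
    }
    // Fall back to the pairwise check; the input length is now bounded
    // by a constant, so this whole function runs in O(1) time.
    for (var i = 0; i < s.length - 1; i++) {
        for (var j = i + 1; j < s.length; j++) {
            if (s[i] === s[j]) {
                return false;
            }
        }
    }
    return true;
}
```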

Paul Hankin

The O notation describes the asymptotic behavior of an algorithm, not a precise number of operations (and yes, both algorithms are the same with respect to this notation).

If you want to show that the first algorithm is better, you can use the number of character pairs compared (or the total number of loop iterations) as a metric.
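One way to make that metric concrete is to instrument both versions with a comparison counter (a hypothetical measurement harness, not part of the original question):

```javascript
// Count comparisons performed by the first version (j starts at i + 1).
function countComparisonsV1(s) {
    var count = 0;
    for (var i = 0; i < s.length - 1; i++) {
        for (var j = i + 1; j < s.length; j++) {
            count++;
            if (s[i] === s[j]) return count; // duplicate found, stop
        }
    }
    return count;
}

// Count comparisons performed by the second version (j scans everything).
function countComparisonsV2(s) {
    var count = 0;
    for (var i = 0; i < s.length; i++) {
        for (var j = 0; j < s.length; j++) {
            count++;
            if (s[i] === s[j] && i !== j) return count;
        }
    }
    return count;
}
```

For an all-unique string of length n, the first version performs n*(n-1)/2 comparisons while the second performs n*n, which makes the roughly 2x difference visible even though both are O(n^2).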

kraskevich

You want to use a data structure that can tell you whether a given element is already contained in it.

Like a hash table. By adding the characters of your word to that table one by one, you can easily check whether the character "to be added" is already present. That avoids the n * n complexity of comparing each character with every other character.

GhostCat