0

So I've recently been researching all there is to do with run-time complexity of algorithms and wanting to learn how to alter them to improve efficiency for when the scale of n is increased, so essentially my aim is to learn how to make things O (log n). Thought to myself, I know of a good little project I could do this hour and thats create an anagram checker.

I rummaged through a few SO posts and saw someone commented it could be made log n if you assigned every letter in the alphabet to a numeric thus:

final Map<Character, Integer> map;
        String str = "hello";
        String check = "olleh";

        map = new HashMap<>();
        map.put('a', 2);
        map.put('b', 3);
        map.put('c', 4);
        map.put('d', 7);
        map.put('e', 11);
        map.put('f', 13);
        map.put('g', 17);
        map.put('h', 19);
        map.put('i', 23);
        map.put('j', 29);
        map.put('k', 31);
        map.put('l', 37);
        map.put('m', 41);
        map.put('n', 43);
        map.put('o', 47);
        map.put('p', 53);
        map.put('q', 59);
        map.put('r', 61);
        map.put('s', 67);
        map.put('t', 71);
        map.put('u', 73);
        map.put('v', 79);
        map.put('w', 83);
        map.put('x', 89);
        map.put('y', 97);
        map.put('z', 101);

Then I created the method:

 public static boolean isAnagram(String s, String check,Map<Character, Integer> map) {
        int stringTotal = 0;
        int checkTotal = 0;
        for(char ch: s.toCharArray()){
            int i = map.get(ch);
            stringTotal += ch;
        }
        for(char ch: check.toCharArray()){
            int i = map.get(ch);
            checkTotal +=ch;
        }
        if (stringTotal == checkTotal){
            return true;
        }
        return false;
    }

I believe that this method is O(n^2) because it has two independent loops, I cannot think of the logic behind creating this to a O(log n) method.

Any pointers would be great

  • FYI two independent loops doesn't mean O(N^2). Think about it... you're going through an array of length N two times (2N). A quick check upfront to make sure you don't spend time in an expensive-ish operation would be to make sure the lengths of the strings are equal prior to your logic. – TayTay Nov 14 '15 at 19:20
  • oh yeah of course @Tgsmith61591 thank you for pointing that out –  Nov 14 '15 at 19:27
  • Please check this solution http://stackoverflow.com/a/42058276/379173 – Enrico Giurin Feb 05 '17 at 22:59

1 Answers1

0

It looks to be O(2n): one loop of n after another. You'd get O(n^2) if you nest a loop of n within a loop of n, like so:

int count = 0;
for ( int x = 0; x < n; x++ )
     for ( int y = 0; y < n; y++ )
         count++;

Here, count will be n*n or n2.

In the code you pasted, there are two loops:

int count = 0;
for ( int x = 0; x < n; x++ )
    count++;
for ( int y = 0; y < n; y++ )
    count++;

Here, count is n+n or 2n.

There's no way to make a O(log n) function out of this, because you will always need to check all n values (all letters).

For example, let's say that we calculated the exact time complexity of the function to be log(n) (note that this is different from the order O(log n) which does away with coefficients and lower order terms).
Now, if n = 10; then log(n) = 2.3 (approximately), and you can't be sure a ten letter word is an anagram of another 10 letter word if you only look at 2.3 letters.

Kenney
  • 9,003
  • 15
  • 21
  • I'd get rid of the last sentence, beginning "For example". If an algorithm has time complexity `O(log n)` there could be a large constant of proportionality. It doesn't mean it reads `2.3` characters. – Paul Boddington Nov 14 '15 at 19:11
  • I don't follow.. for an algorithm with complexity `O(log n)` where `n=10`, it does mean the algorithm inspects `log(10)` or `2.3` characters. It's an *example*. For `n=1000`, the `log n` is approx 6.9, so I don't see any constant of proportionality here. Could you clarify? – Kenney Nov 14 '15 at 19:18
  • `O(log n)` is about the time it takes as n gets larger and larger, so an individual example for a specific value of `n` can never prove that it does or doesn't have a particular time complexity. If it could be proved that an algorithm takes exactly `Math.floor(1000000 * log(n))` steps, the time complexity would be `O(log n)` because you are allowed to just ignore large factors like `1000000`. So the fact that you can't read a string of length 10 by checking 2.3 characters isn't really relevant. – Paul Boddington Nov 14 '15 at 19:23
  • Thank you for your answer and comments, so would the most efficient algorithm for this scenario be O (n log n) ? –  Nov 14 '15 at 19:33
  • @user3667111 No you can do it in `O(n)`. Use two `HashMap` to count occurrences, then check the maps are equal. – Paul Boddington Nov 14 '15 at 19:36
  • With the risk of @PaulBoddington's wrath (;-) Thanks for explanation - I know what you mean), `n log n` isn't necessarily faster: for the example `n = 10`, `2n = 20` but `n * log(n) = 23` (approx), and this only gets worse as `n` gets larger: for `n=1000`, `2n=2000` but `1000*log(n)=6907` (approx). – Kenney Nov 14 '15 at 19:38
  • @PaulBoddington you're cheating ;-) You'd have to increment 2 maps in the loop, which is as many operations as two loops each incrementing 1 map (barring the overhead for not unrolling the loops), and then you'd have to iterate once more over the map to check equality. That'd be slower than the `O(2n)`. Or am I mistaken? – Kenney Nov 14 '15 at 19:41
  • @Kenney There is no wrath. I thought it was a really good answer and I was trying to be constructive about how you could improve it. I was going to upvote if you'd made the edit. – Paul Boddington Nov 14 '15 at 19:41
  • @Kenney There would be 3 loops. One to populate the first map, one to populate the second, and a third to check the maps are equal. So the overall time complexity is `O(3n)`. This is the same as `O(n)` because `O` notation does not take constant proportions (here `3`) into account. – Paul Boddington Nov 14 '15 at 19:44
  • @Paul I'm just kidding - I really appreciate your feedback. I'd like to keep the example though as I find it can cut through misunderstanding and show hard numbers, but do away with the order, and reference the exact runtime. Do you have a suggestion? – Kenney Nov 14 '15 at 19:44