1

I am optimizing a piece of code that uses equalsIgnoreCase and processes millions of records. Can anybody give me some insight into which of equalsIgnoreCase and regionMatches is faster and more efficient in Java?

Balduz
SidB
  • You could try using JMH to benchmark the performance of the two. [Link here](http://java-performance.info/string-switch-performance/). I suggest using a real dataset for testing. – timato May 21 '15 at 13:21

3 Answers

4

If you check the implementation of equalsIgnoreCase, it just relies on regionMatches:

public boolean equalsIgnoreCase(String anotherString) {
    return (this == anotherString) ? true
            : (anotherString != null)
            && (anotherString.value.length == value.length)
            && regionMatches(true, 0, anotherString, 0, value.length);
}

Therefore, regionMatches only comes out slightly ahead if you can skip the checks that equalsIgnoreCase performs first: you know for certain that the two references do not point to the same object, that the second string is never null, and that you do not need to compare the lengths. In that case you avoid repeating those checks millions of times. Realistically, though, you will almost always need them, so just stick with equalsIgnoreCase. The difference is far too small to notice, even with millions of strings.
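To make the role of those checks concrete, here is a minimal sketch that rebuilds the same logic from the public String API. The class name IgnoreCaseDemo and the helper equalsViaRegionMatches are hypothetical names used for illustration only; they are not part of the JDK, and the helper assumes its first argument is non-null.

```java
public class IgnoreCaseDemo {

    // Hypothetical helper: mirrors the checks equalsIgnoreCase makes before
    // delegating to regionMatches. Assumes 'a' is non-null.
    static boolean equalsViaRegionMatches(String a, String b) {
        return a == b
                || (b != null
                    && a.length() == b.length()
                    && a.regionMatches(true, 0, b, 0, a.length()));
    }

    public static void main(String[] args) {
        String a = "Search";

        // Same length, differs only in case: both approaches agree.
        System.out.println(a.equalsIgnoreCase("SEARCH"));
        System.out.println(equalsViaRegionMatches(a, "SEARCH"));

        // Without the length check, regionMatches alone only compares a prefix,
        // so it reports a match where equalsIgnoreCase does not.
        System.out.println(a.regionMatches(true, 0, "SEARCHING", 0, a.length()));
        System.out.println(a.equalsIgnoreCase("SEARCHING"));
    }
}
```

The last two calls show why dropping the length check is only safe when the input is known to have matching lengths.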

Balduz
1

equalsIgnoreCase uses regionMatches (in OpenJDK at least):

public boolean equalsIgnoreCase(String anotherString) {
    return (this == anotherString) ? true :
           (anotherString != null) && (anotherString.count == count) &&
           regionMatches(true, 0, anotherString, 0, count);
}

So I guess if one should be faster, it should be regionMatches, but it's most certainly negligible.

sp00m
-2
public class EqualsVsMatch {

    private static final int ROUNDS = 100000000;
    private static final String SEARCH = "SEARCH";
    private static final String SOURCE = "SOURCE";

    public static void main(String[] args) {

        long startRegionMatches = System.currentTimeMillis();
        for(int i = 0; i < ROUNDS; i++) {
            SOURCE.regionMatches(0, SEARCH, 0, 6);
        }
        long endRegionMatches = System.currentTimeMillis();

        long startEqualsIgnoreCase = System.currentTimeMillis();
        for(int i = 0; i < ROUNDS; i++) {
            SOURCE.equalsIgnoreCase(SEARCH);
        }
        long endEqualsIgnoreCase = System.currentTimeMillis();


        System.out.println("regionMatches: " + (endRegionMatches - startRegionMatches));
        System.out.println("equalsIgnoreCase: " + (endEqualsIgnoreCase - startEqualsIgnoreCase));
    }

}

I tried to test it and I got some pretty clear results:

regionMatches: 5
equalsIgnoreCase: 1021

As the others mentioned, equalsIgnoreCase just uses regionMatches, so I also suggest using regionMatches.

bachph
  • I have to add that my example relies on checking millions of Strings that are not equal; otherwise equalsIgnoreCase can be faster. – bachph May 21 '15 at 13:41
  • Benchmarks are not that easy, there are a lot of parameters to take into account (JVM warmup, compiler optimizations, etc.). See http://stackoverflow.com/q/504103/1225328 for more details. – sp00m May 21 '15 at 14:19
  • Also, the main point why your benchmark gave such different results is that you forgot to use `regionMatches` with the `ignoreCase` flag set to `true`: `SOURCE.regionMatches(true, 0, SEARCH, 0, 6);`. – sp00m May 21 '15 at 14:37
  • Yes, you are right. But if you also try the example with SOURCE.regionMatches(true, 0, SEARCH, 0, 6); the results are still clear. – bachph May 22 '15 at 14:26
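Pulling the points from these comments together, a somewhat fairer comparison might look like the sketch below. It is not a substitute for a proper harness such as JMH: it only adds a warm-up pass, sets the ignoreCase flag, keeps the length check on the regionMatches side, and uses the results so the loops cannot be discarded as dead code. The class name EqualsVsMatchSketch, the two counting helpers, the sample data, and the round count are all assumptions made for illustration, and no timings are claimed here.

```java
public class EqualsVsMatchSketch {

    // Counts case-insensitive matches using equalsIgnoreCase.
    static long countEqualsIgnoreCase(String[] data, String key, int rounds) {
        long hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (String s : data) {
                if (s.equalsIgnoreCase(key)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Counts the same matches using an explicit length check plus regionMatches
    // with the ignoreCase flag set to true.
    static long countRegionMatches(String[] data, String key, int rounds) {
        long hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (String s : data) {
                if (s.length() == key.length()
                        && s.regionMatches(true, 0, key, 0, key.length())) {
                    hits++;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative data only; a real dataset would be more meaningful.
        String[] data = {"SOURCE", "search", "Search", "SEARCHING"};
        String key = "SEARCH";
        int rounds = 5_000_000;

        // Warm-up pass so both methods are likely JIT-compiled before timing.
        long warmup = countEqualsIgnoreCase(data, key, rounds)
                + countRegionMatches(data, key, rounds);

        long t0 = System.nanoTime();
        long viaEquals = countEqualsIgnoreCase(data, key, rounds);
        long t1 = System.nanoTime();
        long viaRegion = countRegionMatches(data, key, rounds);
        long t2 = System.nanoTime();

        // Printing the hit counts keeps the results observable.
        System.out.println("warm-up hits: " + warmup);
        System.out.println("equalsIgnoreCase: " + (t1 - t0) / 1_000_000
                + " ms, hits=" + viaEquals);
        System.out.println("regionMatches:    " + (t2 - t1) / 1_000_000
                + " ms, hits=" + viaRegion);
    }
}
```

Both helpers should report the same hit count on the same input; if they do not, the two loops are not doing equivalent work and the timings are not comparable.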