122

I have a string buffer of about 2000 characters and need to check whether it contains a specific string.
The check will run in an ASP.NET 2.0 web app on every web request.

Does anyone know if the String.Contains method performs better than the String.IndexOf method?

    // 2000 characters in s1, search token in s2
    string s1 = "Many characters. The quick brown fox jumps over the lazy dog";
    string s2 = "fox";
    bool b = s1.Contains(s2);
    int i = s1.IndexOf(s2);

Kb.
  • 16
    If you need to do this a billion times per web request, I would begin to take a look at stuff like this. In any other case, I would not bother, since the time spent in either method will most likely be incredibly insignificant compared to receiving the HTTP request in the first place. – mookid8000 Jan 31 '09 at 11:49
  • 2
    One of the keys to optimization is to test instead of assuming, because it can depend on a lot of factors such as .NET version, operating system, hardware, variation in the input, etc. In a lot of cases test results done by others can be very different on your system. – Slai Nov 28 '16 at 14:28

10 Answers

187

Contains calls IndexOf:

    public bool Contains(string value)
    {
        return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
    }

Which calls CompareInfo.IndexOf, which ultimately uses a CLR implementation.

If you want to see how strings are compared in the CLR this will show you (look for CaseInsensitiveCompHelper).

IndexOf(string) has no options, whereas Contains() uses an Ordinal compare (a byte-by-byte comparison rather than a smart compare that would, for example, match e with é).

So IndexOf will be marginally faster (in theory), as IndexOf goes straight to a string search using FindNLSString from kernel32.dll (the power of Reflector!).

Updated for .NET 4.0 - IndexOf no longer uses Ordinal Comparison and so Contains can be faster. See comment below.
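To illustrate the ordinal versus culture-sensitive distinction discussed above, here is a minimal sketch that searches for a precomposed "é" in a string that spells the same letter as "e" plus a combining accent. The class and member names are made up for this example. The culture-sensitive result depends on the runtime version, its globalization settings and the current culture, so it is printed rather than assumed.

```csharp
using System;

public static class OrdinalVsCulture
{
    // "cafe" followed by U+0301 (combining acute accent); renders like "café".
    public const string Decomposed = "cafe\u0301";
    // "caf" followed by U+00E9 (precomposed é).
    public const string Precomposed = "caf\u00E9";

    // Ordinal search: compares raw UTF-16 code units, so the two spellings differ.
    public static int OrdinalIndex()
    {
        return Decomposed.IndexOf(Precomposed, StringComparison.Ordinal);
    }

    // Culture-sensitive search: may treat the two spellings as equivalent.
    public static int CultureIndex()
    {
        return Decomposed.IndexOf(Precomposed, StringComparison.CurrentCulture);
    }

    public static void Main()
    {
        Console.WriteLine("Contains (ordinal):      " + Decomposed.Contains(Precomposed));
        Console.WriteLine("IndexOf, Ordinal:        " + OrdinalIndex());
        Console.WriteLine("IndexOf, CurrentCulture: " + CultureIndex());
    }
}
```

The ordinal search cannot find the precomposed form because the code units differ. Whether the culture-sensitive search finds it is exactly the kind of extra work that makes it slower.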

ProVega
Chris S
  • 3
    This answer is nowhere near correct, just take a look here http://stackoverflow.com/posts/498880/revisions for the explanation – pzaj Nov 12 '15 at 08:01
  • 64
    My answer is 7 years old and based on the .NET 2 framework. Version 4 `IndexOf()` does indeed use `StringComparison.CurrentCulture` and `Contains()` uses `StringComparison.Ordinal` which will be faster. But really the speed differences we're talking about are minute - the point is one calls the other, and Contains is more readable if you don't need the index. In other words don't worry about it. – Chris S Nov 15 '15 at 18:17
  • 4
    Tried it today on a 1.3 GB text file. Amongst others every line is checked for existence of a '@' char. 17.000.000 calls to Contains/IndexOf are made. Result: 12.5 sec for all Contains() calls, 2.5 sec for all IndexOf() calls. => IndexOf performs 5 times faster!! (.Net 4.8) – CSharper Sep 21 '20 at 07:55
  • 1
    @CSharper can you please share the source code of this benchmark? – Diomedes Domínguez Apr 13 '21 at 20:52
12

Contains(s2) is many times (on my computer, 10 times) faster than IndexOf(s2) because Contains uses StringComparison.Ordinal, which is faster than the culture-sensitive search that IndexOf does by default (but that may change in .NET 4.0: http://davesbox.com/archive/2008/11/12/breaking-changes-to-the-string-class.aspx).

Contains has exactly the same performance as IndexOf(s2, StringComparison.Ordinal) >= 0 in my tests, but it's shorter and makes your intent clear.
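A minimal sketch of the two equivalent spellings, using the strings from the question. The helper name `ContainsOrdinal` is hypothetical and only mirrors the decompiled body of Contains quoted elsewhere in this thread.

```csharp
using System;

public static class ExplicitComparison
{
    // Hypothetical helper: an ordinal IndexOf turned into a yes/no answer.
    public static bool ContainsOrdinal(string haystack, string needle)
    {
        return haystack.IndexOf(needle, StringComparison.Ordinal) >= 0;
    }

    public static void Main()
    {
        string s1 = "Many characters. The quick brown fox jumps over the lazy dog";
        string s2 = "fox";

        Console.WriteLine(s1.Contains(s2));          // ordinal search, bool result
        Console.WriteLine(ContainsOrdinal(s1, s2));  // same search, spelled out
        Console.WriteLine(ContainsOrdinal(s1, "cat"));
    }
}
```

Passing the StringComparison explicitly keeps the behaviour the same regardless of which default a given framework version uses for IndexOf(string).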

ggf31416
  • 2
    The changes in .NET 4.0 were apparently reverted before it went RTM so I wouldn't rely on that article too much http://blogs.msdn.com/bclteam/archive/2008/11/04/what-s-new-in-the-bcl-in-net-4-0-justin-van-patten.aspx – Stephen Kennedy Mar 01 '12 at 16:08
7

I am running a real case (as opposed to a synthetic benchmark)

 if("=,<=,=>,<>,<,>,!=,==,".IndexOf(tmps)>=0) {

versus

 if("=,<=,=>,<>,<,>,!=,==,".Contains(tmps)) {

It is a vital part of my system and it is executed 131,953 times (thanks DotTrace).

However, to my surprise, the result is the opposite of what I expected

  • IndexOf 533ms.
  • Contains 266ms.

:-/

.NET Framework 4.0 (updated as of 13-02-2012)

magallanes
6

Using Reflector, you can see that Contains is implemented using IndexOf. Here's the implementation.

    public bool Contains(string value)
    {
        return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
    }

So Contains is likely a wee bit slower than calling IndexOf directly, but I doubt that it will have any significance for the actual performance.

Brian Rasmussen
  • 1
    Yes, but to use indexof as a bool, he would have to do the comparison outside the function. That would most likely give the same result as Contains, wouldn't it? – Gonzalo Quero Jan 31 '09 at 12:03
  • 1
    Probably, but you do save one method call (unless it can be inlined). As I said, it will probably never be significant. – Brian Rasmussen Jan 31 '09 at 12:26
6

If you really want to micro-optimise your code, your best approach is always benchmarking.

The .NET Framework has an excellent stopwatch implementation: System.Diagnostics.Stopwatch
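A rough sketch of such a Stopwatch measurement might look like the following. The class name, the `Measure` helper, the synthetic 2000-character input and the iteration count are all assumptions made for illustration (and `Func<>` needs .NET 3.5 or later). It is a crude timing loop, not a rigorous benchmark, and the numbers will vary by framework version and machine.

```csharp
using System;
using System.Diagnostics;

public static class SearchTiming
{
    // Times `iterations` calls of `search` and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Func<string, string, bool> search,
                                   string haystack, string needle, int iterations)
    {
        bool sink = search(haystack, needle); // warm-up call so JIT compilation is not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink ^= search(haystack, needle); // keep the result in use
        }
        sw.Stop();
        GC.KeepAlive(sink);
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Assumed input: about 2000 characters with the token at the very end.
        string haystack = new string('x', 1997) + "fox";
        string needle = "fox";
        int iterations = 100000;

        Console.WriteLine("Contains:          " +
            Measure((h, n) => h.Contains(n), haystack, needle, iterations));
        Console.WriteLine("IndexOf (default): " +
            Measure((h, n) => h.IndexOf(n) >= 0, haystack, needle, iterations));
        Console.WriteLine("IndexOf (Ordinal): " +
            Measure((h, n) => h.IndexOf(n, StringComparison.Ordinal) >= 0, haystack, needle, iterations));
    }
}
```

Run it several times in a release build and compare the trend rather than a single figure.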

Rui Jarimba
Andrew Harry
  • It's the best **but** if you want a quick approach just press the pause button in a debug session. The code control is likely to halt in the slowest part *roughly 50% of the time*. – Jeremy Thompson Aug 28 '19 at 06:11
  • 2
    @JeremyThompson repeat the "pause debug" method like 10 times and you got yourself a profiler – Default Aug 14 '20 at 05:23
4

From a little reading, it appears that under the hood the String.Contains method simply calls String.IndexOf. The difference is String.Contains returns a boolean while String.IndexOf returns an integer with (-1) representing that the substring was not found.

I would suggest writing a little test with 100,000 or so iterations and seeing for yourself. If I were to guess, I'd say that IndexOf may be slightly faster, but like I said, it's just a guess.

Jeff Atwood has a good article on strings at his blog. It's more about concatenation but may be helpful nonetheless.

Mike Roosa
3

Just as an update to this: I've been doing some testing, and provided your input string is fairly large, parallel Regex is the fastest C# method I've found (assuming you have more than one core, I imagine).

Getting the total number of matches, for example:

needles.AsParallel().Sum(l => Regex.IsMatch(haystack, Regex.Escape(l)) ? 1 : 0);
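For anyone who wants to run the one-liner above, here is a self-contained sketch. The class and method names and the sample data are made up for illustration. Note that the expression counts how many needles occur at least once in the haystack, not how many times each one occurs.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ParallelNeedleCount
{
    // Counts the needles that appear at least once in the haystack,
    // checking the needles in parallel via PLINQ.
    public static int CountMatchingNeedles(string haystack, string[] needles)
    {
        return needles
            .AsParallel()
            .Sum(n => Regex.IsMatch(haystack, Regex.Escape(n)) ? 1 : 0);
    }

    public static void Main()
    {
        string haystack = "The quick brown fox jumps over the lazy dog";
        string[] needles = { "fox", "cat", "quick" };
        Console.WriteLine(CountMatchingNeedles(haystack, needles));
    }
}
```

Because every needle is escaped, the regex is effectively a literal search. Any gain here comes from spreading the needles across cores, so it is worth measuring against a plain loop over Contains for your own data.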

Hope this helps!

gary
  • 1
    Hi, phild on a separate thread updated this with a version from http://tomasp.net/articles/ahocorasick.aspx which, provided your keywords (needles) don't change, is a lot quicker. – gary Apr 17 '10 at 12:40
3

Tried it today on a 1.3 GB text file. Amongst others every line is checked for existence of a '@' char. 17.000.000 calls to Contains/IndexOf are made. Result: 12.5 sec for all Contains('@') calls, 2.5 sec for all IndexOf('@') calls. => IndexOf performs 5 times faster!! (.Net 4.8)

CSharper
  • I doubt this is a useful benchmark, especially without the code. File I/O time dwarfs string searches. 1.3GB / 12.5 sec is about 100MB/sec, about the speed of a common hard drive. I suspect that you wrote a program to read a file and execute Contains() against it, then read the file again and execute IndexOf(), except now the file is mostly cached, so read time will be far lower. Perhaps about 5 times lower? Benchmarking is hard. – Charles Burns Aug 03 '23 at 20:41
1

Use a benchmark library, like this recent foray from Jon Skeet, to measure it.

Caveat Emptor

As with all (micro-)performance questions, this depends on the versions of the software you are using, the details of the data inspected and the code surrounding the call.

As with all (micro-)performance questions, the first step has to be to get a running version which is easily maintainable. Then benchmarking, profiling and tuning can be applied to the measured bottlenecks instead of guessing.

tckmn
David Schmitt
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – Mike Stockdale May 22 '14 at 01:45
  • the linked library is only one of many, and not the main thrust of the answer. I do not think that posting the libraries source or description would improve the answer, this site or the world. – David Schmitt May 29 '14 at 06:57
  • 4
    -1 ; the question was "Does anyone know if the String.Contains method performs better than String.IndexOf method?" - your answer is "use a benchmark library", which basically means "I don't know, do it yourself", "this depends", which means "I don't know", and "get a running version and profile", which also means "I don't know, do it yourself". This isn't 'Jeopardy' - please provide *an answer* **to the question asked**, not *how-to* **ideas** - their place is in *comments*. –  Jun 20 '14 at 13:29
-10

For anyone still reading this, indexOf() will probably perform better on most enterprise systems, as contains() is not compatible with IE!

Zargontapel