104

I am trying to get the number of occurrences of a certain character such as & in the following string.

string test = "key1=value1&key2=value2&key3=value3";

How do I determine that there are 2 ampersands (&) in the above test string variable?

dotnet-practitioner
  • @CodeInChaos Because some people, when confronted with a problem, think "I know, I'll use regular expressions." – Tanzelax Apr 30 '12 at 22:44
  • @Tanzelax. [Like this one](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) ? **:-)** – gdoron Apr 30 '12 at 22:47
  • More fun answers [here](http://stackoverflow.com/questions/541954), though they handle chars **and** strings in strings. Benchmarks, etc., included. – ruffin Oct 11 '12 at 13:27
  • Have a look at http://stackoverflow.com/questions/541954/how-would-you-count-occurences-of-a-string-within-a-string-c – Ian G May 31 '13 at 19:15
  • Obviously NOT a duplicate, as this post wants to count a character, not a string. That notwithstanding, it should be noted that most answers in the linked post, including the accepted one, are WRONG (in that they count characters rather than occurrences of a string). Wrong + wrong = right, but still one of SO's darkest and most embarrassing spots. – TaW Sep 04 '14 at 14:16
  • @Tanzelax And those same folks also often find themselves thinking, "Crap, now I've got two problems." – arkon Mar 05 '15 at 05:54

6 Answers

227

You could do this:

int count = test.Split('&').Length - 1;

Or with LINQ:

test.Count(x => x == '&');
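
A minimal, self-contained sketch of both approaches, assuming a console program written with top-level statements (C# 9 or later). The variable names `viaSplit` and `viaLinq` are only illustrative; the sample string is the one from the question.

```csharp
using System;
using System.Linq;

string test = "key1=value1&key2=value2&key3=value3";

// Split returns one more piece than there are delimiters, hence the "- 1".
int viaSplit = test.Split('&').Length - 1;

// Enumerable.Count with a predicate walks the characters one by one
// and does not build an intermediate array of substrings.
int viaLinq = test.Count(x => x == '&');

Console.WriteLine(viaSplit); // 2
Console.WriteLine(viaLinq);  // 2
```

Both lines give the same count; the difference is that `Split` also allocates every substring between the delimiters, which is wasted work when only the count is needed.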
Michael Frederick
  • It's worth noting the first approach could be incredibly expensive, if the string is large. Worst case if the string is large and (almost) entirely made up of repeated delimiters (&), it could allocate 12-24x the original size of the string due to object overheads in .Net. I would go with the second approach, and if that's not fast enough then write a for loop. – Niall Connaughton Dec 01 '16 at 01:45
28

Because LINQ can do everything...:

string test = "key1=value1&key2=value2&key3=value3";
var count = test.Where(x => x == '&').Count();

Or, if you like, you can use the Count overload that takes a predicate:

var count = test.Count(x => x == '&');
gdoron
  • LINQ is also *slower* at doing everything. [Check out this webpage for benchmarks](http://blogs.davelozinski.com/curiousconsultant/csharp-net-fastest-way-to-check-if-a-string-occurs-within-a-string) if you want *fast* code. – Free Coder 24 Apr 13 '14 at 09:56
  • @FreeCoder24 that's not a problem of LINQ, but rather a bad compiler. E.g. the example should be inlined to a simple loop *(like it does in C++ and Haskell)*. – Hi-Angel Aug 12 '15 at 14:52
  • @FreeCoder24, just as C# is slower than Assembly in everything. If you want _fast_ code, use Assembly. And BTW, LINQ is faster on sorting than the "native" framework methods. – gdoron Jan 20 '16 at 21:03
12

The most straightforward, and most efficient, would be to simply loop through the characters in the string:

int cnt = 0;
foreach (char c in test) {
  if (c == '&') cnt++;
}

You can use LINQ extensions to make a simpler, and almost as efficient version. There is a bit more overhead, but it's still surprisingly close to the loop in performance:

int cnt = test.Count(c => c == '&');

Then there is the old Replace trick. However, that is better suited for languages where looping is awkward (SQL) or slow (VBScript):

int cnt = test.Length - test.Replace("&", "").Length;
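
As a rough sketch, the loop can be wrapped in a reusable method. `CountChar` and `CountCharIndexOf` are hypothetical helper names, not framework methods; the second one shows the `IndexOf`-based variant that comes up in the comments. This assumes top-level statements (C# 9 or later).

```csharp
using System;

// Hypothetical helper: counts occurrences of a character with a plain loop.
static int CountChar(string s, char target)
{
    int count = 0;
    foreach (char c in s)
    {
        if (c == target) count++;
    }
    return count;
}

// Hypothetical helper: same result, jumping from hit to hit with IndexOf.
static int CountCharIndexOf(string s, char target)
{
    int count = 0;
    int pos = s.IndexOf(target);
    while (pos >= 0)
    {
        count++;
        // A start index equal to s.Length is allowed and simply yields -1.
        pos = s.IndexOf(target, pos + 1);
    }
    return count;
}

string test = "key1=value1&key2=value2&key3=value3";

Console.WriteLine(CountChar(test, '&'));        // 2
Console.WriteLine(CountCharIndexOf(test, '&')); // 2
```

Neither helper allocates a new string or array, unlike the `Replace` trick, which builds a second string just to measure its length.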
Peter Mortensen
Guffa
  • _surprisingly close to the loop in performance_ only with rather small haystacks. – TaW Sep 04 '14 at 14:17
  • @TaW: I don't see a significant rate difference between short and long (1MB) strings, but for some reason there is a bigger difference in x64 mode than in x86 mode. – Guffa Sep 04 '14 at 17:45
  • I didn't test the char count version, but the linq string count slows down more and more with longer strings and finally dies with a oom exception. 1MB is not yet a problem though. – TaW Sep 04 '14 at 17:58
  • @TaW: I tried it with a 2 GB string, and that works. Any larger and I get an oom exception when creating the string. – Guffa Sep 04 '14 at 18:36
  • OK, then I guess it is only the string count that breaks down.. See [here](http://stackoverflow.com/questions/541954/how-would-you-count-occurrences-of-a-string-within-a-string) – TaW Sep 04 '14 at 18:40
  • @TaW: It's not a surprise if some of those break with larger strings. The solution using `Replace` creates another string that can be as large as the original, `Split` creates an array of strings that together can be as large as the original, `Regex.Matches` creates a `Match` object for each string found, and `Select(Substring())` creates a terrible number of strings. – Guffa Sep 04 '14 at 19:52
  • And according to [these performance tests](http://cc.davelozinski.com/c-sharp/fastest-way-count-number-times-character-occurs-string) a straight for-loop is the best performer over any of the linq methods. –  Dec 12 '14 at 08:13
  • "most efficient", I disagree, the longer the string gets the slower it is. Yet your last answer "old Replace trick" is "almost" the best. 10 seconds vs 5 seconds for a long string. The approach with string split is just milliseconds shorter. – Pawel Cioch Oct 19 '15 at 13:28
  • @PawelCioch: It will always be slower the longer the string is. There is no magic way to process all of the string without processing all of the string. – Guffa Oct 19 '15 at 13:39
  • @Guffa Yes, but in the long run, since I'm working on performance-critical code, I tested all combinations, and the Replace and Split methods or an IndexOf algorithm will always be faster than traversing the string or char array and checking each char for equality. My comment was not meant to put you in a bad light, but the statement that the loop is "most efficient" is not true, and this is a heads-up for others who may be looking for a high-performance solution. I do agree it is "the most straightforward". Thanks! – Pawel Cioch Oct 19 '15 at 14:18
  • @PawelCioch: There has to be something wrong with your performance test. The `Replace`, `Split` or `IndexOf` can't be faster than traversing the string and checking each character, as that is exactly what they are doing, only adding extra overhead. – Guffa Oct 19 '15 at 14:24
10

Why use regex for that? String implements IEnumerable<char>, so you can just use LINQ.

test.Count(c => c == '&')
Brian Rasmussen
8

Your string example looks like the query string part of a GET request. If so, note that HttpContext can help: the current request exposes the parsed query string, so you can count the arguments directly.

int numberOfArgs = HttpContext.Current.Request.QueryString.Count;

For more of what you can do with QueryString, see NameValueCollection
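
Outside of a running web request, the same kind of `NameValueCollection` can be obtained from a plain string. The following is a minimal sketch assuming `System.Web.HttpUtility` is available (on .NET Framework this needs a reference to System.Web) and top-level statements (C# 9 or later); the variable name `args` is only illustrative.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

string test = "key1=value1&key2=value2&key3=value3";

// ParseQueryString splits on '&' and '=' and URL-decodes keys and values.
NameValueCollection args = HttpUtility.ParseQueryString(test);

Console.WriteLine(args.Count);   // 3
Console.WriteLine(args["key2"]); // value2
```

Note that this counts arguments rather than ampersands, so for the sample string it reports 3, one more than the number of `&` characters.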

payo
6

Here is the most inefficient way of all the answers to get the count. But you'll get a Dictionary that contains the key-value pairs as a bonus.

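using System.Linq;
using System.Text.RegularExpressions;

string test = "key1=value1&key2=value2&key3=value3";

// Each match is one key=value pair: group 1 is the key, group 2 is the value.
var keyValues = Regex.Matches(test, @"([\w\d]+)=([\w\d]+)[&$]*")
                     .Cast<Match>()
                     .ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);

// N pairs are separated by N - 1 ampersands.
var count = keyValues.Count - 1;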
L.B