Number of occurrences of a character in a string

Question

I am trying to get the number of occurrences of a certain character such as & in the following string.

string test = "key1=value1&key2=value2&key3=value3";

How do I determine that there are 2 ampersands (&) in the above test string variable?

@CodeInChaos Because some people, when confronted with a problem, think "I know, I'll use regular expressions." — Tanzelax, Apr 30 '12 at 22:44
@Tanzelax. [Like this one](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) ? **:-)** — gdoron, Apr 30 '12 at 22:47
More fun answers [here](http://stackoverflow.com/questions/541954), though they handle chars **and** strings in strings. Benchmarks, etc., included. — ruffin, Oct 11 '12 at 13:27
Have a look at http://stackoverflow.com/questions/541954/how-would-you-count-occurences-of-a-string-within-a-string-c — Ian G, May 31 '13 at 19:15
Obviously NOT a duplicate, as this post wants to count a character, not a string. That notwithstanding, it should be noted that most answers in the linked post, including the accepted one, are WRONG (in that they count characters, not string occurrences). Wrong + wrong = right, but still one of SO's darkest and most embarrassing spots. — TaW, Sep 04 '14 at 14:16
@Tanzelax And those same folks also often find themselves thinking, "Crap, now I've got two problems." — arkon, Mar 05 '15 at 05:54

score 227 · Accepted Answer · answered Apr 30 '12 at 22:41

You could do this:

int count = test.Split('&').Length - 1;

Or with LINQ:

test.Count(x => x == '&');
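
As a minimal sketch, the two one-liners above could be wrapped like this. The class and method names (`AmpersandCount`, `BySplit`, `ByLinq`) are made up for illustration and are not part of the original answer. Note that the LINQ version only compiles with `using System.Linq;`.

```csharp
using System;
using System.Linq; // required for the Count(predicate) extension method

// Hypothetical helper class, named here only for illustration.
public static class AmpersandCount
{
    // Split keeps empty entries by default, so it always returns
    // one more piece than there are separators in the string.
    public static int BySplit(string s, char c) => s.Split(c).Length - 1;

    // A string is an IEnumerable<char>, so LINQ can count matching chars directly.
    public static int ByLinq(string s, char c) => s.Count(x => x == c);
}
```

Called with the `test` string from the question and `'&'`, both methods should return 2.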

answered Apr 30 '12 at 22:41

Michael Frederick


It's worth noting the first approach could be incredibly expensive if the string is large. In the worst case, where the string is large and (almost) entirely made up of repeated delimiters (&), it could allocate 12-24x the original size of the string due to object overheads in .NET. I would go with the second approach, and if that's not fast enough, write a for loop. – Niall Connaughton Dec 01 '16 at 01:45

gdoron · Answer 2 · 2013-10-07T13:10:39.090

28

Because LINQ can do everything...:

string test = "key1=value1&key2=value2&key3=value3";
var count = test.Where(x => x == '&').Count();

Or, if you like, you can use the Count overload that takes a predicate:

var count = test.Count(x => x == '&');

edited Oct 07 '13 at 13:10

answered Apr 30 '12 at 22:41

gdoron


LINQ is also *slower* at doing everything. [Check out this webpage for benchmarks](http://blogs.davelozinski.com/curiousconsultant/csharp-net-fastest-way-to-check-if-a-string-occurs-within-a-string) if you want *fast* code. – Free Coder 24 Apr 13 '14 at 09:56
@FreeCoder24 that's not a problem of LINQ, but rather a bad compiler. E.g. the example should be inlined to a simple loop *(like it does in C++ and Haskell)*. – Hi-Angel Aug 12 '15 at 14:52
@FreeCoder24, just as C# is slower than Assembly in everything. If you want _fast_ code, use Assembly. And BTW, LINQ is faster on sorting than the "native" framework methods. – gdoron Jan 20 '16 at 21:03

score 12 · Answer 3 · edited Jul 08 '23 at 23:03

The most straightforward, and most efficient, would be to simply loop through the characters in the string:

int cnt = 0;
foreach (char c in test) {
  if (c == '&') cnt++;
}
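
If the loop is needed in several places, one possible way to package it is as an extension method. This is only a sketch: the names `StringExtensions` and `CountOf` are hypothetical, and the indexed `for` loop is simply an alternative spelling of the `foreach` above.

```csharp
using System;

// Hypothetical extension class; the names are illustrative only.
public static class StringExtensions
{
    // Counts how many times 'target' occurs in 'text' with a single pass
    // and no intermediate allocations.
    public static int CountOf(this string text, char target)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == target) count++;
        }
        return count;
    }
}
```

With this in scope, `test.CountOf('&')` would give the same result as the loop above.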

You can use LINQ extensions to make a simpler, and almost as efficient version. There is a bit more overhead, but it's still surprisingly close to the loop in performance:

int cnt = test.Count(c => c == '&');

Then there is the old Replace trick. However, that is better suited for languages where looping is awkward (SQL) or slow (VBScript):

int cnt = test.Length - test.Replace("&", "").Length;

edited Jul 08 '23 at 23:03

Peter Mortensen


answered Apr 30 '12 at 22:48

Guffa


_surprisingly close to the loop in performance_ only with rather small haystacks. – TaW Sep 04 '14 at 14:17
@TaW: I don't see a significant rate difference between short and long (1MB) strings, but for some reason there is a bigger difference in x64 mode than in x86 mode. – Guffa Sep 04 '14 at 17:45
I didn't test the char count version, but the linq string count slows down more and more with longer strings and finally dies with a oom exception. 1MB is not yet a problem though. – TaW Sep 04 '14 at 17:58
@TaW: I tried it with a 2GB string, and that works. Any larger and I get an OOM exception when creating the string. – Guffa Sep 04 '14 at 18:36
OK, then I guess it is only the string count that breaks down.. See [here](http://stackoverflow.com/questions/541954/how-would-you-count-occurrences-of-a-string-within-a-string) – TaW Sep 04 '14 at 18:40
@TaW: It's not a surprise if some of those break with larger strings. The solution using `Replace` creates another string that can be as large as the original, `Split` creates an array of strings that together can be as large as the original, `Regex.Matches` creates a `Match` object for each string found, and `Select(Substring())` creates a terrible number of strings. – Guffa Sep 04 '14 at 19:52
And according to [these performance tests](http://cc.davelozinski.com/c-sharp/fastest-way-count-number-times-character-occurs-string) a straight for-loop is the best performer over any of the linq methods. – Dec 12 '14 at 08:13
"most efficient": I disagree, the longer the string gets, the slower it is. Yet your last answer, the "old Replace trick", is "almost" the best: 10 seconds vs 5 seconds for a long string. The approach with string split is just milliseconds shorter. – Pawel Cioch Oct 19 '15 at 13:28
@PawelCioch: It will always be slower the longer the string is. There is no magic way to process all of the string without processing all of the string. – Guffa Oct 19 '15 at 13:39
@Guffa Yes, but in the long run, since I'm working on performance-critical code, I tested all combinations, and the Replace and Split methods or an IndexOf algorithm will always be faster than traversing the string or char array and checking each char for equality. Also, the intention of my comment was not to put you in a bad light, but the statement that the loop is "most efficient" is not true, and this is a heads-up for others who may be looking for a performance solution. Yet I agree it is "the most straightforward". Thanks! – Pawel Cioch Oct 19 '15 at 14:18
@PawelCioch: There has to be something wrong with your performance test. The `Replace`, `Split` or `IndexOf` can't be faster than traversing the string and checking each character, as that is exactly what they are doing, only adding extra overhead. – Guffa Oct 19 '15 at 14:24

score 10 · Answer 4 · answered Apr 30 '12 at 22:41

Why use regex for that? String implements IEnumerable<char>, so you can just use LINQ.

test.Count(c => c == '&')

answered Apr 30 '12 at 22:41

Brian Rasmussen


score 8 · Answer 5 · answered Apr 30 '12 at 22:51

Your string example looks like the query string part of a GET request. If so, note that HttpContext has some help for you:

int numberOfArgs = HttpContext.Current.Request.QueryString.Count;

For more of what you can do with QueryString, see NameValueCollection
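
Outside of a live request, where `HttpContext.Current` is not available, a similar `NameValueCollection` can be built from a raw string with `HttpUtility.ParseQueryString`. The sketch below assumes a reference to `System.Web` (or the `System.Web.HttpUtility` assembly on newer runtimes); the class and method names are hypothetical.

```csharp
using System.Collections.Specialized;
using System.Web;

// Hypothetical wrapper, named only for illustration.
public static class QueryStringDemo
{
    // Parses "a=1&b=2" style text and returns the number of distinct keys.
    // Note: repeated keys are merged into one entry, so this is not
    // always "number of ampersands + 1".
    public static int CountPairs(string query)
    {
        NameValueCollection pairs = HttpUtility.ParseQueryString(query);
        return pairs.Count;
    }
}
```

For the string in the question this should return 3, that is, one more than the number of ampersands.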

answered Apr 30 '12 at 22:51

payo


score 6 · Answer 6 · answered Apr 30 '12 at 23:02

Here is the most inefficient way to get the count among all the answers. But you'll get a Dictionary that contains the key-value pairs as a bonus.

string test = "key1=value1&key2=value2&key3=value3";

var keyValues = Regex.Matches(test, @"([\w\d]+)=([\w\d]+)[&$]*")
                     .Cast<Match>()
                     .ToDictionary(m => m.Groups[1].Value, m => m.Groups[2].Value);

var count = keyValues.Count - 1;

answered Apr 30 '12 at 23:02

L.B


haha, "most inefficient way", love it! – payo Apr 30 '12 at 23:09
Put this as an Q&A tagged `code-trolling` on http://codegolf.stackexchange.com – This company is turning evil. Jan 14 '14 at 17:18
