1033

I realised I wanted to count how many /s I could find in a string, and then it struck me that there were several ways to do it, but I couldn't decide which was the best (or easiest).

At the moment I'm going with something like:

string source = "/once/upon/a/time/";
int count = source.Length - source.Replace("/", "").Length;

But I don't like it at all, any takers?

I don't really want to dig out RegEx for this, do I?

I know my string is going to have the term I'm searching for, so you can assume that...

Of course for strings where length > 1,

string haystack = "/once/upon/a/time";
string needle = "/";
int needleCount = ( haystack.Length - haystack.Replace(needle,"").Length ) / needle.Length;
alelom
Ian G
  • 42
    +1: i must say that its a very different way of doing count. i am surprised at the bench mark test results :) – naveen Mar 15 '12 at 15:26
  • 5
    It's not so different... it's the typical way to implement this functionality in SQL: `LEN(ColumnToCheck) - LEN(REPLACE(ColumnToCheck,"N",""))`. – Sheridan Jan 15 '13 at 16:02
  • 7
    As a matter of fact you should divide by "/".Length – Gerard Mar 20 '13 at 20:13
  • 4
    May I ask, what would your requirements say the count should be for the number of occurrences of "//" within "/////"? 2 or 4? – Les Jun 02 '14 at 13:11
  • 1
    using regex is probably the best way to go about it – Adam Higgins May 07 '15 at 22:52
  • what do you mean "dig out RegEx", I guess you would mean the same regarding Linq as that may be more obscure and might not have any less overhead if any? – barlop Jan 14 '19 at 14:03

35 Answers

1205

If you're using .NET 3.5 you can do this in a one-liner with LINQ:

int count = source.Count(f => f == '/');

If you don't want to use LINQ you can do it with:

int count = source.Split('/').Length - 1;

You might be surprised to learn that your original technique seems to be about 30% faster than either of these! I've just done a quick benchmark with "/once/upon/a/time/" and the results are as follows:

Your original = 12s
source.Count = 19s
source.Split = 17s
foreach (from bobwienholt's answer) = 10s

(The times are for 50,000,000 iterations so you're unlikely to notice much difference in the real world.)
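
For anyone wanting to reproduce timings like these, here is a minimal sketch of a Stopwatch harness. The names (CountBenchmark, Time, ViaReplace and so on) are made up for illustration and this is not the code that produced the figures above; absolute numbers will vary with machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class CountBenchmark
{
    // Results are accumulated here so the JIT cannot discard the calls.
    public static int Sink;

    // Hypothetical helper: runs `counter` on `source` `iterations` times
    // and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<string, int> counter, string source, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Sink += counter(source);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // The four single-character techniques discussed in this answer.
    public static int ViaReplace(string s) => s.Length - s.Replace("/", "").Length;
    public static int ViaLinq(string s) => s.Count(c => c == '/');
    public static int ViaSplit(string s) => s.Split('/').Length - 1;

    public static int ViaForeach(string s)
    {
        int count = 0;
        foreach (char c in s)
            if (c == '/') count++;
        return count;
    }
}
```

Calling something like CountBenchmark.Time(CountBenchmark.ViaSplit, "/once/upon/a/time/", 50000000) for each technique gives comparable figures; a dedicated tool such as BenchmarkDotNet would be more rigorous.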

Community
LukeH
  • 1
    Did you know you can call Count(predicate) on a string without having to convert it to a character array? See my answer above. – Judah Gabriel Himango Feb 12 '09 at 16:04
  • @Judah, You're right, but weirdly VS2008 isn't giving me intellisense for source.Count - it compiles and runs fine though, so +1 for your answer. – LukeH Feb 12 '09 at 16:10
  • 1
    @in.spite, Your original code also has the advantage that with a small tweak (as in ZombieSheep's answer) you can search for strings of arbitrary length rather than just single character. – LukeH Feb 12 '09 at 17:36
  • 7
    Yeah, VS hides LINQ extension methods on the string class. I guess they figured devs wouldn't want all those extension methods to show up on the string class. Probably a wise decision. – Judah Gabriel Himango Feb 15 '09 at 23:27
  • String implements IEnumerable. Not sure how many people realize that. That is strange about the IntelliSense. VS2008 doesn't show it, but VS2010 does. – Bryan Jun 17 '11 at 16:11
  • 11
    It's possible this behaviour is because VS2010 automatically includes System.Linq in new class files, VS2008 probably does not. The namespace needs to be in for the intellisense to work. – Sprague Jul 12 '12 at 08:13
  • 37
    Note that the Count and Split solutions will only work when you're counting characters. They will not work with strings, like the OP's solution does. – Peter Lillevold May 07 '14 at 09:03
  • 2
    Also worth noting that if you monitor memory usage with System.GC.GetTotalMemory(false).. Repeating the 50 million iteration test, I see about 2,000,000 bytes ready for garbage collection after LINQ. With the foreach loop... zero. LINQ might look slick but go old school if you are in high repetition areas of code. – user922020 Mar 27 '15 at 21:22
  • 1
    @PeterLillevold Actually, there is an overload of Split that accepts strings: int count = source.Split(new string[] {"asdf"}, StringSplitOptions.None ).Length - 1; – heringer Nov 13 '15 at 16:21
  • 6
    `f == '\' ` is about chars in a string, not strings in a string – Thomas Weller Aug 26 '16 at 14:13
  • 11
    This seems like the answer to a different question: "How would you count occurrences of a char within a string?" – Ben Aaronson Nov 28 '16 at 13:34
211
string source = "/once/upon/a/time/";
int count = 0;
foreach (char c in source) 
  if (c == '/') count++;

Has to be faster than the source.Replace() by itself.

BartoszKP
bobwienholt
  • 20
    You could gain a marginal improvement by switching to a for instead of a foreach, but only a tiny, tiny bit. – Mark Feb 12 '09 at 18:13
  • 19
    No. The question asks to count occurence of string, not character. – YukiSakura Dec 07 '15 at 09:47
  • @Mark it should be faster - foreach creates an enumerator object and invokes some methods per iteration. And we're only talking about tiny, tiny bits of improvement anyway –  Apr 08 '16 at 15:23
  • 4
    This is counting characters in a string. The title is about counting strings in a string – Thomas Weller Aug 26 '16 at 14:11
  • 3
    @Mark Just tested it with a for loop and it was actually slower than using foreach. Could be because of bounds-checking? (Time was 1.65 sec vs 2.05 on 5 mil iterations.) – Measurity Dec 13 '16 at 09:05
  • 12
    While the question is asking for a string within a string, the example problem OP posted is actually just one character, in which case I would call this answer still a valid solution, as it shows a better way (char search instead of string search) to address the problem at hand. – Chad Feb 23 '17 at 21:39
148
int count = new Regex(Regex.Escape(needle)).Matches(haystack).Count;
Gabe
Yet Another Code Maker
90

If you want to be able to search for whole strings, and not just characters:

src.Select((c, i) => src.Substring(i))
    .Count(sub => sub.StartsWith(target))

Read as "for each character in the string, take the rest of the string starting from that character as a substring; count it if it starts with the target string."
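
The same overlapping count can be had without building a substring per position. This is only a sketch, not the approach above: OverlapCounter and CountOverlapping are hypothetical names, and it compares ordinally via string.CompareOrdinal, whereas StartsWith(string) is culture-sensitive, so results may differ for some inputs.

```csharp
using System;

public static class OverlapCounter
{
    // Counts every position at which `target` occurs in `src`,
    // including overlapping occurrences, without allocating substrings.
    public static int CountOverlapping(string src, string target)
    {
        if (string.IsNullOrEmpty(src) || string.IsNullOrEmpty(target))
            return 0; // assumption: treat empty input as "no matches"

        int count = 0;
        for (int i = 0; i + target.Length <= src.Length; i++)
        {
            // Compare target.Length chars of src starting at i against target.
            if (string.CompareOrdinal(src, i, target, 0, target.Length) == 0)
                count++;
        }
        return count;
    }
}
```

With this, OverlapCounter.CountOverlapping("lolololololol", "lol") gives 6, matching the overlapping behaviour of the LINQ version.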

idbrii
mqp
  • 1
    Not sure how I can explain it in a clearer way than the description given. What is confusing? – mqp Mar 08 '12 at 20:17
  • 69
    **SUPER SLOW!** *Tried it on a page of html and it took about 2 minutes as versus other methods on this page that took 2 seconds. The answer was correct; it was just too slow to be usable.* – JohnB Jun 20 '12 at 21:51
  • 2
    agreed, too slow. i'm a big fan of linq-style solutions but this one is just not viable. – Sprague Jul 12 '12 at 08:09
  • 5
    Note that the reason this is so slow is that it creates n strings, thus allocating roughly n^2/2 bytes. – Peter Crabtree Feb 07 '13 at 19:32
  • Super slow but answers the question title (question body is different though) – nawfal Apr 25 '13 at 14:40
  • 7
    OutOfMemoryException is thrown for my 210000 chars of string. – ender Sep 13 '13 at 08:47
  • this seems to be the slowest **BUT** the only one that fits my needs! e.g. searching for the string "lol" in "lolololololol" should result in 6 occurences whereas all the other methods here return 3. as they count like this "**lol** o **lol** o **lol** ol" – cmxl Oct 30 '15 at 14:31
  • 1
    `src.Where((c, i) => src.Skip(i).Take(target.Length).SequenceEqual(target)).Count()` consumes less memory and runs faster :) – EriF89 Feb 22 '19 at 15:00
  • 2
    Now, with the new `Span` APIs, you can do this m̶o̶r̶e̶ ̶e̶f̶f̶i̶c̶i̶e̶n̶t̶l̶y̵ _less inefficiently_. :) First prep two variables, `srcSpan = src.AsSpan()` and `targetSpan = target.AsSpan()`. Then replace `src.Substring(i)` by `srcSpan.Slice(i)`, and replace `sub.StartsWith(target)` by `sub.StartsWith(targetSpan)`. This avoids the ridiculous number of heap allocations, but not the O(N^2) time complexity. – Timo Jul 02 '19 at 12:17
  • @Timo. I think time is close to `O(N)`. REASON: There are `N` `StartsWith` calls. Most of the time, the first character will be different, thus a well-written StartsWith should return `false` quickly, regardless of substring length. – ToolmakerSteve Jan 07 '23 at 00:36
  • @ToolmakerSteve `StartsWith` is not the issue. For each char in the string, _all remaining chars are copied_, from that char up to the end. That is N chars * (N/2 average length), copying `O(N^2)` chars in total. Other commenters have observed the time (and space) implications of this in practice. – Timo Jan 09 '23 at 12:07
  • @Timo - I was responding to the last line of your comment. I don't understand how your suggestion (to use `Slice`) can *avoid the ridiculous number of heap allocations*, yet NOT avoid *copying all remaining characters*. If `Slice` copies characters, then its still doing allocation ber index. How can it be one (avoid allocations), but not the other (still N^2)? I assumed from your comment that `Slice` used pointers "under the hood", to avoid copy. – ToolmakerSteve Jan 10 '23 at 01:26
  • Indeed, `Slice` returns a new struct that points to a segment of the original sequence. – Timo Jan 10 '23 at 12:42
76

I did some research and found that Richard Watson's solution is the fastest in most cases. Here is a table with the results of every solution in the post (except those that use Regex, because it throws exceptions while parsing strings like "test{test"):

    Name           | Short/char |  Long/char | Short/short | Long/short |  Long/long |
    Inspite        |        134 |       1853 |          95 |       1146 |        671 |
    LukeH_1        |        346 |       4490 |         N/A |        N/A |        N/A |
    LukeH_2        |        152 |       1569 |         197 |       2425 |       2171 |
    Bobwienholt    |        230 |       3269 |         N/A |        N/A |        N/A |
    Richard Watson |         33 |        298 |         146 |        737 |        543 |
    StefanosKargas |        N/A |        N/A |         681 |      11884 |      12486 |

You can see that when counting occurrences of short substrings (1-5 characters) in a short string (10-50 characters), the original algorithm is preferred.

Also, for multi-character substrings you should use the following code (based on Richard Watson's solution):

int count = 0, n = 0;

if(substring != "")
{
    while ((n = source.IndexOf(substring, n, StringComparison.InvariantCulture)) != -1)
    {
        n += substring.Length;
        ++count;
    }
}
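
The loop above skips past each match, so "aa" is found once in "aaa". Commenters on this answer suggest an overlapped flag for the other behaviour; a sketch of that idea follows. SubstringCounter and its Count method are hypothetical names, and StringComparison.Ordinal is an assumption swapped in for InvariantCulture so that the matched length always equals substring.Length.

```csharp
using System;

public static class SubstringCounter
{
    // Counts occurrences of `substring` in `source`.
    // overlapped = false: resume searching after the end of each match.
    // overlapped = true:  resume one character after the start of each match.
    public static int Count(string source, string substring, bool overlapped = false)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(substring))
            return 0; // also avoids an endless loop on an empty search term

        int count = 0, n = 0;
        while ((n = source.IndexOf(substring, n, StringComparison.Ordinal)) != -1)
        {
            n += overlapped ? 1 : substring.Length;
            ++count;
        }
        return count;
    }
}
```

SubstringCounter.Count("aaa", "aa") returns 1, while SubstringCounter.Count("aaa", "aa", overlapped: true) returns 2.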
Community
tsionyx
  • I was about to add my own 'low level' solution (without creating substrings, using replace/split, or any Regex/Linq), but yours is possibly even better than mine (and at least shorter). Thanks! – Dan W Aug 03 '12 at 20:03
  • For the Regex solutions, add in a `Regex.Escape(needle)` – Thymine Jun 14 '13 at 14:57
  • 2
    Just to point out for others, search value needs to be checked if empty, otherwise you will get into an infinite loop. – WhoIsRich May 30 '14 at 11:43
  • 4
    Maybe it's just me, but for `source="aaa" substring="aa"` I expected to get back 2, not 1. To "fix" this, change `n += substring.Length` to `n++` – ytoledano Sep 01 '16 at 20:35
  • you can add the `overlapped` flag to meet your case like this: `overlapped=True;.... if(overlapped) {++n;} else {n += substring.Length;}` – tsionyx Sep 02 '16 at 11:40
  • Did some performance testing as well and found that for multicharacter substring Ben's solution using string.Replace is about three times faster than Richard Watson's solution. – Chronicle Dec 09 '19 at 19:49
75

LINQ works on all collections, and since strings are just a collection of characters, how about this nice little one-liner:

var count = source.Count(c => c == '/');

Make sure you have using System.Linq; at the top of your code file, as .Count is an extension method from that namespace.

Chad
Judah Gabriel Himango
  • 5
    Is it really worth using var there? Is there any chance Count will be replaced with something that doesn't return an int? – Whatsit Feb 12 '09 at 19:01
  • 80
    @Whatsit: you can type 'var' with just your left hand while 'int' requires both hands ;) – Sean Bright Feb 12 '09 at 22:05
  • 1
    Whatsit, you may prefer int there. I'll often use var for locals even if the variable type is obvious. Personal preference. – Judah Gabriel Himango Feb 15 '09 at 23:24
  • 8
    `int` letters all reside in home keys, while `var` doesn't. uh.. wait, i'm using Dvorak – Michael Buen May 07 '10 at 14:40
  • I can't get it to compile :( Does nlt understand Count on Strings – Bohn Jan 27 '12 at 16:36
  • 2
    @BDotA Make sure you have a 'using System.Linq;' at the top of your file. Also, intellisense might hide the .Count call from you since it's a string. Even so, it will compile and run just fine. – Judah Gabriel Himango Jan 27 '12 at 19:25
  • 4
    @JudahGabrielHimango I would argue that var should be used _especially_ when the variable type is obvious (and for brevity and consistency) – EriF89 Feb 22 '19 at 15:03
61
string source = "/once/upon/a/time/";
int count = 0;
int n = 0;

while ((n = source.IndexOf('/', n)) != -1)
{
   n++;
   count++;
}

On my computer it's about 2 seconds faster than the for-every-character solution for 50 million iterations.

2013 revision:

Change the string to a char[] and iterate through that. Cuts a further second or two off the total time for 50m iterations!

char[] testchars = source.ToCharArray();
foreach (char c in testchars)
{
     if (c == '/')
         count++;
}

This is quicker still:

char[] testchars = source.ToCharArray();
int length = testchars.Length;
for (int n = 0; n < length; n++)
{
    if (testchars[n] == '/')
        count++;
}

For good measure, iterating from the end of the array to 0 seems to be the fastest, by about 5%.

int length = testchars.Length;
for (int n = length-1; n >= 0; n--)
{
    if (testchars[n] == '/')
        count++;
}

I was wondering why this could be and was Googling around (I recall something about reverse iterating being quicker), and came upon this SO question which annoyingly uses the string to char[] technique already. I think the reversal trick is new in this context, though.

What is the fastest way to iterate through individual characters in a string in C#?

Community
Richard Watson
  • 2
    You could put `source.IndexOf('/', n + 1)` and lose the `n++` and the brackets of the while :) Also, put a variable `string word = "/"` instead of the character. – neeKo Dec 13 '12 at 04:59
  • 1
    Hey Niko, checkout new answers. Might be harder to make variable-length substring, though. – Richard Watson Feb 19 '13 at 12:14
  • I used something similar by stepping through the subtring; that's until I realized indexOf has a startIndex. I like the first solution the most as it's a good balance between speed and memory footprint. – Samir Banjanovic Sep 30 '13 at 18:39
  • 2
    I read somewhere that it's faster to iterate backwards because it's faster to compare a value to 0 – reggaeguitar Feb 25 '15 at 22:46
  • @RichardWatson Is `ToCharArray` a cheap operation? – shitpoet Dec 16 '18 at 09:51
  • 1
    @shitpoet yup. If you look at the underlying code, it's a native call. public char[] toCharArray() {... System.arraycopy(value, 0, result, 0, value.length); ... } – Richard Watson Dec 18 '18 at 12:29
  • As this answer seems to be technically wrong, but performance oriented, i'd suggest that you could make it faster yet by doing `for (int n = testchars.Length; --n >= 0;)` and dumping the first line. That'll save a variable assignment that only gets used once. It would also save 3 characters in the for construct. – Takophiliac Feb 04 '21 at 15:56
  • NOTE: If need to search for a string (rather than char), use the FIRST code snippet. `IndexOf` has an overload that takes a substring to look for. – ToolmakerSteve Jan 07 '23 at 00:23
47

These both only work for single-character search terms...

countOccurences("the", "the answer is the answer");

int countOccurences(string needle, string haystack)
{
    return (haystack.Length - haystack.Replace(needle,"").Length) / needle.Length;
}

may turn out to be better for longer needles...

But there has to be a more elegant way. :)
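
One thing to watch with the Replace approach is an empty or null search term, which makes the expression fail. A guarded variant might look like the sketch below; ReplaceCounter is a hypothetical name, and throwing ArgumentException (rather than returning 0) is just one possible choice.

```csharp
using System;

public static class ReplaceCounter
{
    // Counts non-overlapping occurrences of `needle` in `haystack`
    // by measuring how much shorter the string gets once they are removed.
    public static int CountOccurrences(string needle, string haystack)
    {
        if (needle == null) throw new ArgumentNullException(nameof(needle));
        if (haystack == null) throw new ArgumentNullException(nameof(haystack));
        if (needle.Length == 0)
            throw new ArgumentException("Search term must not be empty.", nameof(needle));

        return (haystack.Length - haystack.Replace(needle, "").Length) / needle.Length;
    }
}
```

Usage is unchanged: ReplaceCounter.CountOccurrences("the", "the answer is the answer") returns 2.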

ZombieSheep
22

Edit:

source.Split('/').Length-1
Cody Guldner
Brian Rudolph
  • 3
    This is what I do. And `source.Split(new[]{"//"}, StringSplitOptions.None).Count - 1` for multi-character separators. – bzlm Oct 12 '09 at 10:05
  • 4
    This would perform at least n string allocations on the heap, plus (possibly) few array re-sizes - and all this just to get the count? Extremely inefficient, doesn't scale well and should never be used in any important code. – Zar Shardan Dec 13 '12 at 04:16
  • Note: use `Length` if `Count` causes problems. – Gray Programmerz Jul 10 '22 at 06:35
20
Regex.Matches(input,  Regex.Escape("stringToMatch")).Count
cederlof
16

In C#, a nice String SubString counter is this unexpectedly tricky fellow:

public static int CCount(String haystack, String needle)
{
    return haystack.Split(new[] { needle }, StringSplitOptions.None).Length - 1;
}
Dave
  • 1
    Nice solution - and working for string too (not just char)! – ChriPf Apr 19 '16 at 06:20
  • Thanks, it's all too easy to forget some of the subtleties of string handling when swapping languages - like most of us have to these days! – Dave Apr 28 '16 at 13:12
  • 2
    -1 because: Do you know the difference between Count() and Count or Length ? If someone is using Count() instead of Count or Length I get triggered. Count() creates IEnumerator then goes thru all occurences of the IEnumerable whereas Count or Length are already set properties of the object which already hold the count you want without the need to iterate over all the elements. – aeroson Feb 13 '17 at 19:04
  • Good spot, and what's weird is that in my library, from where I took the function, I am using "Length". Edited! – Dave Feb 14 '17 at 08:51
  • This solution only finds `aa` three times in `aaaaaa` while it actually occurs 5 times – ProfK Dec 28 '21 at 09:37
  • 3 is the answer I'd like - to behave like a word count - but you're right too, of course. Neither answer is "wrong" but the developer needs to be know which is correct for their situation. – Dave Jan 10 '22 at 09:55
  • @aeroson - This isn't entirely true. As the [documentation of Enumerable.Count](https://learn.microsoft.com/en-us/dotnet/api/system.linq.enumerable.count?view=net-6.0#system-linq-enumerable-count-1(system-collections-generic-ienumerable((-0)))) specifies in the "Remarks" section, it will use the `Count` property if the enumerable implements the `ICollection` interface. This is the case for most (all?) collections providing a `Count` property (including arrays and lists): _If the type of source implements ICollection, that implementation is used to obtain the count of elements._ – gehho Jan 26 '22 at 14:12
14
private int CountWords(string text, string word) {
    int count = (text.Length - text.Replace(word, "").Length) / word.Length;
    return count;
}

Because the original solution was the fastest for chars, I suppose it will also be the fastest for strings. So here is my contribution.

For the context: I was looking for words like 'failed' and 'succeeded' in a log file.

Gr, Ben

Ben
  • 3
    Just don't pass an empty string for the "word" variable (division by zero error). – Andrew Jens Mar 09 '18 at 02:15
  • @AndrewJens - OTOH, searching for the number of occurrences of an empty string is UNDEFINED. There is NO CORRECT answer to return in that case. Thus, the caller has supplied an invalid parameter, and throwing SOME exception is appropriate. Granted, it would be better to test for this, and throw the more informative `ArgumentException`. – ToolmakerSteve Jan 07 '23 at 00:31
  • @ToolmakerSteve All that matters is that we write robust code. This function might be called thousands of times from a function that tests values in a database field, some of which might be empty strings. The example given above would simply crash at some point. It depends on context and specification, and returning 0 for an empty string could be the required spec (in that it didn't find a specific match). – Andrew Jens Jan 08 '23 at 06:20
13
string s = "65 fght 6565 4665 hjk";
int count = 0;
foreach (Match m in Regex.Matches(s, "65"))
  count++;
LarsTech
preetham
8

Well, as of .NET 5 (and .NET Core 2.1+ / .NET Standard 2.1) we have a new iteration speed king:

"Span<T>" https://learn.microsoft.com/en-us/dotnet/api/system.span-1?view=net-5.0

and String has a built-in AsSpan() extension method that returns a ReadOnlySpan<char>:

int count = 0;
foreach( var c in source.AsSpan())
{
    if (c == '/')
        count++;
}

My tests show 62% faster than a straight foreach. I also compared to a for() loop on a Span<T>[i], as well as a few others posted here. Note that the reverse for() iteration on a String seems to run slower now than a straight foreach.

Starting test, 10000000 iterations
(base) foreach =   673 ms

fastest to slowest
foreach Span =   252 ms   62.6%
  Span [i--] =   282 ms   58.1%
  Span [i++] =   402 ms   40.3%
   for [i++] =   454 ms   32.5%
   for [i--] =   867 ms  -28.8%
     Replace =  1905 ms -183.1%
       Split =  2109 ms -213.4%
  Linq.Count =  3797 ms -464.2%

UPDATE: Dec 2021, Visual Studio 2022, .NET 5 & 6

.NET 5
Starting test, 100000000 iterations set
(base) foreach =  7658 ms
fastest to slowest
  foreach Span =   3710 ms     51.6%
    Span [i--] =   3745 ms     51.1%
    Span [i++] =   3932 ms     48.7%
     for [i++] =   4593 ms     40.0%
     for [i--] =   7042 ms      8.0%
(base) foreach =   7658 ms      0.0%
       Replace =  18641 ms   -143.4%
         Split =  21469 ms   -180.3%
          Linq =  39726 ms   -418.8%
Regex Compiled = 128422 ms -1,577.0%
         Regex = 179603 ms -2,245.3%
         
         
.NET 6
Starting test, 100000000 iterations set
(base) foreach =  7343 ms
fastest to slowest
  foreach Span =   2918 ms     60.3%
     for [i++] =   2945 ms     59.9%
    Span [i++] =   3105 ms     57.7%
    Span [i--] =   5076 ms     30.9%
(base) foreach =   7343 ms      0.0%
     for [i--] =   8645 ms    -17.7%
       Replace =  18307 ms   -149.3%
         Split =  21440 ms   -192.0%
          Linq =  39354 ms   -435.9%
Regex Compiled = 114178 ms -1,454.9%
         Regex = 186493 ms -2,439.7%

I added more loops and threw in RegEx so we can see what a disaster it is to use in a lot of iterations. I think the for(++) loop comparison may have been optimized in .NET 6 to use Span internally - since it's almost the same speed as the foreach span.
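
A comment on this answer points out that IndexOf can beat a manual loop on longer strings with few matches. Here is a sketch of how that might look on spans, for both a single character and a substring. SpanCounter is a hypothetical name, it was not part of the benchmarks above, and the ordinal comparison is an assumption.

```csharp
using System;

public static class SpanCounter
{
    // Counts occurrences of a single character by repeatedly
    // searching the not-yet-scanned remainder of the span.
    public static int Count(string source, char value)
    {
        ReadOnlySpan<char> rest = source.AsSpan();
        int count = 0;
        int index;
        while ((index = rest.IndexOf(value)) >= 0)
        {
            count++;
            rest = rest.Slice(index + 1); // no copy, just a narrower view
        }
        return count;
    }

    // Counts non-overlapping occurrences of a substring.
    public static int Count(string source, string value)
    {
        if (string.IsNullOrEmpty(value)) return 0;

        ReadOnlySpan<char> rest = source.AsSpan();
        ReadOnlySpan<char> needle = value.AsSpan();
        int count = 0;
        int index;
        while ((index = rest.IndexOf(needle, StringComparison.Ordinal)) >= 0)
        {
            count++;
            rest = rest.Slice(index + needle.Length);
        }
        return count;
    }
}
```

Neither overload allocates, since Slice only narrows the view over the original string.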

Code Link

bmiller
  • Nice! That's really cool, I almost feel this should be the new accepted answer! – Ian G Jul 28 '21 at 09:01
  • 1
    @inspite thanks for the vote, I guess this is what you get answering a 12 year old question. I came here first before finding Span, thought I'd update it. – bmiller Aug 01 '21 at 17:33
  • why on earth is the Linq method so slow? I'd be curious how this changes with long vs short strings. – Garr Godfrey Sep 16 '21 at 19:05
  • @GarrGodfrey, I wasn't 'that' shocked. I don't think Linq is designed for super tight loops of 10,000,000 iterations... In any case I left a code link if you want to test it out. – bmiller Sep 17 '21 at 15:44
  • slower than `Split` surprises me, since that creates a bunch of new strings and Linq should be just reading. Must be the function call for each character. – Garr Godfrey Sep 17 '21 at 18:45
  • I too found span to be the solution to this problem but I needed to search for substring in a string with multiple characters so I came up with an extension method https://gist.github.com/MiddleTommy/5a571dd3787837c6a83a30806907979f – MiddleTommy Dec 03 '21 at 08:47
  • If the string is a bit larger and there is a relatively small number of occurrences compared to the string length, using `IndexOf` is actually the fastest solution (yielding 4x improvements over the loop for me) because its implementation is vectorized. Here's a gist: https://gist.github.com/Neme12/dfebe11d09909f8d7bbb6463d194c2a9 – Neme Dec 12 '21 at 22:36
7
public static int GetNumSubstringOccurrences(string text, string search)
{
    int num = 0;
    int pos = 0;

    if (!string.IsNullOrEmpty(text) && !string.IsNullOrEmpty(search))
    {
        while ((pos = text.IndexOf(search, pos)) > -1)
        {
            num ++;
            pos += search.Length;
        }
    }
    return num;
}
WhoIsRich
user460847
7

For anyone wanting a ready to use String extension method,

here is what I use which was based on the best of the posted answers:

public static class StringExtension
{    
    /// <summary> Returns the number of occurrences of a string within a string; the optional comparison allows case and culture control. </summary>
    public static int Occurrences(this System.String input, string value, StringComparison stringComparisonType = StringComparison.Ordinal)
    {
        if (String.IsNullOrEmpty(value)) return 0;

        int count    = 0;
        int position = 0;

        while ((position = input.IndexOf(value, position, stringComparisonType)) != -1)
        {
            position += value.Length;
            count    += 1;
        }

        return count;
    }

    /// <summary> Returns the number of occurrences of a single character within a string. </summary>
    public static int Occurrences(this System.String input, char value)
    {
        int count = 0;
        foreach (char c in input) if (c == value) count += 1;
        return count;
    }
}
WhoIsRich
  • Won't the second method go boom if the string passed in is null or empty? Purely from a style point of view, what are you defining the input as System.String rather than just string? – Nodoid Apr 29 '19 at 11:25
6

I think the easiest way to do this is to use regular expressions. This way you can get the same split count as you would using myVar.Split('x'), but with a multi-character search term. Remember to subtract 1, because splitting always yields one more piece than there are matches.

string myVar = "do this to count the number of words in my wording so that I can word it up!";
int count = Regex.Split(myVar, "word").Length - 1; // 3
Beroc
5

As of .NET 7, we have allocation-free (and highly optimized) Regex APIs. Counting is especially easy and efficient.

    var input = "abcd abcabc ababc";
    var result = Regex.Count(input: input, pattern: "abc"); // 4

When matching dynamic patterns, remember to escape them:

public static int CountOccurences(string input, string pattern)
{
    pattern = Regex.Escape(pattern); // Aww, no way to avoid heap allocations here

    var result = Regex.Count(input: input, pattern: pattern);
    return result;
}

And, as a bonus for fixed patterns, .NET 7 introduces analyzers that help convert the regex string to source-generated code. Not only does this avoid the runtime compilation overhead for the regex, but it also provides very readable code that shows how it is implemented. In fact, that code is generally at least as efficient as any alternative you would have written manually.

If your regex call is eligible, the analyzer will give a hint. Simply choose "Convert to 'GeneratedRegexAttribute'" and enjoy the result:

[GeneratedRegex("abc")]
private static partial Regex MyRegex(); // Go To Definition to see the generated code
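
Put together, a source-generated counter might look like the sketch below. AbcCounter and AbcRegex are hypothetical names; it assumes .NET 7 or later, and the containing type has to be declared partial for the generator to fill in the method.

```csharp
using System.Text.RegularExpressions;

public static partial class AbcCounter
{
    // The source generator supplies the body of this method at compile time.
    [GeneratedRegex("abc")]
    private static partial Regex AbcRegex();

    // Counts matches of the fixed pattern using the instance Count method.
    public static int Count(string input) => AbcRegex().Count(input);
}
```

AbcCounter.Count("abcd abcabc ababc") then returns 4, the same as the static Regex.Count call above.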
Timo
4

I felt that we were lacking certain kinds of sub string counting, like unsafe byte-by-byte comparisons. I put together the original poster's method and any methods I could think of.

These are the string extensions I made.

namespace Example
{
    using System;
    using System.Text;

    public static class StringExtensions
    {
        public static int CountSubstr(this string str, string substr)
        {
            return (str.Length - str.Replace(substr, "").Length) / substr.Length;
        }

        public static int CountSubstr(this string str, char substr)
        {
            return (str.Length - str.Replace(substr.ToString(), "").Length);
        }

        public static int CountSubstr2(this string str, string substr)
        {
            int substrlen = substr.Length;
            int lastIndex = str.IndexOf(substr, 0, StringComparison.Ordinal);
            int count = 0;
            while (lastIndex != -1)
            {
                ++count;
                lastIndex = str.IndexOf(substr, lastIndex + substrlen, StringComparison.Ordinal);
            }

            return count;
        }

        public static int CountSubstr2(this string str, char substr)
        {
            int lastIndex = str.IndexOf(substr, 0);
            int count = 0;
            while (lastIndex != -1)
            {
                ++count;
                lastIndex = str.IndexOf(substr, lastIndex + 1);
            }

            return count;
        }

        public static int CountChar(this string str, char substr)
        {
            int length = str.Length;
            int count = 0;
            for (int i = 0; i < length; ++i)
                if (str[i] == substr)
                    ++count;

            return count;
        }

        public static int CountChar2(this string str, char substr)
        {
            int count = 0;
            foreach (var c in str)
                if (c == substr)
                    ++count;

            return count;
        }

        public static unsafe int CountChar3(this string str, char substr)
        {
            int length = str.Length;
            int count = 0;
            fixed (char* chars = str)
            {
                for (int i = 0; i < length; ++i)
                    if (*(chars + i) == substr)
                        ++count;
            }

            return count;
        }

        public static unsafe int CountChar4(this string str, char substr)
        {
            int length = str.Length;
            int count = 0;
            fixed (char* chars = str)
            {
                for (int i = length - 1; i >= 0; --i)
                    if (*(chars + i) == substr)
                        ++count;
            }

            return count;
        }

        public static unsafe int CountSubstr3(this string str, string substr)
        {
            int length = str.Length;
            int substrlen = substr.Length;
            int count = 0;
            fixed (char* strc = str)
            {
                fixed (char* substrc = substr)
                {
                    int n = 0;

                    for (int i = 0; i < length; ++i)
                    {
                        if (*(strc + i) == *(substrc + n))
                        {
                            ++n;
                            if (n == substrlen)
                            {
                                ++count;
                                n = 0;
                            }
                        }
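                        else
                        {
                            // Mismatch: rewind to the start of the failed partial match
                            // (the loop's ++i then moves one past it), so occurrences
                            // such as "ab" in "aab" are not missed.
                            i -= n;
                            n = 0;
                        }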
                    }
                }
            }

            return count;
        }

        public static int CountSubstr3(this string str, char substr)
        {
            return CountSubstr3(str, substr.ToString());
        }

        public static unsafe int CountSubstr4(this string str, string substr)
        {
            int length = str.Length;
            int substrLastIndex = substr.Length - 1;
            int count = 0;
            fixed (char* strc = str)
            {
                fixed (char* substrc = substr)
                {
                    int n = substrLastIndex;

                    for (int i = length - 1; i >= 0; --i)
                    {
                        if (*(strc + i) == *(substrc + n))
                        {
                            if (--n == -1)
                            {
                                ++count;
                                n = substrLastIndex;
                            }
                        }
                        else
                            // Caveat: the same reset issue applies in reverse,
                            // e.g. "and" in "andd" is not counted.
                            n = substrLastIndex;
                    }
                }
            }

            return count;
        }

        public static int CountSubstr4(this string str, char substr)
        {
            return CountSubstr4(str, substr.ToString());
        }
    }
}

Followed by the test code...

static void Main()
{
    const char matchA = '_';
    const string matchB = "and";
    const string matchC = "muchlongerword";
    const string testStrA = "_and_d_e_banna_i_o___pfasd__and_d_e_banna_i_o___pfasd_";
    const string testStrB = "and sdf and ans andeians andano ip and and sdf and ans andeians andano ip and";
    const string testStrC =
        "muchlongerword amuchlongerworsdfmuchlongerwordsdf jmuchlongerworijv muchlongerword sdmuchlongerword dsmuchlongerword";
    const int testSize = 1000000;
    Console.WriteLine(testStrA.CountSubstr('_'));
    Console.WriteLine(testStrA.CountSubstr2('_'));
    Console.WriteLine(testStrA.CountSubstr3('_'));
    Console.WriteLine(testStrA.CountSubstr4('_'));
    Console.WriteLine(testStrA.CountChar('_'));
    Console.WriteLine(testStrA.CountChar2('_'));
    Console.WriteLine(testStrA.CountChar3('_'));
    Console.WriteLine(testStrA.CountChar4('_'));
    Console.WriteLine(testStrB.CountSubstr("and"));
    Console.WriteLine(testStrB.CountSubstr2("and"));
    Console.WriteLine(testStrB.CountSubstr3("and"));
    Console.WriteLine(testStrB.CountSubstr4("and"));
    Console.WriteLine(testStrC.CountSubstr("muchlongerword"));
    Console.WriteLine(testStrC.CountSubstr2("muchlongerword"));
    Console.WriteLine(testStrC.CountSubstr3("muchlongerword"));
    Console.WriteLine(testStrC.CountSubstr4("muchlongerword"));
    var timer = new Stopwatch();
    timer.Start();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountSubstr(matchA);
    timer.Stop();
    Console.WriteLine("CS1 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrB.CountSubstr(matchB);
    timer.Stop();
    Console.WriteLine("CS1 and: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrC.CountSubstr(matchC);
    timer.Stop();
    Console.WriteLine("CS1 mlw: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountSubstr2(matchA);
    timer.Stop();
    Console.WriteLine("CS2 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrB.CountSubstr2(matchB);
    timer.Stop();
    Console.WriteLine("CS2 and: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrC.CountSubstr2(matchC);
    timer.Stop();
    Console.WriteLine("CS2 mlw: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountSubstr3(matchA);
    timer.Stop();
    Console.WriteLine("CS3 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrB.CountSubstr3(matchB);
    timer.Stop();
    Console.WriteLine("CS3 and: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrC.CountSubstr3(matchC);
    timer.Stop();
    Console.WriteLine("CS3 mlw: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountSubstr4(matchA);
    timer.Stop();
    Console.WriteLine("CS4 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrB.CountSubstr4(matchB);
    timer.Stop();
    Console.WriteLine("CS4 and: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrC.CountSubstr4(matchC);
    timer.Stop();
    Console.WriteLine("CS4 mlw: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountChar(matchA);
    timer.Stop();
    Console.WriteLine("CC1 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountChar2(matchA);
    timer.Stop();
    Console.WriteLine("CC2 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountChar3(matchA);
    timer.Stop();
    Console.WriteLine("CC3 chr: " + timer.Elapsed.TotalMilliseconds + "ms");

    timer.Restart();
    for (int i = 0; i < testSize; ++i)
        testStrA.CountChar4(matchA);
    timer.Stop();
    Console.WriteLine("CC4 chr: " + timer.Elapsed.TotalMilliseconds + "ms");
}
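
The timing blocks above all repeat the same restart/loop/stop/print pattern. A minimal sketch of folding that pattern into one helper follows; the name `Bench.TimeMs` is hypothetical and not part of the original answer, and the delegate call adds a small per-iteration overhead, so absolute numbers would differ from the ones reported below.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Hypothetical helper (not in the original answer): runs `action`
    // `iterations` times and returns the elapsed wall-clock milliseconds.
    public static double TimeMs(int iterations, Action action)
    {
        var timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
            action();
        timer.Stop();
        return timer.Elapsed.TotalMilliseconds;
    }
}
```

A call such as `Console.WriteLine("CS1 chr: " + Bench.TimeMs(testSize, () => testStrA.CountSubstr(matchA)) + "ms");` would then replace one six-line block.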

Results: CSX corresponds with CountSubstrX and CCX corresponds with CountCharX. "chr" searches a string for '_', "and" searches a string for "and", and "mlw" searches a string for "muchlongerword"

CS1 chr: 824.123ms
CS1 and: 586.1893ms
CS1 mlw: 486.5414ms
CS2 chr: 127.8941ms
CS2 and: 806.3918ms
CS2 mlw: 497.318ms
CS3 chr: 201.8896ms
CS3 and: 124.0675ms
CS3 mlw: 212.8341ms
CS4 chr: 81.5183ms
CS4 and: 92.0615ms
CS4 mlw: 116.2197ms
CC1 chr: 66.4078ms
CC2 chr: 64.0161ms
CC3 chr: 65.9013ms
CC4 chr: 65.8206ms

And finally, I had a file with 3.6 million characters. It was "derp adfderdserp dfaerpderp deasderp" repeated 100,000 times. I searched for "derp" inside the file 100 times with each of the above methods, with these results.

CS1Derp: 1501.3444ms
CS2Derp: 1585.797ms
CS3Derp: 376.0937ms
CS4Derp: 271.1663ms

So my 4th method is definitely the winner, but, realistically, if scanning a 3.6 million character file 100 times took only 1586ms in the worst case, then all of this is quite negligible.

By the way, I also scanned for the 'd' char in the 3.6 million character file 100 times with the CountSubstr and CountChar methods. Results...

CS1  d : 2606.9513ms
CS2  d : 339.7942ms
CS3  d : 960.281ms
CS4  d : 233.3442ms
CC1  d : 302.4122ms
CC2  d : 280.7719ms
CC3  d : 299.1125ms
CC4  d : 292.9365ms

According to this, the original poster's method is very slow for single-character needles in a large haystack.

Note: All values were updated to Release build output. I forgot to build in Release mode the first time I posted this. Some of my statements have been amended.

  • 1
    Thank you for performance results. A factor difference in speed of 10 might be a reason to not consider a linq or other neatly written solution but go with an extension method. – Andreas Reiff Dec 18 '19 at 09:27
3
// requires: using System.Text.RegularExpressions;
string search = "/string";
var occurrences = Regex.Matches(search, @"\/").Count;

This counts each time the program finds "/" in the string (the match is case sensitive, which only matters for patterns containing letters), and the number of occurrences is stored in the variable "occurrences".
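
If the text to count may contain regex metacharacters (for example "." or "+"), the pattern needs escaping first. A minimal sketch, assuming a hypothetical helper named `RegexCount.Count` that is not part of the original answer:

```csharp
using System.Text.RegularExpressions;

public static class RegexCount
{
    // Hypothetical helper: counts non-overlapping, case-sensitive
    // occurrences of `needle` in `source`, treating `needle` as literal text.
    public static int Count(string source, string needle)
    {
        // An empty pattern would match at every position, so report no matches.
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(needle))
            return 0;

        // Regex.Escape turns metacharacters such as '.' into literals.
        return Regex.Matches(source, Regex.Escape(needle)).Count;
    }
}
```

Without the escape, a needle of "." would match every character rather than only the dots.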

Gordon Bell
  • 13,337
  • 3
  • 45
  • 64
Adam Higgins
  • 705
  • 1
  • 10
  • 25
2
string source = "/once/upon/a/time/";
int count = 0, n = 0;
while ((n = source.IndexOf('/', n) + 1) != 0) count++;

A variation on Richard Watson's answer: slightly faster, with the advantage growing the more often the char occurs in the string, and less code!

Though I must say, without extensively testing every scenario, I did see a very significant speed improvement by using:

int count = 0;
for (int n = 0; n < source.Length; n++) if (source[n] == '/') count++;
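
The one-line `IndexOf` loop works because `IndexOf` returns -1 when nothing more is found, so adding 1 gives 0 and ends the loop; otherwise `n` becomes the index just past the match. A sketch that spells both variants out as methods (class and method names are illustrative, not from the original answer):

```csharp
public static class SlashCount
{
    // Jumps from match to match using IndexOf with a start index.
    public static int WithIndexOf(string source, char needle)
    {
        int count = 0;
        int next = 0; // index at which the next search starts
        while (true)
        {
            int found = source.IndexOf(needle, next);
            if (found == -1)
                break;        // no further match
            count++;
            next = found + 1; // continue just past the match
        }
        return count;
    }

    // Visits every character once and compares it directly.
    public static int WithLoop(string source, char needle)
    {
        int count = 0;
        for (int n = 0; n < source.Length; n++)
            if (source[n] == needle)
                count++;
        return count;
    }
}
```

Both methods return the same count; which one is faster depends on the input, so it is worth measuring with your own data.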
2
            var conditionalStatement = conditionSetting.Value;

            // Order of replace matters: remove the two-character operators (==, !=, >=, <=)
            // before the single characters, so ">=" or "===" are not split into two markers too early.
            conditionalStatement = conditionalStatement.Replace("==", "~").Replace("!=", "~").Replace(">=", "~").Replace("<=", "~").Replace('=', '~').Replace('!', '~').Replace('>', '~').Replace('<', '~');

            var listOfValidConditions = new List<string>() { "!=", "==", ">", "<", ">=", "<=" };

            if (conditionalStatement.Count(x => x == '~') != 1)
            {
                result.InvalidFieldList.Add(new KeyFieldData(batch.DECurrentField, "The IsDoubleKeyCondition does not contain a supported conditional statement. Contact System Administrator."));
                result.Status = ValidatorStatus.Fail;
                return result;
            }

Needed to do something similar to test conditional statements from a string.

Replaced what I was looking for with a single character and counted the instances of that character.

Obviously the single character you're using will need to be checked to not exist in the string before this happens to avoid incorrect counts.
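
A minimal sketch of the same replace-and-count idea with the sentinel check made explicit. It applies the two-character operators before the single-character ones; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class ConditionCheck
{
    // Hypothetical helper: counts comparison operators in `condition`
    // by replacing each one with a sentinel character and counting it.
    public static int CountOperators(string condition, char sentinel = '~')
    {
        // Guard: a sentinel already present in the input would inflate the count.
        if (condition.IndexOf(sentinel) >= 0)
            throw new ArgumentException("Sentinel already present in input.", nameof(condition));

        string s = sentinel.ToString();
        string marked = condition
            // Two-character operators first, so ">=" is not counted as '>' plus '='.
            .Replace("==", s).Replace("!=", s).Replace(">=", s).Replace("<=", s)
            .Replace('=', sentinel).Replace('!', sentinel)
            .Replace('>', sentinel).Replace('<', sentinel);

        return marked.Count(c => c == sentinel);
    }
}
```

A statement is then accepted only when `CountOperators(...)` returns exactly 1.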

bizah
  • 227
  • 1
  • 2
  • 9
2

String in string:

Find "etc" in " .. JD JD JD JD etc. and etc. JDJDJDJDJDJDJDJD and etc."

var strOrigin = " .. JD JD JD JD etc. and etc. JDJDJDJDJDJDJDJD and etc.";
var searchStr = "etc";
int count = (strOrigin.Length - strOrigin.Replace(searchStr, "").Length) / searchStr.Length;

Check performance before discarding this one as unsound/clumsy...
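
The length-difference trick can be wrapped in a small helper. This is a sketch under the assumption that non-overlapping matches are wanted; the empty-needle guard is an addition, since `string.Replace` throws on an empty search string. The names are illustrative.

```csharp
public static class ReplaceCount
{
    // Counts non-overlapping occurrences of `needle` by measuring how much
    // shorter the string gets once every occurrence is removed.
    public static int Count(string haystack, string needle)
    {
        // string.Replace throws on an empty search string, so bail out early.
        if (string.IsNullOrEmpty(haystack) || string.IsNullOrEmpty(needle))
            return 0;

        int removed = haystack.Length - haystack.Replace(needle, "").Length;
        return removed / needle.Length;
    }
}
```

For the example above, `ReplaceCount.Count(strOrigin, "etc")` gives 3.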

2

Thought I would throw my extension method into the ring (see comments for more info). I have not done any formal benchmarking, but I think it has to be very fast for most scenarios.

EDIT: OK - so this SO question got me wondering how the performance of our current implementation would stack up against some of the solutions presented here. I decided to do a little benchmarking and found that our solution was very much in line with the performance of the solution provided by Richard Watson until you do aggressive searching with large strings (100 Kb +), large substrings (32 Kb +) and many embedded repetitions (10K +). At that point our solution was around 2X to 4X slower. Given this and the fact that we really like the solution presented by Richard Watson, we have refactored our solution accordingly. I just wanted to make this available for anyone that might benefit from it.

Our original solution:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make its next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        var sChars = s.ToCharArray();
        var substringChars = substring.ToCharArray();
        var count = 0;
        var sCharsIndex = 0;

        // substring cannot start in s beyond following index
        var lastStartIndex = sChars.Length - substringChars.Length;

        while (sCharsIndex <= lastStartIndex)
        {
            if (sChars[sCharsIndex] == substringChars[0])
            {
                // potential match checking
                var match = true;
                var offset = 1;
                while (offset < substringChars.Length)
                {
                    if (sChars[sCharsIndex + offset] != substringChars[offset])
                    {
                        match = false;
                        break;
                    }
                    offset++;
                }
                if (match)
                {
                    count++;
                    // if aggressive, just advance to next char in s, otherwise, 
                    // skip past the match just found in s
                    sCharsIndex += aggressiveSearch ? 1 : substringChars.Length;
                }
                else
                {
                    // no match found, just move to next char in s
                    sCharsIndex++;
                }
            }
            else
            {
                // no match at current index, move along
                sCharsIndex++;
            }
        }

        return count;
    }

And here is our revised solution:

    /// <summary>
    /// Counts the number of occurrences of the specified substring within
    /// the current string.
    /// </summary>
    /// <param name="s">The current string.</param>
    /// <param name="substring">The substring we are searching for.</param>
    /// <param name="aggressiveSearch">Indicates whether or not the algorithm 
    /// should be aggressive in its search behavior (see Remarks). Default 
    /// behavior is non-aggressive.</param>
    /// <remarks>This algorithm has two search modes - aggressive and 
    /// non-aggressive. When in aggressive search mode (aggressiveSearch = 
    /// true), the algorithm will try to match at every possible starting 
    /// character index within the string. When false, all subsequent 
    /// character indexes within a substring match will not be evaluated. 
    /// For example, if the string was 'abbbc' and we were searching for 
    /// the substring 'bb', then aggressive search would find 2 matches 
    /// with starting indexes of 1 and 2. Non aggressive search would find 
    /// just 1 match with starting index at 1. After the match was made, 
    /// the non aggressive search would attempt to make its next match 
    /// starting at index 3 instead of 2.</remarks>
    /// <returns>The count of occurrences of the substring within the string.</returns>
    public static int CountOccurrences(this string s, string substring, 
        bool aggressiveSearch = false)
    {
        // if s or substring is null or empty, substring cannot be found in s
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        // if the length of substring is greater than the length of s,
        // substring cannot be found in s
        if (substring.Length > s.Length)
            return 0;

        int count = 0, n = 0;
        while ((n = s.IndexOf(substring, n, StringComparison.InvariantCulture)) != -1)
        {
            if (aggressiveSearch)
                n++;
            else
                n += substring.Length;
            count++;
        }

        return count;
    }
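
To make the two search modes concrete, here is a small self-contained sketch of the same loop. It uses `StringComparison.Ordinal` rather than the culture-sensitive comparison above, which is an assumption of this sketch, and the class and method names are illustrative.

```csharp
using System;

public static class OverlapDemo
{
    // Counts matches of `substring` in `s`. When `overlapping` is true, the
    // search resumes one character after each match start ("aggressive" mode).
    public static int Count(string s, string substring, bool overlapping)
    {
        if (string.IsNullOrEmpty(s) || string.IsNullOrEmpty(substring))
            return 0;

        int count = 0, n = 0;
        while ((n = s.IndexOf(substring, n, StringComparison.Ordinal)) != -1)
        {
            n += overlapping ? 1 : substring.Length;
            count++;
        }
        return count;
    }
}
```

With the 'abbbc' / 'bb' example from the remarks, the overlapping mode finds the matches starting at indexes 1 and 2, while the non-overlapping mode finds only the one at index 1.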
John Doe
  • 165
  • 1
  • 4
  • 15
Casey Chester
  • 268
  • 3
  • 11
2

My initial take gave me something like:

public static int CountOccurrences(string original, string substring)
{
    if (string.IsNullOrEmpty(substring))
        return 0;
    if (substring.Length == 1)
        return CountOccurrences(original, substring[0]);
    if (string.IsNullOrEmpty(original) ||
        substring.Length > original.Length)
        return 0;
    int substringCount = 0;
    for (int charIndex = 0; charIndex < original.Length; charIndex++)
    {
        for (int subCharIndex = 0, secondaryCharIndex = charIndex; subCharIndex < substring.Length && secondaryCharIndex < original.Length; subCharIndex++, secondaryCharIndex++)
        {
            if (substring[subCharIndex] != original[secondaryCharIndex])
                goto continueOuter;
        }
        if (charIndex + substring.Length > original.Length)
            break;
        charIndex += substring.Length - 1;
        substringCount++;
    continueOuter:
        ;
    }
    return substringCount;
}

public static int CountOccurrences(string original, char @char)
{
    if (string.IsNullOrEmpty(original))
        return 0;
    int substringCount = 0;
    for (int charIndex = 0; charIndex < original.Length; charIndex++)
        if (@char == original[charIndex])
            substringCount++;
    return substringCount;
}

The needle in a haystack approach using replace and division takes 21+ seconds, whereas this takes about 15.2 seconds.

Edit: after adding the step that advances charIndex by substring.Length - 1 after a match (like it should), it's at 11.6 seconds.

Edit 2: I used a string which had 26 two-character strings, here are the times updated to the same sample texts:

Needle in a haystack (OP's version): 7.8 Seconds

Suggested mechanism: 4.6 seconds.

Edit 3: Adding the single character corner-case, it went to 1.2 seconds.

Edit 4: For context: 50 million iterations were used.

2

Split may win over IndexOf (for strings).

The benchmark above seems to indicate that Richard Watson's method is the fastest for strings, which I think is wrong (maybe the difference comes from our test data, but it seems strange anyway for the reasons below).

If we look a bit deeper into the implementation of these methods in .NET (for the Luke H and Richard Watson methods):

  • IndexOf is culture-dependent: it will try to retrieve/create a ReadOnlySpan, check whether it has to ignore case, etc., and only then make the unsafe/native call.
  • Split is able to handle several separators and has some StringSplitOptions; it has to create the string[] array and fill it with the split results (so it does some substring work). Depending on the number of occurrences of the string, Split may be faster than IndexOf.

By the way, I made a simplified version of IndexOf (which could be faster if I used pointers and unsafe code, but unchecked should be OK for most cases) which is faster by at least a factor of 4.
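
For reference, the two approaches being compared can be sketched as follows. Passing `StringComparison.Ordinal` to `IndexOf` is an addition in this sketch (the answers discussed do not necessarily do so), and the names are illustrative.

```csharp
using System;

public static class SplitVsIndexOf
{
    // Split-based count: the number of pieces minus one.
    public static int WithSplit(string input, string needle)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(needle))
            return 0;
        return input.Split(new[] { needle }, StringSplitOptions.None).Length - 1;
    }

    // IndexOf-based count with an explicit ordinal comparison, which avoids
    // the culture-sensitive path described above.
    public static int WithOrdinalIndexOf(string input, string needle)
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(needle))
            return 0;

        int count = 0, n = 0;
        while ((n = input.IndexOf(needle, n, StringComparison.Ordinal)) != -1)
        {
            n += needle.Length; // skip past the match (non-overlapping)
            count++;
        }
        return count;
    }
}
```

Both count non-overlapping matches, so they should agree on any input; only their cost profiles differ.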

Benchmark (source on GitHub)

Done by searching either a common word (the) or a small sentence in Shakespeare Richard III.

| Method | Mean | Error | StdDev | Ratio |
|---|---:|---:|---:|---:|
| Richard_LongInLong | 67.721 us | 1.0278 us | 0.9614 us | 1.00 |
| Luke_LongInLong | 1.960 us | 0.0381 us | 0.0637 us | 0.03 |
| Fab_LongInLong | 1.198 us | 0.0160 us | 0.0142 us | 0.02 |
| | | | | |
| Richard_ShortInLong | 104.771 us | 2.8117 us | 7.9304 us | 1.00 |
| Luke_ShortInLong | 2.971 us | 0.0594 us | 0.0813 us | 0.03 |
| Fab_ShortInLong | 2.206 us | 0.0419 us | 0.0411 us | 0.02 |
| | | | | |
| Richard_ShortInShort | 115.53 ns | 1.359 ns | 1.135 ns | 1.00 |
| Luke_ShortInShort | 52.46 ns | 0.970 ns | 0.908 ns | 0.45 |
| Fab_ShortInShort | 28.47 ns | 0.552 ns | 0.542 ns | 0.25 |

public int GetOccurrences(string input, string needle)
{
    int count = 0;
    unchecked
    {
        if (string.IsNullOrEmpty(input) || string.IsNullOrEmpty(needle))
        {
            return 0;
        }

        for (var i = 0; i < input.Length - needle.Length + 1; i++)
        {
            var c = input[i];
            if (c == needle[0])
            {
                for (var index = 0; index < needle.Length; index++)
                {
                    c = input[i + index];
                    var n = needle[index];

                    if (c != n)
                    {
                        break;
                    }
                    else if (index == needle.Length - 1)
                    {
                        count++;
                    }
                }
            }
        }
    }

    return count;
}
Fab
  • 14,327
  • 5
  • 49
  • 68
2

A generic function for occurrences of strings:

public int getNumberOfOccurencies(String inputString, String checkString)
{
    if (checkString.Length > inputString.Length || checkString.Equals("")) { return 0; }
    // Last index at which checkString can still start inside inputString.
    int lengthDifference = inputString.Length - checkString.Length;
    int occurencies = 0;
    // "<=" so that a match ending at the very end of inputString is counted.
    for (int i = 0; i <= lengthDifference; i++) {
        if (inputString.Substring(i, checkString.Length).Equals(checkString)) { occurencies++; i += checkString.Length - 1; } }
    return occurencies;
}
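
A variant of the function above that avoids creating a new string for every position, sketched with `string.CompareOrdinal` comparing in place. The class and method names are hypothetical, and the ordinal comparison is an assumption of this sketch.

```csharp
public static class OrdinalCount
{
    // Hypothetical variant: compares in place with string.CompareOrdinal
    // instead of allocating a Substring for every position.
    public static int Count(string inputString, string checkString)
    {
        if (string.IsNullOrEmpty(inputString) || string.IsNullOrEmpty(checkString)
            || checkString.Length > inputString.Length)
            return 0;

        int occurrences = 0;
        int lastStart = inputString.Length - checkString.Length;
        for (int i = 0; i <= lastStart; i++)
        {
            if (string.CompareOrdinal(inputString, i, checkString, 0, checkString.Length) == 0)
            {
                occurrences++;
                i += checkString.Length - 1; // skip the rest of this match
            }
        }
        return occurrences;
    }
}
```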
Stefanos Kargas
  • 10,547
  • 22
  • 76
  • 101
  • 2
    This creates a HUGE number of temporary strings and makes the garbage collector work very hard. – EricLaw Jun 29 '15 at 16:42
1
string Name = "Very good nice one is very good but is very good nice one this is called the term";
bool valid=true;
int count = 0;
int k=0;
int m = 0;
while (valid)
{
    k = Name.Substring(m,Name.Length-m).IndexOf("good");
    if (k != -1)
    {
        count++;
        m = m + k + 4;
    }
    else
        valid = false;
}
Console.WriteLine(count + " occurrences");
Gilles 'SO- stop being evil'
  • 104,111
  • 38
  • 209
  • 254
Prashanth
  • 11
  • 1
1

If you check out this webpage, 15 different ways of doing this are benchmarked, including using parallel loops.

The fastest way appears to be using either a single-threaded for-loop (if you have a .NET version < 4.0) or a Parallel.For loop (if using .NET 4.0 or later with thousands of checks).

Assuming "ss" is your array of search strings and "ch" is your character array (if you have more than one char you're looking for), here's the basic gist of the code that had the fastest single-threaded run time:

for (int x = 0; x < ss.Length; x++)
{
    for (int y = 0; y < ch.Length; y++)
    {
        for (int a = 0; a < ss[x].Length; a++)
        {
            if (ss[x][a] == ch[y])
            {
                // It's found. Do what you need to here.
            }
        }
    }
}

The benchmark source code is provided too so you can run your own tests.
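
As a runnable illustration of that loop shape, here is a sketch that simply totals the matches. The `MultiCount` name is hypothetical, and this is not the benchmark's own code.

```csharp
public static class MultiCount
{
    // Counts how many characters across all strings in `ss` equal any
    // of the characters in `ch`, using plain nested for-loops.
    public static int Count(string[] ss, char[] ch)
    {
        int found = 0;
        for (int x = 0; x < ss.Length; x++)
            for (int y = 0; y < ch.Length; y++)
                for (int a = 0; a < ss[x].Length; a++)
                    if (ss[x][a] == ch[y])
                        found++;
        return found;
    }
}
```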

1

To count a char or string:

string st = "asdfasdfasdfsadfasdf/asdfasdfas/dfsdfsdafsdfsd/fsadfasdf/dff";
int count = 0;
// Start at -1 so the first search begins at index 0.
int location = -1;

// IndexOf returns -1 when no further match exists.
while ((location = st.IndexOf("/", location + 1)) >= 0)
{
    count++;
}
MessageBox.Show(count.ToString());
Ehsan KHAN
  • 141
  • 1
  • 6
1
string str = "aaabbbbjjja";
int count = 0;
int size = str.Length;

string[] strarray = new string[size];
for (int i = 0; i < str.Length; i++)
{
    strarray[i] = str.Substring(i, 1);
}
Array.Sort(strarray);
str = "";
for (int i = 0; i < strarray.Length - 1; i++)
{

    if (strarray[i] == strarray[i + 1])
    {

        count++;
    }
    else
    {
        count++;
        str = str + strarray[i] + count;
        count = 0;
    }

}
count++;
str = str + strarray[strarray.Length - 1] + count;

This counts the occurrences of each character. For this example, the output will be "a4b4j3".

slavoo
  • 5,798
  • 64
  • 37
  • 39
  • 2
    Not quite 'counting occurrences of a string' more counting characters - how about a way of specifying what the string to match was Narenda? – Paul Sullivan Dec 09 '11 at 13:51
  • 1
    int count = 0; string str = "we have foo and foo please count foo in this"; string stroccurance="foo"; string[] strarray = str.Split(' '); Array.Sort(strarray); str = ""; for (int i = 0; i < strarray.Length - 1; i++) { if (strarray[i] == stroccurance) { count++; } } str = "Number of occurenance for " +stroccurance + " is " + count; Through this you can count any string occurance in this example I am counting the occurance of "foo" and it will give me the output 3. – Narendra Kumar Dec 15 '11 at 07:10
0
string s = "HOWLYH THIS ACTUALLY WORKSH WOWH";
int count = 0;
for (int i = 0; i < s.Length; i++)
   if (s[i] == 'H') count++;

It just checks every character in the string; if the character is the one you are searching for, it adds one to count.

joppiesaus
  • 5,471
  • 3
  • 26
  • 36
0

For the case of a string delimiter (not for the char case, as the subject says):
string source = "@@@once@@@upon@@@a@@@time@@@";
// StringSplitOptions.None keeps the empty leading and trailing entries,
// so Length - 1 equals the number of delimiters (5 here).
int count = source.Split(new[] { "@@@" }, StringSplitOptions.None).Length - 1;

The natural delimiter of the poster's original source value ("/once/upon/a/time/") is the char '/', and other responses already explain the source.Split(char[]) option...
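
The choice of `StringSplitOptions` changes the result whenever delimiters sit at the start or end of the string, or next to each other. A small sketch that makes the option a parameter so the difference is visible (the helper name is illustrative):

```csharp
using System;

public static class DelimiterCount
{
    // Counts occurrences of a string delimiter as "pieces minus one".
    // Only StringSplitOptions.None keeps every piece, including empty ones.
    public static int Count(string source, string delimiter, StringSplitOptions options)
    {
        return source.Split(new[] { delimiter }, options).Length - 1;
    }
}
```

For "@@@once@@@upon@@@a@@@time@@@", `StringSplitOptions.None` yields 5, whereas `StringSplitOptions.RemoveEmptyEntries` drops the empty first and last pieces and yields 3.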

Sam Saarian
  • 992
  • 10
  • 13
0

Counting chars is quite different from counting strings. It also depends on whether you want to check for more than one char. If you want the counts of a variety of different chars, something like this can work:

var charCounts =
   haystack
   .GroupBy(c => c)
   .ToDictionary(g => g.Key, g => g.Count());

var needleCount = charCounts.ContainsKey(needle) ? charCounts[needle] : 0;

Note 1: grouping into a dictionary is useful enough that it makes a lot of sense to write a GroupToDictionary extension method for it.

Note 2: it can also be useful to have your own implementation of a dictionary that allows for default values and then you could get 0 for non-existent keys automatically.
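
The two notes could be sketched as a pair of extension methods. The names `ToCharCounts` and `CountOf` are hypothetical, and `CountOf` stands in for a dictionary with default values by returning 0 for missing keys.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CharCountExtensions
{
    // Hypothetical extension: groups equal chars and stores each group's size.
    public static Dictionary<char, int> ToCharCounts(this string haystack)
    {
        return haystack.GroupBy(c => c).ToDictionary(g => g.Key, g => g.Count());
    }

    // Returns the stored count, or 0 when the char never occurred.
    public static int CountOf(this Dictionary<char, int> counts, char needle)
    {
        return counts.TryGetValue(needle, out int count) ? count : 0;
    }
}
```

Building the dictionary once pays off only when several different chars are looked up against the same haystack.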

Dave Cousineau
  • 12,154
  • 8
  • 64
  • 80
0

// requires: using System.Collections.Generic; using System.Linq;
string charters = "aabcbd";

char[] chars = charters.ToCharArray();
List<char> lst = new List<char>();

for (int i = 0; i < chars.Length; i++)
{
    // Report each distinct character only once.
    if (!lst.Contains(chars[i]))
    {
        int count = charters.Count(x => x == chars[i]);
        lst.Add(chars[i]);
        Console.WriteLine($"{chars[i]} occurs {count} times");
    }
}

Output will be as below:
a occurs 2 times
b occurs 2 times
c occurs 1 times
d occurs 1 times

Dhiraj Ghode
  • 81
  • 1
  • 3