22

I'm using Delphi 2007 and wonder if there is a simple way of counting the number of times a string occurs in another string. Any builtin function I can use?

Examples:

  • "How" occurs once in the string "How are you?"
  • "do" occurs twice in the string "How do you do?"
RRUZ
  • 134,889
  • 20
  • 356
  • 483
Jonas
  • 4,107
  • 3
  • 21
  • 25

6 Answers6

40
function Occurrences(const Substring, Text: string): integer;
var
  offset: integer;
begin
  result := 0;
  offset := PosEx(Substring, Text, 1);
  while offset <> 0 do
  begin
    inc(result);
    offset := PosEx(Substring, Text, offset + length(Substring));
  end;
end;
Andreas Rejbrand
  • 105,602
  • 8
  • 282
  • 384
  • 2
    2nd PosEx could be written " offset := PosEx(SubString, Text, offset + length(SubString)); " if you don't care about recurrent substrings. ;) – Arnaud Bouchez Mar 10 '11 at 20:27
  • 1
    @A.Bouchez: Yes, that is very true. I would even say that you should use the actual length of the substring *especially* if you care about (pathological) input like `Occurrences('ddd', 'dddddddd')`. I have changed that. Of course, for performance it is a good idea to save `len := length(Substring)` prior to the loop (or is the compiler smart enough to do this optimisation by itself?). – Andreas Rejbrand Mar 10 '11 at 20:34
  • 2
    +1 I'd probably use a var and store the length of SubString in it to avoid the repeated calls to Length, but since Length really only reads a negative offset from @SubString, it probably isn't much of a performance hit anyway. :) – Ken White Mar 10 '11 at 20:38
  • @Ken White: Yes, compared to the workings of `PosEx`, reading a single integer should be rather negligible. – Andreas Rejbrand Mar 10 '11 at 20:39
11

One of the most clever ways I've ever seen to do this:

{ Returns a count of the number of occurences of SubText in Text }
function CountOccurences( const SubText: string;
                          const Text: string): Integer;
begin
  if (SubText = '') OR (Text = '') OR (Pos(SubText, Text) = 0) then
    Result := 0
  else
    Result := (Length(Text) - Length(StringReplace(Text, SubText, '', [rfReplaceAll]))) div  Length(subtext);
end;  { CountOccurences }
RobertFrank
  • 7,332
  • 11
  • 53
  • 99
  • Yes, this is a very clever way. Actually, RRUZ posted this method as an answer yesterday, but then, for some reason, he deleted it. The comment I gave to his question was *Well, this is a "special edition", indeed. Performance-wise, however, it is far from ideal, I am afraid...* To be honest, I think there is no reason to use this method when there are far more performant ones that aren't more complecated. Still, I would've given you a +1 if I hadn't felt bad for RRUZ by doing so... – Andreas Rejbrand Mar 11 '11 at 14:25
  • Well, I give you a +1 anyway. Since you "only" got 1683 in rep., you cannot see deleted posts. But if RRUZ undeletes his question, I promise I will give him a +1 as well. – Andreas Rejbrand Mar 11 '11 at 14:31
  • 4
    Clever, but slow, and wastes memory. – Warren P Mar 11 '11 at 21:32
  • He's at 7k rep since @AndreasRejbrand's blessing. – user30478 Dec 18 '22 at 18:10
4

If you find yourself frequently searching occurences in a large body of text and performance becomes an issue, you could try the Boyer-Moore search algorithm.

the worst-case to find all occurrences in a text needs approximately 3n comparisons

An implementation in Delphi can be found at our very own SO here

I need three fast-on-large-strings functions: fast search, fast search and replace, and fast count of substrings in a string.

Community
  • 1
  • 1
Lieven Keersmaekers
  • 57,207
  • 13
  • 112
  • 146
2

Newer Delphi versions have a CountChar string helper, counting occurrences of a single character in a string:

var
  MyText: String;
begin
  MyText := 'How are you?';
  ShowMessage(MyText.CountChar('o').ToString);
end;
Anse
  • 1,573
  • 12
  • 27
  • 2
    The question does not ask about counting characters, it asks about counting occurrences of a string within another string. You may want to read it again, paying attention to the examples given. – Ken White Apr 14 '22 at 04:17
  • Yes I have read the question and I am aware it asked for a string. But a single character is quite similar to a string, isn't it? – Anse Apr 15 '22 at 06:31
  • No, obviously it's not. There's a big difference between looking for an `o` in a string and the word `jello` or `roam`. `CountChar` counts the number of times a single character appears, and does not work with more than one character. Therefore, your answer is simply and absolutely wrong. – Ken White Apr 15 '22 at 13:38
  • It's **not an answer**, because it does not answer the question asked. It's just some meaningless code that you've put in an answer's space. Counting **words** is not the same as counting **characters**. Try it yourself: Use your code to count the number of times the word *lion* appears in the sentence *The mighty lion roared loudly.* Your **wrong** code would return 3 (the number of `o` characters), while the correct code would return `1` (he number of times *lion* appears in the sentence. The downvote is warranted, because your answer is totally wrong. – Ken White Apr 16 '22 at 18:50
  • 1
    You seem to misunderstand how SO works. What is put in the spot that is headed **Your Answer** is **only** a direct answer to the question asked, which you have not provided. It's not a spot to just dump some code that does something totally different than what was asked. – Ken White Apr 16 '22 at 18:51
  • 1
    Special case string with a single letter: this is fast. – Giorgio Calzolato May 16 '22 at 09:27
  • @GiorgioCalzolato that's some fast reasoning in SO wars you mean (code is faster too). – user30478 Dec 18 '22 at 18:14
1
function stringcount(pBefore: String; pSubstring: String; pFlags: TReplaceFlags): Integer;
begin
  result:= round((pBefore.Length - stringreplace(pBefore, pSubstring, '', pFlags).Length) / pSubstring.Length);
end;
Johan
  • 3,667
  • 6
  • 20
  • 25
guest
  • 11
  • 1
0

uses
  StrUtils;    

function Occurrences(const Substring, Text: string;
  const ignoreUppercase: Boolean = false): Integer;
var
  inSubstring, inText: string;
  inPos: Integer;
begin
  Result:= 0;

  if (Substring = '') or (Text = '') then
    Exit;

  if ignoreUppercase then
  begin
    inSubstring:= AnsiLowerCase(Substring);
    inText:=  AnsiLowerCase(Text);
  end
  else
  begin
    inSubstring:= Substring;
    inText:=  Text;
  end;

  inPos:= 1;

  repeat
    inPos:= posEx(inSubstring, inText, inPos);
    if inPos > 0 then
    begin
      Inc(Result);
      inPos:= inPos + Length(inSubstring);
    end;
  until inPos = 0;
end;

GoodMan
  • 9
  • 2
  • 1
    Please include some description, or explanation of your code rather than only answering with a code snippet. – davidcondrey Jul 15 '14 at 21:22
  • 1
    This looks like a variation on the accepted answer. Could you explain why this is different, and what the benefits are of doing it this way? – andrewsi Jul 15 '14 at 21:39
  • Using loop "repeat" in this case is more elegant, because the search must be held at least once. – GoodMan Jul 28 '14 at 17:22