1
 <br>1<br>text <br>2<br>text <br>3<br>
 <br>4<br>
 <br>5<br>text <br>6<br>text <br>7<br>text <br>8<br>

What is the formula to count the number of times <br>*<br> appears in this table A1:A3? Keep in mind there is a space in the front of each cell.

The total count should be 8. =COUNTIF(A1:A3," <br>*<br>") only returns 3 however.

Edit: Have made data simpler so it is easier to read

asd213e1
  • 43
  • 5

3 Answers3

1

This Microsoft documentation link can help you out with multiple variations of what you're trying to achieve - https://learn.microsoft.com/en-us/office/troubleshoot/excel/formulas-to-count-occurrences-in-excel

To count the occurrences of a substring in a range of cells, you can use this:

SUM(LEN(range)-LEN(SUBSTITUTE(range,"text","")))/LEN("text")

If the above does not work for you you can try this:

enter image description here

Numbers:

enter image description here

To count the occurrences of a substring in one cell, a combination of formulas in excel will do the trick.

Short answer (just plug in your specific text or cell references into this):

=(LEN("original_text")-LEN(SUBSTITUTE("original_text","text_to_count","")))/LEN("text_to_count")

Long answer if you're interested:

  1. SUBSTITUTE(x, y, z) - takes the original text (x) and the text you want to find (y) and replaces all occurrences of y in x with z and returns the string. In your case (I've chosen a smaller string for brevity):

=SUBSTITUTE("123331<br>asdsdas <br>","<br>","")

This will substitute all occurrences of
with an empty string in your original string - 123331asdsdas

  1. LEN(x) - simply returns length of the x. In your case:

=LEN(SUBSTITUTE("123331<br>asdsdas <br>","<br>",""))

This will return 14 which is the length of 123331asdsdas.

  1. =LEN("123331<br>asdsdas <br>") which gives you 22.

  2. =LEN("123331<br>asdsdas <br>") - LEN(SUBSTITUTE("123331<br>asdsdas <br>","<br>","")) which subtracts length of the substituted string from length of the original string giving you 8.

  3. (LEN("123331<br>asdsdas <br>") - LEN(SUBSTITUTE("123331<br>asdsdas <br>","<br>","")))/LEN("<br>") divides the above answer from the subtraction by the length of your original string, giving you 2.

This is the number of times
occurred in one cell.

  1. In plain English, we did (length_of_original_string - length_of_substituted_string)/length_of_string_to_find which translates to (22-14)/4 = 2 in your case.
jaimish11
  • 536
  • 4
  • 15
  • It seems LEN(range) isn't working on Excel for Mac OS atleast. Unsure about windows. If it doesn't work for you you can calculate the lengths individually and then sum them up. That'll work. – jaimish11 Jul 18 '20 at 17:17
  • What I am trying to find, is the text in between the first
    with a space in front and up until/including the second
    . If you highlight the first formula you will see that there are spaces in front of the data in each cell, and spaces in front of some of the
    strings inside the cell. The number groups in the original question are the total, so 8. Your answer seems to ignore the spaces
    – asd213e1 Jul 18 '20 at 17:48
  • I imagine you would need a **mid** function for that. – David Wooley - AST Jul 18 '20 at 17:54
  • My bad, I didn't understand your question right. You need the occurrences of space
    num
    If you do know VBA, you can easily do this using Regular Expressions.
    – jaimish11 Jul 18 '20 at 18:14
  • This might be helpful if you're willing to go the VBA route - https://stackoverflow.com/questions/22542834/how-to-use-regular-expressions-regex-in-microsoft-excel-both-in-cell-and-loops – jaimish11 Jul 18 '20 at 18:31
  • 1
    I believe the your first formula always works in this case: `=SUMPRODUCT(INT((LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3,"
    ","")))/LEN("
    ")/2))`
    – Dang D. Khanh Jul 18 '20 at 20:26
1

This seems like XML/HTML so why not parse it through FILTERXML with a XPATH construct to count values that do not contain a space and are not empty. For example with your data in A1:

=COUNTA(FILTERXML("<t><br>"&SUBSTITUTE(CONCAT(A1:A3),"<br>","</br><br>")&"</br></t>","//br[not(contains(., ' '))][node()]"))

Or when your valid data could hold spaces we would need to filter out the ones that end with a space (and the empty ones again):

=COUNTA(FILTERXML("<t><br>"&SUBSTITUTE(CONCAT(A1:A3),"<br>","</br><br>")&"</br></t>","//br[not(substring(., string-length(.), 1)=' ')][node()]"))

If however what you are after are the numeric nodes only to count then use:

=COUNT(FILTERXML("<t><br>"&SUBSTITUTE(CONCAT(A1:A3),"<br>","</br><br>")&"</br></t>","//br[.*0=0]"))

The cool thing is FILTERXML will actually return an array for further analyses, in your case you can count the elements in the array.

JvdV
  • 70,606
  • 8
  • 39
  • 70
1

Or,

In B1 enter formula :

=SUMPRODUCT(LEN(A1:A3)-LEN(SUBSTITUTE(A1:A3," ","")))

enter image description here

bosco_yip
  • 3,762
  • 2
  • 5
  • 10