0

So given the following sentence in one cell: "Hi my name is Bernhard. I am thirty five years old and I prefer old televisions." I'd like to have formula that return the number of all words that have more than six letters in them. So in case for this sentence it would return: 4 (Bernhard, thirty, prefer, televisions) I found the substitute function and its following implementation which is:

=len(A1)-len(SUBSTITUDE(A1;".";))

Which returns the number of all dots in the sentence. I've tried the following

=len(A1)-len(SUBSTITUDE(A1;??????;))

But that doesn*t work.

Thanks a lot for your help in advance!

JvdV
  • 70,606
  • 8
  • 39
  • 70
Difio
  • 145
  • 8

2 Answers2

1

If you have the newest version of Excel you can use FILTERXML.

For example:

=FILTERXML("<t><s>"&SUBSTITUTE(H1, " ", "</s><s>")&"</s></t>", 
"//s[string-length()>5]")

Would get you the actual words that are longer than 5 characters. This will also take into account punctuation though, so be careful. If you want to be really picky, you can replace all punctuation in the string with a blank before using the function.

Then you can wrap this in a COUNTA to get the count.

=COUNTA(
FILTERXML("<t><s>"&SUBSTITUTE(H1, " ", "</s><s>")&"</s></t>", 
"//s[string-length()>5]"))

enter image description here

For more info on the FILTERXML function, see here.

EDS
  • 2,155
  • 1
  • 6
  • 21
1

For those versions that do not have access to FILTERXML:

=SUMPRODUCT(--(LEN(TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",999)),(ROW($ZZ$1:INDEX($ZZ:$ZZ,LEN(A1)-LEN(SUBSTITUTE(A1," ",""))+1))-1)*999+1,999)))>5))

enter image description here

Scott Craner
  • 148,073
  • 10
  • 49
  • 81
  • Hi Scott! Thanks a lot for your solution and please excuse the late reply. I had a bit of trouble getting this to work as this formula would just throw a generic error on my mac until I realized that I had to change all the "," to ";" for it to work. Additionally this formula in its current state counts all punctuation as part of the word. Meaning "Peter." counts as a letter word. I solved this by substituting all punctuation like this `SUBSTITUDE(SUBSTITUDE(SUBSTITUDE(SUBSTITUDE(A1;".";);"!";);"?";);":";)` After that your code worked lovely! Thanks again! – Difio Aug 23 '21 at 18:00