0

This is a slightly more complicated issue than a simple =COUNTIF(rng,"*"&value&"*"), as found here.

I have a 2D array with cells containing data such as:

  • abc
  • def
  • abc def
  • ghi
  • abc,def,ghi
  • abcdef
  • ghi; def

..... and several other variations of this. I am trying to count exact matches of "abc", but I want the count to be inclusive of cells containing "abc def" and other like variations, however I can't just use the above simple COUNTIF formula since "abcdef" is not an acceptable match. The target string must stand alone or be separated from other text by an acceptable character in chars.

I think I've got this one 90% done, but the bit I need help with is combining all the possible acceptable variations of a target "name" into a flat range that I can then check my data source against for the COUNTIF. I've tried INDEX(r_1:r_8,idxRow,idxCol) and other familiar solutions that work on the sheet when referencing other ranges, but I'm new to using the =LET function. All of this works well when broken out into separate components on my spreadsheet, but I'm looking for a cleaner solution with =LET. See below for current formula:

=LET(rg, DataTable[[Q14_1]:[Q14_9]],
   name, AU38,
   chars, {" ",",",";"},
   r, 8,
   r_1, CONCATENATE(name,chars),
   r_2, CONCATENATE(chars,name),
   r_3, CONCATENATE(chars,name,chars),
   r_4, CONCATENATE(name,chars,"*"),
   r_5, CONCATENATE("*",chars,name),
   r_6, CONCATENATE(chars,name,chars,"*"),
   r_7, CONCATENATE("*",chars,name,chars),
   r_8, CONCATENATE("*",chars,name,chars,"*"),
   c, COUNTA(chars),
   mSeq, SEQUENCE(r*c),
   idxRow, 1+MOD(mSeq,r),
   idxCol, INT((SEQUENCE(r*c)-1)/r)+1,
   X, INDEX(**NeedHelpHere**,idxRow,idxCol),
   SUM(COUNTIF(rg,name),COUNTIF(rg,X))
)
dholt
  • 33
  • 7
  • So do a countif() for “*abc*” then do a countif() for “abcdef” and subtract from the total. – Solar Mike Jul 30 '21 at 22:06
  • @Solar Mike Unfortunately the data is less black and white than that. My example “abcdef” could be expanded to ANY combination of characters as long as the string a-b-c is consecutive - which would still not be acceptable in this scenario. There are too many variations of the “not target” condition. – dholt Jul 30 '21 at 22:42

2 Answers2

1

Give a try on below formula. If you have more delimiter like space, comma & others then you need to use more SUBSTITUTE() function.

=LET(x,FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(A1:A7," ","</s><s>"),",","</s><s>")&"</s></t>","//s"),y,FILTER(x,x="abc"),SUM(--(y<>"")))

enter image description here

To learn about FILTERXML() please read this article from JvdV.

Harun24hr
  • 30,391
  • 4
  • 21
  • 36
  • Thanks for this answer. While this does seem to work for a single column of values, in reality my data span several rows and columns, which it seems this formula cannot handle. Can you share an edit that would search over multiple rows and columns and removes/ignores blanks? – dholt Aug 02 '21 at 11:52
  • I think you may need textjoin() then with Filterxml(). Can you post data with multiple column. – Harun24hr Aug 02 '21 at 14:46
  • 1
    Thanks for your help - I've posted my own solution to this. – dholt Aug 02 '21 at 15:14
0

I've thought about this again and am posting a solution that fits my needs.

I don't need to index a single column of potential matches to then COUNTIF, I can just COUNTIF multiple times. Additionally, I was not taking into account different combinations of chars, I was only searching for the same chars on either side of the target (e.g. ",abc," when I should have also been looking for ",abc;"). Transposing the chars array on one side is a simple way of fixing this. It also turns out that "*"&target&"*" searches for "*target*" AND "target" (duh!), so I simplified further, removing duplicative possibilities.

My final formula is below, which counts the number of times target (by itself or surrounded by any acceptable chars) is present in a given rng:

=LET(rng, DataTable[Q14_1]:[Q14_9]]),
  name, $A6,
  chars, {" " , "," , ";"},
  r_1, CONCATENATE(name,chars,"*"),
  r_2, CONCATENATE("*",chars,name),
  r_3, CONCATENATE("*",chars,name,TRANSPOSE(chars),"*"),
  SUM(COUNTIF(rng,name),COUNTIF(rng,r_1),COUNTIF(rng,r_2),COUNTIF(rng,r_3))
  )
dholt
  • 33
  • 7