12

How can I get the number of occurrences for some range based on

  1. A regular expression

  2. 2+ conditions; let's say cells that contain "yes" and / or "no"

What I've got for the moment:

COUNTIF(B5:O5; "*yes*")

I tried to use COUNTIF(B5:O5; {"*yes*", "*no*"}) or COUNTIF(B5:O5; "(*yes*)|(*no*)"), but neither of them worked.

Or, how do I count cells that contain some domain names—yahoo.com, hotmail.com, and gmail.com—using regexp? e.g.:

(\W|^)[\w.+\-]{0,25}@(yahoo|hotmail|gmail)\.com(\W|$)
Michael
  • 8,362
  • 6
  • 61
  • 88
NGix
  • 2,542
  • 8
  • 28
  • 39

5 Answers5

15

The most pedestrian solution to your problem (tested in Excel and Google Docs) is to simply add the result of several countif formulas:

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*")

This expression will count the total of cells with "yes" or "no". It will double count a cell with "yesno" or "noyes" since it matches both expressions. You could try to take out the doubles with

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*") - COUNTIF(B5:O5, "*no*yes*") - COUNTIF(B5:O5, "*yes*no*")

But that will still get you in trouble with a string like noyesno.

However there is a rather clever trick in Google Docs that may just be a hint of the solution you are looking for:

=COUNTA(QUERY(A1:A9, "select A where A matches '(.*yes.*)|(.*no.*)'"))

The QUERY function is like a mini database thing. In this case it looks at the table in range A1:A9, and selects only elements in column A where the corresponding element in column A matches (in the preg regex sense of the word) the expression that follows - in this case, "anything followed by yes followed by anything, or anything followed by no followed by anything". In a simple example I made, this counts a yesnoyes only once - making it exactly what you were asking for (I think...)

Right now your range B5:O5 is several columns wide, and only one row high; that makes it hard to use the QUERY trick. Something rather less elegant (but that works regardless of the shape of the range) is this:

=countif(arrayformula(isnumber(find("yes",A1:A9))+isnumber(find("no",A1:A9))),">0")

The sum of the isnumber functions acts as an element-wise OR - unfortunately, the regular OR function doesn't seem to work on individual elements of an array. As before, this finds cells that contain either "yes" or "no", and counts the ones that have either of these strings contained within.

Floris
  • 45,857
  • 6
  • 70
  • 122
  • Can you `TRANSPOSE(B5:O5)` before sending it to `QUERY`? Or does `QUERY` have problems with the column identifiers on anonymous matrices? – Michael Mar 23 '19 at 22:14
  • @Michael have you tried? I can’t test it right now but I would be quite interested in the answer. – Floris Mar 24 '19 at 23:05
  • I can't get it to work. Not with `QUERY(TRANSPOSE(B5:O5), "select A")`, with `…, "select col1")`, nor with `QUERY({"foo"; TRANSPOSE(B5:O5)}, "select foo", 1)`. It always complains that the column identifier is missing. `select *` works, but you can't use `*` in the `where` clause. – Michael Mar 24 '19 at 23:45
  • I tried again and got it working. See [below](https://stackoverflow.com/a/55506375/241211): Apparently `Col1` is the "magic" identifier, and it is _case sensitive._ Note that it does not have the right case in the documentation's sample [identifiers](https://developers.google.com/chart/interactive/docs/querylanguage#identifiers). – Michael Apr 04 '19 at 01:04
2

This is heavily inspired by Floris's answer. See the comments, in particular. If you TRANSPOSE the row of items to compare against, QUERY works fine for horizontal data as well:

=COUNTA(QUERY(TRANSPOSE(B5:O5), "select * where Col1 matches '.*(yes|no).*'"))

As far as I can tell, Col1 is "special" and case sensitive!

Michael
  • 8,362
  • 6
  • 61
  • 88
0

The easiest way NGix found is to create a custom function for these purposes. They added 2 that work perfectly for them and hope it'll help someone too:

/**
 * Count if cell value matches any condition ( supports regexp also )
 */

function countCustomMatchOr(data){
  var count = 0;
  for(var i=0; i < data.length; i++){
    for(var j=0; j<data[i].length; j++){
        var cell = data[i][j];
      for(var k=1;k<arguments.length;k++){
        if(typeof arguments[k] == "number"){
          if(arguments[k] == cell) count++;
        } else if(cell.toString().match(arguments[k])) count++;
      }
    }
  }
  return count;
}

And

/**
 * Counts value in data range if matches regular expression
 */

function countCustomRegExp(data, reg, flag){
  var rows = data.length, count = 0, re = flag?new RegExp(reg, flag):new RegExp(reg);
  for( var i = 0; i < rows; i++){
    for( var j = 0; j < data[i].length; j++){
      if( data[i][j] != "" && data[i][j].toString().match(re) ){
        count++;
      }
    }
  }
  return count;
}

In order to use them just apply countCustomRegExp(A2:G3;"yes.*") or countCustomMatchOr(A2:G3;"yes";"no")

Michael
  • 8,362
  • 6
  • 61
  • 88
0

Recently I wanted this functionality in a Google Sheet. I came up with a different solution that also looks very readable.

=COUNTIF(ARRAYFORMULA(REGEXMATCH(O36:R36,"(yes|no)")),TRUE)

Basically what it does is, run REGEXMATCH over a range using ArrayFormula which returns either TRUE or FALSE. And then use COUNTIF to count the occurances of TRUE.

MWaheed
  • 306
  • 3
  • 9
-1

I've had the same problem, if you don't want to use VBA, try to add a SUM to your formula: =SUM(COUNTIF(B5:O5; {"*yes*", "*no*"}))

Ali
  • 1,357
  • 2
  • 12
  • 18
Leandro
  • 7
  • 1