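If you only want to extract and count factor values that have exactly 4 letters (any letters, not necessarily the same), you can do it in the steps below.
Note that \\w also matches digits and underscores, and that the unanchored pattern matches the first four characters of longer values too. If your data contains such values, a stricter pattern such as "^[[:alpha:]]{4}$" is safer.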
Step 1: Define a pattern to match:
pattern <- "\\w{4}"
Step 2: Define a function to extract only the raw matches:
extract <- function(x) unlist(regmatches(x, gregexpr(pattern, x, perl = TRUE)))
Step 3: Apply the function to the data of interest:
extract(data1$y)
This returns:
[1] "AAAA" "BBBB"
Step 4: To count the number of matches, use length:
length(extract(data1$y))
[1] 2
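To see how the pattern behaves on values that are not exactly four letters long, here is a minimal sketch. The sample values in data1 are hypothetical (the original data is not shown), and the anchored pattern "^[[:alpha:]]{4}$" is a suggested alternative, not part of the steps above.

```r
# Hypothetical sample data; the original data1 is not shown.
data1 <- data.frame(y = factor(c("AAAA", "BBBB", "CC", "DDDDDD", "A1_B")))
y <- as.character(data1$y)  # work on the labels rather than the factor itself

# Unanchored pattern: any run of four word characters (letters, digits, "_")
loose <- unlist(regmatches(y, gregexpr("\\w{4}", y, perl = TRUE)))

# Anchored pattern: the whole value must consist of exactly four letters
strict <- y[grepl("^[[:alpha:]]{4}$", y)]

print(loose)           # also picks up "DDDD" (from "DDDDDD") and "A1_B"
print(strict)          # only "AAAA" and "BBBB"
print(length(strict))  # number of values with exactly four letters
```

If every value in your data is either exactly four letters long or shorter, both patterns give the same result.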
EDIT:
Alternatively, you can use str_extract from the package stringr:
Step 1: Store the result in a vector extr:
library(stringr)
extr <- str_extract(data1$y, "\\w{4}")
Step 2: Count the non-missing elements. str_extract returns NA wherever there is no match, so use is.na (which tests for NA and returns TRUE or FALSE) together with the negation operator ! to keep only the matches, then take their length:
length(extr[!is.na(extr)])
[1] 2
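One difference between the two approaches is worth knowing: str_extract returns only the first match per element, whereas the gregexpr version returns every match. The sketch below uses a hypothetical vector y to show this, with str_extract_all as the stringr counterpart that returns all matches. It also uses sum(!is.na(...)), which gives the same count as length(extr[!is.na(extr)]).

```r
library(stringr)

# Hypothetical input; the last element contains two four-character runs
y <- c("AAAA", "BBBB", "CC", "AAAA BBBB")

# First match per element (NA where nothing matches)
first_only <- str_extract(y, "\\w{4}")

# All matches per element, flattened into one character vector
all_matches <- unlist(str_extract_all(y, "\\w{4}"))

n_elements <- sum(!is.na(first_only))  # elements with at least one match
n_matches  <- length(all_matches)      # total number of matches

print(n_elements)
print(n_matches)
```

Which count you want depends on whether you are counting values that match or occurrences of the pattern.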