I have a vector of patterns, and need to use agrep on them. The problem is that agrep seems to take only one pattern at a time.

patt <- c("test","10 Barrel")
lut  <- c("1 Barrel","10 Barrel Brewing","Harpoon 100 Barrel Series","resr","rest","tesr")

for (i in seq_along(patt)) {
  print(agrep(patt[i], lut, max.distance = 1, value = TRUE))
}

Result:

[1] "rest" "tesr"
[1] "1 Barrel"                  "10 Barrel Brewing"         "Harpoon 100 Barrel Series"

A for loop is slow when the pattern vector is long, so I tried to do it in vectorized form:

VecMatch1 = function(string, stringVector){
  stringVector[agrep(string, stringVector, max = 1)]
}
a = VecMatch1(patt,lut)

Warning message:
In agrep(string, stringVector, max = 1) :
  argument 'pattern' has length > 1 and only the first element will be used

Maybe functions like lapply can help? Thanks!

Alexey Ferapontov

1 Answer

Using lapply:

lapply(patt, agrep, x=lut, max.distance=c(cost=1, all=1), value=TRUE)

[[1]]
[1] "rest" "tesr"

[[2]]
[1] "1 Barrel"                  "10 Barrel Brewing"         "Harpoon 100 Barrel Series"
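
The list returned by lapply is unnamed, so it can be hard to tell which pattern produced which matches. A minimal sketch of one way to keep that link, assuming the same patt and lut as in the question (the names matches and match_table are only illustrative):

```r
patt <- c("test", "10 Barrel")
lut  <- c("1 Barrel", "10 Barrel Brewing", "Harpoon 100 Barrel Series",
          "resr", "rest", "tesr")

# Name the input vector by its own values so each list element
# is labelled with the pattern that produced it.
matches <- lapply(setNames(patt, patt), function(p) {
  agrep(p, lut, max.distance = 1, value = TRUE)
})

# Flatten to a long table with one row per (pattern, match) pair.
match_table <- data.frame(
  pattern = rep(names(matches), lengths(matches)),
  match   = unlist(matches, use.names = FALSE),
  stringsAsFactors = FALSE
)

print(match_table)
```

This still calls agrep once per pattern, so it is a convenience for bookkeeping rather than a speed-up over the for loop.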

You can probably get faster performance with dplyr or data.table.

Serban Tanasa