2

I am trying to find the indices of the list elements in which at least one string matches a specific pattern and has more than one character (nchar > 1), for example:

i = "a"
ll = list(c("a"), c("b", "abc"), c("cc", "b"), c("c", "b"), c("ac", "c"), c("a", "bc"))

I would like to extract ll[[2]] and ll[[5]]. The following gets me partway there, but not quite, because I only want the list elements in which a string both contains the pattern and has nchar > 1:

sapply(ll, function(x) sapply(x, function(x) nchar(x)>1 & grep(i, x)))

Thank you!!

user971102
  • Nice, I did not know about Filter! It still considers the ll[[3]] though that I am trying to exclude because it does not have nchar>1 for the element that matches "a"... – user971102 Aug 09 '16 at 05:58
  • I am trying to get only ll[[2]] and ll[[5]], i.e. "b" "abc" and "ac" "c", but not "a" "bc" – user971102 Aug 09 '16 at 06:01

4 Answers

3

We can use Filter. The idea is to find the strings in each list element that match the pattern i (with grep), check whether any of those matching strings has nchar greater than 1 (any is needed because a list element can hold several strings), and keep only the list elements for which that is TRUE.

Filter(function(x) any(nchar(x[grep(i, x)])>1), ll)
#[[1]]
#[1] "b"   "abc"

#[[2]]
#[1] "ac" "c" 

Or

Filter(function(x) any(nchar(grep(i, x, value = TRUE))>1), ll)
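Since the question asks for the index, here is a minimal sketch that applies the same test but returns positions instead of the elements. The names hit and idx are only illustrative, and vapply is swapped in for Filter so that the logical vector can be passed to which.

```r
i <- "a"
ll <- list(c("a"), c("b", "abc"), c("cc", "b"), c("c", "b"), c("ac", "c"), c("a", "bc"))

# TRUE for list elements where some string matches i and is longer than one character
hit <- vapply(ll, function(x) any(nchar(grep(i, x, value = TRUE)) > 1), logical(1))

idx <- which(hit)  # positions within ll
idx
# [1] 2 5

ll[idx]            # the same elements that Filter() returns
```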
akrun
2
ll[sapply(ll, function(x) any(ifelse(nchar(x) > 1, grepl("a", x), FALSE)))]
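To see why the condition has to hold per string, it may help to look at one list element on its own. This is only a sketch: it uses a plain & in place of ifelse, which should give the same result here, and keep_idx is an illustrative name.

```r
ll <- list(c("a"), c("b", "abc"), c("cc", "b"), c("c", "b"), c("ac", "c"), c("a", "bc"))

x <- ll[[6]]                   # c("a", "bc")
nchar(x) > 1                   # FALSE  TRUE
grepl("a", x)                  #  TRUE FALSE
nchar(x) > 1 & grepl("a", x)   # FALSE FALSE: no single string meets both conditions

# Apply the per-string test to every list element and keep those with at least one hit
keep_idx <- sapply(ll, function(x) any(nchar(x) > 1 & grepl("a", x)))
ll[keep_idx]
```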
tchakravarty
1

A slightly different approach, without nchar, is to put more of the logic into the regex.

Has n characters>1

This means that "a" is either preceded or followed by another character. As a regex: "[[:alpha:]]a|a[[:alpha:]]", where | means "or".
Which leads to:

Filter(function(x) any(grepl("[[:alpha:]]a|a[[:alpha:]]", x)), ll)
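As a quick check of what the alternation accepts, the pattern can be tried on a few individual strings. The test strings below are made up for illustration; note that a non-alphabetic neighbour such as a digit does not count under [[:alpha:]].

```r
pattern <- "[[:alpha:]]a|a[[:alpha:]]"

# "a" alone fails; "a" with a letter before or after it matches
res <- grepl(pattern, c("a", "abc", "ba", "ac", "bc"))
res
# [1] FALSE  TRUE  TRUE  TRUE FALSE

# A digit next to "a" is not alphabetic, so this two-character string is rejected
grepl(pattern, "a1")
# [1] FALSE
```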

Another, more general option is to use lookaheads (which act like a logical AND), as follows:

Filter(function(x) any(grepl("^(?=[[:alpha:]]{2,})(?=.*a)", x, perl=TRUE)), ll)

Here ^(?=[[:alpha:]]{2,}) requires the string to start with at least 2 alphabetic characters (A-Z, a-z), and (?=.*a) requires an a somewhere in the string. Anchoring with ^ and using .* lets the a appear at any position, not only at the start. See Regular Expressions: Is there an AND operator?

Edit:

You can always "build" these regexes using paste or sprintf as follows:

i="a"
sprintf("^(?=[[:alpha:]]{2,})(?=.*%s)", i) # Second solution

In total this looks like

Filter(function(x) any(grepl(sprintf("^(?=[[:alpha:]]{2,})(?=.*%s)", i), x, perl=TRUE)), ll)
Rentrop
0

A solution using the tidyverse packages.

library(purrr)
library(stringr)
library(magrittr)

# keep the list elements in which some string both contains "a" and has more than one character
ll %>% keep(function(x) any(nchar(x) > 1 & str_detect(x, "a")))
shayaa
  • Thank you...I get an error though after library(dplyr): Error in function_list[[k]](value) : could not find function "map" – user971102 Aug 09 '16 at 05:55
  • Thank you Shayaa, I like akrun's solution because it doesn't use other packages but knowing how to do this in other ways is great too, thanks! – user971102 Aug 09 '16 at 06:19