I'm trying to select only those rows of a data frame that contain a specific word in one column, namely a "text" column that holds a piece of text for every row. I know that str_subset from the stringr package returns the matching strings, but I want the full data frame with all columns included. Any ideas on how I could achieve that? In my example below, I could be looking for the rows that contain the word 'this', so I'd want an output that keeps only rows 1, 5 and 8. Thanks for any tips.

names <- c("Richard", "Mortimer", "Elizabeth", "Gerald", "Summer", "Marc", "Ben", "Emma")
text <- c("I have this.", "I have that.", "Is that cool?", "How about that?", "How about this?", "How do you get that?", "Where can I get that?", "When do I need this?")
it1 <- c(0.6, 0.3, 0.8, 0.8, 0.5, 0.5, 0.3, 0.7)
it2 <- c(0.5, 0.4, 0.4, 0.5, 0.8, 0.6, 0.5, 0.4)

myframe <- data.frame(names, text, it1, it2)
psyph

1 Answer

You can subset with a logical vector from grepl in base R:

names <- c("Richard", "Mortimer", "Elizabeth", "Gerald", "Summer", "Marc", "Ben", "Emma")
text <- c("I have this.", "I have that.", "Is that cool?", "How about that?", "How about this?", "How do you get that?", "Where can I get that?", "When do I need this?")
it1 <- c(0.6, 0.3, 0.8, 0.8, 0.5, 0.5, 0.3, 0.7)
it2 <- c(0.5, 0.4, 0.4, 0.5, 0.8, 0.6, 0.5, 0.4)
myframe <- data.frame(names, text, it1, it2)

myframe[grepl("this", myframe$text),]
#>     names                 text it1 it2
#> 1 Richard         I have this. 0.6 0.5
#> 5  Summer      How about this? 0.5 0.8
#> 8    Emma When do I need this? 0.7 0.4

Or, equivalently, if you already use tidyverse tools such as stringr and dplyr:

library(tidyverse)
myframe %>%
  filter(str_detect(text, "this"))
#>     names                 text it1 it2
#> 1 Richard         I have this. 0.6 0.5
#> 2  Summer      How about this? 0.5 0.8
#> 3    Emma When do I need this? 0.7 0.4

Created on 2019-08-08 by the reprex package (v0.3.0)
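
Note that both patterns above match "this" as a substring, so a text such as "thistle" would also be kept, while "This" with a capital T would not. If you need the whole word only, regardless of case, something along these lines might work. This is a minimal sketch, not part of the reprex output above; the names whole_word and result are just illustrative, and it assumes a word-boundary pattern with perl = TRUE is acceptable for your data:

```r
# Rebuild the example data so the snippet runs on its own
myframe <- data.frame(
  names = c("Richard", "Mortimer", "Elizabeth", "Gerald", "Summer", "Marc", "Ben", "Emma"),
  text = c("I have this.", "I have that.", "Is that cool?", "How about that?",
           "How about this?", "How do you get that?", "Where can I get that?",
           "When do I need this?"),
  it1 = c(0.6, 0.3, 0.8, 0.8, 0.5, 0.5, 0.3, 0.7),
  it2 = c(0.5, 0.4, 0.4, 0.5, 0.8, 0.6, 0.5, 0.4)
)

# \\b marks a word boundary, so "thistle" would not match;
# ignore.case = TRUE also accepts "This" or "THIS"
whole_word <- grepl("\\bthis\\b", myframe$text, ignore.case = TRUE, perl = TRUE)

# Index rows with the logical vector; leaving the column index empty keeps all columns
result <- myframe[whole_word, ]
result
```

On the example data this keeps the same rows 1, 5 and 8 as before; the difference only shows up once the column contains words like "thistle" or capitalised variants.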

Calum You