
Using this example answer

I am trying to use it to keep only certain words in every row of a column.

I use this:

df <- data.frame(text = c ("Hi this is an example", "Hi this is an example", "Hi this is an example", "Hi this is an example"))
words <- c("this", "is", "an", "example")
paste(intersect(strsplit(df$text, "\\s")[[1]], words), collapse=" ")

But I receive this error:

Error in strsplit(df$text, "\\s") : non-character argument

What can I do?

user8831872
    You need to wrap it with `as.character`, as the 'text' column is a `factor` (check `str(df)`), or specify `stringsAsFactors = FALSE` while creating the dataset, i.e. `paste(intersect(strsplit(as.character(df$text), "\\s")[[1]], words), collapse=",")`. Note that here you are subsetting the 1st list element only – akrun May 22 '18 at 19:36

2 Answers


df$text is a factor. Try:

df <- data.frame(text = c ("Hi this is an example", "Hi this is an example", "Hi this is an example", "Hi this is an example"))
words <- c("this", "is", "an", "example")
paste(intersect(strsplit(as.character(df$text), "\\s")[[1]], words), collapse=" ")
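
As an alternative to converting afterwards, the column can be kept as character when the data frame is built, as the comment on the question suggests. The following is a minimal sketch under that assumption; the two-row data frame and the name `first_row` are made up for illustration. In R 4.0.0 and later, `stringsAsFactors` already defaults to `FALSE`, so the original error may not appear there at all.

```r
# Assumption: we control how the data frame is created, so we can ask
# data.frame() to keep 'text' as character instead of converting it to factor.
df <- data.frame(
  text = c("Hi this is an example", "Hi this was an example"),
  stringsAsFactors = FALSE
)
words <- c("this", "is", "an", "example")

str(df)  # 'text' should now be listed as chr rather than Factor

# strsplit() accepts the character column directly.
# [[1]] still selects only the split of the first row.
first_row <- paste(intersect(strsplit(df$text, "\\s")[[1]], words), collapse = " ")
print(first_row)
```

Note that this, like the code above, only processes the first row because of `[[1]]`.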
Kerry Jackson

Considering you want to apply the function for each text in your data frame, the following code should do what you need:

df <- data.frame(text = c ("Hi this is an example", "Hi this is an example", "Hi this is an example", "Hi this is an example"))
words <- c("this", "is", "an", "example")

df$new_column <- sapply(as.character(df$text), function(x) {
  return(paste(intersect(strsplit(x, "\\s")[[1]], words), collapse=" "))
})

print(df$new_column)

And with a different data.frame example:

df <- data.frame(text = c ("Hi this is an example", "Hi this was an example", "Hi this still is an example", "Hi this is another example"))
words <- c("this", "is", "an", "example")

df$new_column <- sapply(as.character(df$text), function(x) {
  return(paste(intersect(strsplit(x, "\\s")[[1]], words), collapse=" "))
})

print(df$new_column)
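
One possible variation on the code above, offered as a sketch rather than a required change: `sapply()` over a character vector names each result after its input string, so the new column is printed with those names attached. Using `vapply()` with `USE.NAMES = FALSE` avoids that and also checks that every result is a single string. The helper name `keep_words` and its `keep` argument are hypothetical, introduced only for this example.

```r
df <- data.frame(
  text = c("Hi this is an example", "Hi this was an example",
           "Hi this still is an example", "Hi this is another example"),
  stringsAsFactors = FALSE
)
words <- c("this", "is", "an", "example")

# Hypothetical helper: split one string on whitespace and keep only the
# tokens that also appear in 'keep', in their original order.
keep_words <- function(x, keep) {
  tokens <- strsplit(x, "\\s")[[1]]
  paste(intersect(tokens, keep), collapse = " ")
}

# vapply() enforces that each call returns exactly one character value;
# USE.NAMES = FALSE stops the input strings from being used as names.
df$new_column <- vapply(as.character(df$text), keep_words, character(1),
                        keep = words, USE.NAMES = FALSE)

print(df$new_column)
```

Be aware that `intersect()` drops repeated words, so a word that occurs twice in a row's text appears only once in the result.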

Hope it helps! :)

tk3