I have a data frame containing many trigrams and their frequencies.

How can I add a third column (let's call it finalWord) that contains only the last word of each trigram?

Here is an example of the dataframe:

x <- data.frame(trigrams = c("I have to", "I need to"), freq = c(10, 7))

The output should be:

x <- data.frame(trigrams = c("I have to", "I need to"), freq = c(10, 7), finalWord = c("to", "to"))
feder80

1 Answer

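We can use sub to remove everything up to and including the last run of whitespace, which leaves only the final word: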

x$finalWord <- sub(".*\\s+", "", x$trigrams)
x$finalWord
#[1] "to" "to"

Alternatively, with the stringi package, extract the last run of word characters:

library(stringi)
stri_extract_last(x$trigrams, regex = "\\w+")
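
One caveat worth sketching: the sub pattern returns an empty string when a trigram ends in whitespace, because the greedy match then consumes the whole string. The example below is only an illustration under assumed data; the second and third rows are hypothetical edge cases that do not appear in the question. It trims each string with base R's trimws before applying the same pattern.

```r
# Hypothetical sample data: rows 2 and 3 are made-up edge cases
# (trailing space, trailing punctuation), not taken from the question.
x <- data.frame(
  trigrams = c("I have to", "I need to ", "is it so?"),
  freq = c(10, 7, 3),
  stringsAsFactors = FALSE
)

# Same pattern as in the answer: the trailing space in row 2
# makes the match cover the entire string, leaving "".
plain <- sub(".*\\s+", "", x$trigrams)

# Trim leading/trailing whitespace first, then strip everything
# up to and including the last run of whitespace.
x$finalWord <- sub(".*\\s+", "", trimws(x$trigrams))
```

Note that this keeps trailing punctuation ("so?"), whereas a \\w+ pattern like the stringi one would return only the word characters. Which behaviour is preferable depends on how the trigrams were tokenised.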
akrun