I want to count the total number of words in a text column, grouped by id:

df <- data.frame(id=rep(1:3, 2), tx=c("test one. test two", "this is a test. again test", "test two", "test three times", 
                                      "test, in a future time point", "test has completed, at the final time point"))

How do I achieve a result like this:

id  word count
1   7
2   12
3   10

I looked at the other post, "Count the number of all words in a string", but it doesn't show how to count words by group.

cliu
  • Please reopen the question. I looked at the other question but it doesn't show how to count words by grouping – cliu Mar 02 '23 at 18:13
  • Just count the words, and then group_by/summarise like you would with any other dplyr pipeline: `df %>% mutate(words=stringi::stri_count_words(tx)) %>% group_by(id) %>% summarise(sum(words))`. Counting words and summarizing sums are two separate problems. – MrFlick Mar 02 '23 at 18:15
  • You can do the following: `df %>% mutate(count = stringr::str_count(tx, stringr::boundary("word"))) %>% group_by(id) %>% summarise(count = sum(count))` – Jamie Mar 02 '23 at 18:16
  • I added the FAQ for "How to sum a variable by group?" as a reference for summarizing the data. – Gregor Thomas Mar 02 '23 at 18:17
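
The two steps described in the comments (count words per row, then sum by group) can also be sketched in base R, with no extra packages. This is only a sketch under one assumption: a "word" is any whitespace-separated token, which is a simpler rule than the word-boundary detection used by `stringi::stri_count_words` or `stringr::boundary("word")`. The column name `word_count` is made up for illustration.

```r
# Example data from the question
df <- data.frame(
  id = rep(1:3, 2),
  tx = c("test one. test two", "this is a test. again test", "test two",
         "test three times", "test, in a future time point",
         "test has completed, at the final time point")
)

# Step 1: count words in each row.
# Assumption: words are whitespace-separated tokens, so attached punctuation
# ("one.", "test,") stays part of its word. as.character() guards against
# tx being a factor in R versions before 4.0.
tokens <- strsplit(trimws(as.character(df$tx)), "\\s+")
df$word_count <- lengths(tokens)

# Step 2: sum the per-row counts within each id
result <- aggregate(word_count ~ id, data = df, FUN = sum)
print(result)
#   id word_count
# 1  1          7
# 2  2         12
# 3  3         10
```

On this data the whitespace rule gives the totals requested in the question. On text with hyphens, apostrophes, or stand-alone punctuation it may disagree with the stringi/stringr counts, so pick whichever definition of a word you actually need.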
