I have a list of 200 Twitter usernames (username_list) and I want a dataframe counting how many times each of them is retweeted by each account in my data. The original data (heavily simplified) looks like this:
  screen_name   retweet_screen_name
1 screen_name1  retweet_screen_name1
2 screen_name1  retweet_screen_name2
3 screen_name2  retweet_screen_name1
4 screen_name1  retweet_screen_name1
5 screen_name3  retweet_screen_name2
The final dataframe should look something like the one below, where the first row means that screen_name1 has retweeted retweet_screen_name1 two times and retweet_screen_name2 once.
             retweet_screen_name1 retweet_screen_name2 .......... retweet_screen_name200
screen_name1 2                    1                    etc
screen_name2 1                    0                    etc
screen_name3 0                    1                    etc
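One way to get this shape, as a minimal sketch rather than a definitive answer, is a base R cross-tabulation with table(). The toy all_text below is rebuilt from the simplified example above and is only an assumption about how the real data is laid out; retweet_df is a name made up for illustration.

```r
# Toy data copied from the simplified example (assumed layout of all_text)
all_text <- data.frame(
  screen_name = c("screen_name1", "screen_name1", "screen_name2",
                  "screen_name1", "screen_name3"),
  retweet_screen_name = c("retweet_screen_name1", "retweet_screen_name2",
                          "retweet_screen_name1", "retweet_screen_name1",
                          "retweet_screen_name2"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are the retweeting accounts,
# columns are the accounts being retweeted
counts <- table(all_text$screen_name, all_text$retweet_screen_name)

# Turn the contingency table into a plain data frame,
# keeping screen_name as the row names
retweet_df <- as.data.frame.matrix(counts)
print(retweet_df)
```

If every one of the 200 usernames should get a column even when its count is zero, one option is to convert retweet_screen_name to a factor with levels = username_list before calling table().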
The code below is my starting point:
library(dplyr)

## maybe add in a loop .... for (username in username_list) ...
retweet.counts <- function(username) {
  countCol <- all_text %>%
    select(screen_name, created_at, retweet_screen_name) %>%
    # created_at is assumed to start with a four-digit year
    mutate(year = as.integer(substr(created_at, 1, 4))) %>%
    filter(year > 2017 & year < 2021) %>%
    group_by(screen_name) %>%
    # name the count column after the username passed in, not literally "username"
    summarise(!!username := sum(retweet_screen_name == .env$username, na.rm = TRUE))
  return(countCol)
}
I also found this code and think it could potentially be helpful.
library(dplyr)
library(purrr)
all_text %>%
  map(~table(.x)) %>%
  lapply(as_tibble) %>%
  bind_rows(.id = "var")
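Another possible route, sketched here under assumptions, skips the per-username function and counts every (screen_name, retweet_screen_name) pair in one pass with dplyr::count(), then spreads the result with tidyr::pivot_wider(). The all_text and username_list values below are hypothetical stand-ins, the created_at format (a string starting with a four-digit year) is assumed, and retweet_wide is a name made up for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for the real all_text and username_list
all_text <- tibble(
  screen_name = c("screen_name1", "screen_name1", "screen_name2",
                  "screen_name1", "screen_name3"),
  created_at = c("2018-03-01", "2019-06-15", "2020-01-20",
                 "2018-11-02", "2016-05-05"),
  retweet_screen_name = c("retweet_screen_name1", "retweet_screen_name2",
                          "retweet_screen_name1", "retweet_screen_name1",
                          "retweet_screen_name2")
)
username_list <- c("retweet_screen_name1", "retweet_screen_name2")

retweet_wide <- all_text %>%
  # assumes created_at begins with the year, as in the function above
  mutate(year = as.integer(substr(created_at, 1, 4))) %>%
  # keep 2018-2020 and only retweets of accounts in username_list
  filter(year > 2017, year < 2021,
         retweet_screen_name %in% username_list) %>%
  # one row per (retweeter, retweeted account) pair with its count in n
  count(screen_name, retweet_screen_name) %>%
  # one column per retweeted account; missing pairs become 0
  pivot_wider(names_from = retweet_screen_name,
              values_from = n,
              values_fill = 0L)

print(retweet_wide)
```

Note that accounts whose rows are all removed by the filter (screen_name3 in this toy data, whose only retweet is from 2016) do not appear in the result at all.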
Help is needed!