I have a dataset with citations and authors by group:
Group | Citations | Authors |
---|---|---|
1 | das baker | evans jumper |
1 | remmert biegert hauser | wang bryson |
2 | morcos pagnani | baker |
2 | mcguffin bryson jones | trinu |
For each group, I would like to check whether any of the names in the "Authors" column of the other groups appear in its "Citations" column, and if so, how many. For instance, the author "baker" from Group 2 appears in the "Citations" column of Group 1, in row 1.
I think that if I could obtain a dataframe like the following, I would be able to answer the question:
Group | Citations | Authors_all_except_focal | Present | Occurrences |
---|---|---|---|---|
1 | das baker | baker trinu | 1 | 1 |
1 | remmert biegert hauser | baker trinu | 0 | 0 |
2 | morcos pagnani | evans jumper wang bryson | 0 | 0 |
2 | mcguffin bryson jones | evans jumper wang bryson | 1 | 1 |
I was thinking about concatenating the "Authors" column into one string per group, excluding the authors of the focal group, and then using str_detect, but I am having trouble constructing this dataset. I have tried colSums, but without success, apparently because it does not accept strings.
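
To make the idea concrete, here is a rough base R sketch of the approach described above, rebuilt from the example tables. It rests on two assumptions that may not hold for the real data: every name is a single word separated by whitespace, and a match means an exact word match. For that reason it uses strsplit and %in% rather than str_detect, which matches substrings and could count a short name inside a longer one. "Occurrences" here counts the cited names in a row that belong to an author from another group.

```r
# Example data copied from the first table
df <- data.frame(
  Group = c(1, 1, 2, 2),
  Citations = c("das baker", "remmert biegert hauser",
                "morcos pagnani", "mcguffin bryson jones"),
  Authors = c("evans jumper", "wang bryson", "baker", "trinu"),
  stringsAsFactors = FALSE
)

# For each row, paste together the authors of all groups except the row's own
df$Authors_all_except_focal <- vapply(
  df$Group,
  function(g) paste(df$Authors[df$Group != g], collapse = " "),
  character(1)
)

# Split both columns into words (assumes whitespace-separated, one-word names)
cited  <- strsplit(df$Citations, "\\s+")
others <- strsplit(df$Authors_all_except_focal, "\\s+")

# Count the cited names in each row that are authors in another group
df$Occurrences <- mapply(
  function(cit, oth) sum(cit %in% unique(oth)),
  cited, others
)

# Flag rows with at least one match
df$Present <- as.integer(df$Occurrences > 0)

print(df[, c("Group", "Citations", "Authors_all_except_focal",
             "Present", "Occurrences")])
```

Summing Occurrences by group (for example with aggregate or tapply) would then give the per-group totals.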