My data is a list of academic institutions affiliated with authors of articles, and the piece I'm working with looks something like this:
1 MIT
2 NBER; NBER
3 U MI; Cornell U; U VA
4 Harvard U; U Chicago
5 U OR; U CA, Davis; U British Columbia
6 World Bank; Dartmouth College; EDHEC Business School; Harvard U
7 Columbia U and IZA; Columbia U and IZA
8 World Bank; Yale U and Abdul Latif Jameel Poverty Action Lab; Dartmouth College
9 Carnegie Mellon U; Carnegie Mellon U; Carnegie Mellon U
10 Columbia U; U CA, San Diego
11 U CA, Berkeley; McMaster U; McMaster U
12 ETH Zurich and CESifo; U Copenhagen and CESifo
I want to separate the rows at the semicolons (and preferably at the "and"s) so that I can work out which academic institutions are unique.
I tried doing this with the separate_rows() function from the tidyr package:
Affiliation <- separate_rows(Affiliation, sep = ";")
Alternatively:
Affiliation <- separate_rows(Affiliation, sep = "; | and")
Neither of these works: my data looks exactly the same afterwards. What am I doing wrong?
Attaching dput output below:
structure(list(AF = c("MIT", "NBER; NBER", "U MI; Cornell U; U VA",
"Harvard U; U Chicago", "U OR; U CA, Davis; U British Columbia",
"World Bank; Dartmouth College; EDHEC Business School; Harvard U",
"Columbia U and IZA; Columbia U and IZA", "World Bank; Yale U and Abdul Latif Jameel Poverty Action Lab; Dartmouth College",
"Carnegie Mellon U; Carnegie Mellon U; Carnegie Mellon U", "Columbia U; U CA, San Diego",
"U CA, Berkeley; McMaster U; McMaster U", "ETH Zurich and CESifo; U Copenhagen and CESifo",
"U MN, St Paul; Compass Lexecon, Washington, DC; Harvard U",
"U WI", "U Chicago and IZA; Harvard U; Harvard U")), row.names = c(NA,
15L), class = "data.frame")
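For reference, here is a minimal sketch of what a working call might look like. It assumes the problem is that neither call above names a column: separate_rows() takes the columns to split as unnamed arguments after the data, and with none selected it appears to return the data unchanged. The subset of rows, the names `long` and `unique_institutions`, and the separator pattern are illustrative choices, not part of the original attempt.

```r
library(tidyr)

# A small subset of the data above, rebuilt so the example runs on its own
Affiliation <- data.frame(
  AF = c("MIT",
         "NBER; NBER",
         "Columbia U and IZA; Columbia U and IZA",
         "U CA, Berkeley; McMaster U; McMaster U"),
  stringsAsFactors = FALSE
)

# Name the column to split (AF). The pattern splits at a semicolon plus any
# following whitespace, or at the word "and" surrounded by whitespace.
long <- separate_rows(Affiliation, AF, sep = ";\\s*|\\s+and\\s+")

# Trim stray whitespace (a safeguard) and keep each institution once
unique_institutions <- sort(unique(trimws(long$AF)))
print(unique_institutions)
```

Commas are not part of the separator, so "U CA, Berkeley" stays in one piece. Splitting on "and" is a judgment call: it would also break up any institution whose own name contains the word "and", so it may be safer to split on semicolons only and handle joint affiliations separately.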