I am working with a dataset that has a column of country codes named "ccode".
To add a column of country names called "country", I use the countrycode() function from the countrycode package, which I installed from CRAN. This is the code I run:
votes_processed <- votes %>%
  filter(vote <= 3) %>%
  mutate(year = session + 1945,
         country = countrycode(ccode, "cown", "country.name"))
It produces the following warning:
Warning message:
In countrycode(ccode, "cown", "country.name") :
Some values were not matched unambiguously: 260, 816
Since these country codes cannot be assigned a country name, I filtered them out of the data frame:
> table(is.na(votes_processed$country))
 FALSE   TRUE
350844   2703
> votes_processed <- filter(votes_processed, !is.na(country))
> table(is.na(votes_processed$country))
 FALSE
350844
Afterwards I run the following commands to create another tibble that gives, for each year and country, the total number of votes and the proportion of "yes" votes (a vote value of 1 means yes):
# Group by year and country: by_year_country
by_year_country <- votes_processed %>%
  group_by(year, country) %>%
  summarize(total = n(),
            percent_yes = mean(vote == 1))
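In case the grouping is relevant, here is a minimal sketch of how to inspect the grouping that summarize() leaves behind. The toy_votes data and its values are made up for illustration and only stand in for votes_processed. By default, summarize() drops only the last grouping variable, so the result of the step above should still be grouped by year.

```r
library(dplyr)

# Hypothetical toy data standing in for votes_processed (values are made up)
toy_votes <- tibble(
  year    = c(1947, 1947, 1947, 1949, 1949, 1949),
  country = c("Canada", "Canada", "Cuba", "Canada", "Cuba", "Cuba"),
  vote    = c(1, 2, 1, 1, 1, 3)
)

toy_summary <- toy_votes %>%
  group_by(year, country) %>%
  summarize(total = n(),
            percent_yes = mean(vote == 1))

# summarize() peels off only the last grouping variable (country),
# so the result is still grouped by year
group_vars(toy_summary)

# ungroup() removes whatever grouping remains
group_vars(ungroup(toy_summary))
```

Running group_vars(by_year_country) on the real data shows in the same way whether that tibble is still grouped.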
Then I run the following command to nest the data by country. R emits the following warning, and the country column is missing from the result:
> nested <- by_year_country %>%
+ nest(-country)
Warning message:
Unknown or uninitialised column: 'country'.
> nested$country
NULL
Warning messages:
1: Unknown or uninitialised column: 'country'.
2: Unknown or uninitialised column: 'country'.
Could someone explain what is happening with this "country" column, why R does not recognize it, and what I can do about it?
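One thing that may be worth trying is calling ungroup() before nest(), sketched here on made-up data. This rests on an unconfirmed assumption: that a grouping left on the tibble is what makes nest() drop the column. The behaviour may also depend on the installed tidyr version. The toy_summary tibble and its values are hypothetical stand-ins for by_year_country.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for by_year_country (values are made up),
# grouped by year to mimic the grouping left behind by summarize()
toy_summary <- tibble(
  year        = c(1947, 1949, 1947, 1949),
  country     = c("Canada", "Canada", "Cuba", "Cuba"),
  total       = c(2L, 1L, 1L, 2L),
  percent_yes = c(0.5, 1, 1, 0.5)
) %>%
  group_by(year)

# Drop the grouping first, then nest everything except country.
# Newer tidyr versions warn that they prefer nest(data = -country).
nested_toy <- toy_summary %>%
  ungroup() %>%
  nest(-country)

# Check whether country survived the nesting step
names(nested_toy)
```

If country appears in the output here but not in the original pipeline, the leftover grouping is a likely cause.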
I am new to this platform. A comment asked for a sample of the data, so I am pasting one here:
rcid <- c(5168, 4317, 3598, 2314, 1220, 5024, 3151, 2042, 2513, 238, 4171, 3748, 2595,
          5160, 4476, 308, 3621, 874, 2025, 3793, 3595, 1191, 987, 1207, 2255, 211,
          2585, 2319, 3590, 189)
session <- c(66, 56, 46, 36, 26, 64, 42, 34, 38, 4, 54, 48, 38, 66, 58, 6, 46, 18, 34,
             48, 46, 26, 22, 26, 36, 4, 38, 36, 46, 4)
vote <- c(1, 8, 1, 8, 9, 1, 3, 2, 2, 9, 2, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 9, 2, 1, 9, 1, 1, 1, 2)
ccode <- as.integer(c(816, 816, 816, 816, 816, 816, 260, 260, 260, 260, 2, 42, 2, 20,
                      31, 41, 20, 42, 41, 31, 70, 95, 80, 93, 58, 51, 53, 90, 55, 90))
sample_data_votes <- data.frame("rcid" = rcid, "session" = session, "vote" = vote,
                                "ccode" = ccode)
Thank you very much for your time and advice.