Problem
I have been working on merging and standardizing several survey datasets. One problem that I keep running into is inconsistent punctuation: sometimes the apostrophe is coded as a straight ' and other times as a curly ’.
For example, the name of the Ivory Coast in French is Côte d'Ivoire. Unfortunately, the data are not coded uniformly across years. As a result, when I run a crosstab, I get this:
country 2008 2009
------- ---- ----
Cote d'Ivoire 498 0
Cote d’Ivoire 0 502
What I want to get is this:
country 2008 2009
------- ---- ----
Cote d'Ivoire 498 502
When I try to standardize these to use the ' rather than the ’, I have absolutely no luck. It just doesn't seem to do anything. Here is the code I would use:
data$country[data$country == "Cote d’Ivoire"] <- "Cote d'Ivoire"
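To make this reproducible, here is a minimal sketch of the situation. It assumes the curly character is U+2019 (RIGHT SINGLE QUOTATION MARK), which may not be what is actually stored in the real files; the toy data frame and its values are made up for illustration. It builds the two spellings, prints their code points so the stored character can be checked, and shows one possible way to normalize them with gsub():

```r
# Hypothetical stand-in for the merged survey data (values are illustrative)
data <- data.frame(
  country = c("Cote d'Ivoire", "Cote d\u2019Ivoire"),
  year    = c(2008, 2009),
  stringsAsFactors = FALSE
)

# Inspect the code points of each value to see which apostrophe is stored
# (39 is the ASCII apostrophe, 8217 is U+2019)
lapply(data$country, utf8ToInt)

# Assumption: the curly character is U+2019. Replace it with an ASCII
# apostrophe everywhere in the column, matching the character literally.
data$country <- gsub("\u2019", "'", data$country, fixed = TRUE)

# Crosstab after normalization
table(data$country, data$year)
```

If the code points printed for the real data differ from 8217, the pattern passed to gsub() would need to change accordingly.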
For some reason, I can't seem to figure this out, and it's driving me nuts. Does anyone know what I'm doing wrong?
Thank you!