stataData is the original data:
country |
---|
"246" |
"246" |
"246" |
"752" |
"752" |
"752" |
"643" |
"643" |
"643" |
"840" |
"840" |
"840" |
My goal is to add a new column that takes the value 1 if country is 246 or 752, and the value 2 if country is 643 or 840. The desired stataData looks like this:
country | area |
---|---|
"246" | 1 |
"246" | 1 |
"246" | 1 |
"752" | 1 |
"752" | 1 |
"752" | 1 |
"643" | 2 |
"643" | 2 |
"643" | 2 |
"840" | 2 |
"840" | 2 |
"840" | 2 |
In this case countries is a list of two data frames. (In the real case there can be more data frames in the list, and there will be more countries.)
countries <- list(data.frame(c("FIN", "SWE"), c("246", "752")), data.frame(c("RUS", "USA"), c("643", "840")))
for(i in length(countries)){
  stataData$area <- ifelse(stataData$country %in% countries[[i]][,2], i, stataData$country)
}
However, this code returns a data frame (stataData) where only the second set of countries gets a number, like this:
country | area |
---|---|
"246" | "246" |
"246" | "246" |
"246" | "246" |
"752" | "752" |
"752" | "752" |
"752" | "752" |
"643" | 2 |
"643" | 2 |
"643" | 2 |
"840" | 2 |
"840" | 2 |
"840" | 2 |
I have tried using is_empty() but haven't found a way to eliminate the problem, so I'm asking how to solve it. In the real case stataData has over 1.5 million observations.
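For reference, here is a minimal sketch of one way the desired `area` column could be built without a loop, which may matter with this many rows. It assumes the codes in `stataData$country` are stored as character strings. The column names `iso` and `code` and the helper table `lookup` are made up for this sketch and are not part of the original data.

```r
# Example data, reduced to the four country codes from the question.
stataData <- data.frame(
  country = rep(c("246", "752", "643", "840"), each = 3),
  stringsAsFactors = FALSE
)

# Same list as in the question; the column names `iso` and `code`
# are hypothetical and only used to make the sketch readable.
countries <- list(
  data.frame(iso = c("FIN", "SWE"), code = c("246", "752")),
  data.frame(iso = c("RUS", "USA"), code = c("643", "840"))
)

# Build one lookup table: each code paired with the index of the
# list element it came from.
lookup <- data.frame(
  code = unlist(lapply(countries, function(d) as.character(d$code))),
  area = rep(seq_along(countries), times = vapply(countries, nrow, integer(1))),
  stringsAsFactors = FALSE
)

# match() gives the lookup row for every country code
# (NA when a code appears in none of the data frames).
stataData$area <- lookup$area[match(stataData$country, lookup$code)]
```

Because `match()` is vectorised, the whole column is filled in a single pass, and codes that are not listed anywhere end up as `NA` instead of keeping the country code.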
Edit: clarifying the problem.
Let's say that I have
ifelse(df$x1 %in% delta, i, NA)
where delta is an element of a list and i is a number. Is there a way to increase i as delta moves forward through the list?
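To make the clarified question concrete, here is a small sketch of a loop in which `i` advances together with `delta`. The data frame `df`, the list `groups`, and the column name `area` are hypothetical stand-ins. The sketch assumes that iterating with `seq_along()` and assigning only to the matching rows is acceptable, so that earlier results are not overwritten on later iterations.

```r
# Hypothetical stand-ins for the real data.
df <- data.frame(x1 = c("a", "b", "c", "d", "e"), stringsAsFactors = FALSE)
groups <- list(c("a", "b"), c("c", "d"))

# Start with NA everywhere, then fill group by group.
df$area <- NA_integer_
for (i in seq_along(groups)) {   # i runs over 1, 2, ..., length(groups)
  delta <- groups[[i]]           # delta is the i-th element of the list
  # Assign i only to the rows that belong to this group;
  # all other rows keep whatever value they already have.
  df$area[df$x1 %in% delta] <- i
}
```

Note that `for (i in length(groups))` would visit only the last index, whereas `seq_along(groups)` visits every element in order.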
Cheers!