This is my main data:

  Country Consumption Rank
  Belarus        17.5    1
  Moldova        16.8    2
Lithuania        15.4    3
   Russia        15.1    4
  Romania        14.4    5
  Ukraine        13.9    6

I have also collected separate data frames of countries for each continent, like this one:

    europe
   Albania
   Andorra
   Armenia
   Austria
Azerbaijan
   Belarus

Or another data frame like this:

            asia
     Afghanistan
         Bahrain
      Bangladesh
          Bhutan
          Brunei
 Burma (Myanmar)

I want to match the countries in my data against the continent data frames and then label each country with its continent, such as Europe or Asia.

Here is the code I have so far, but it does not match the countries correctly; only one branch of the if/else chain is ever executed:

if (data$Country %in% europe$europe) {
  data$con <- c("Europe")
} else if (data$Country %in% asia$asia) {
  data$con <- c("asia")
} else if (data$Country %in% africa$africa) {
  data$con <- c("africa")
} else
  data$con <- c("ridi")

Thank you in advance.

hanif

2 Answers

First, build the map from countries to continents:

continent_map = stack(c(europe, asia))
names(continent_map) <- c("Country", "Continent")

Then, use match:

dat["Continent"] = continent_map$Continent[ match(dat$Country, continent_map$Country) ]

    Country Consumption Rank Continent
1   Belarus        17.5    1    europe
2   Moldova        16.8    2      <NA>
3 Lithuania        15.4    3      <NA>
4    Russia        15.1    4      <NA>
5   Romania        14.4    5      <NA>
6   Ukraine        13.9    6      <NA>

Generally, you should keep related data in a single structure like continent_map (instead of many separate places like the OP's asia and europe).
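
The NA rows in the output above come from countries that do not appear in any of the (truncated) continent lists. A minimal sketch of how to spot them, assuming shortened, hypothetical copies of the example data and a helper variable `unmatched` that is not part of the original answer:

```r
# Hypothetical, shortened copies of the example data
dat <- data.frame(
  Country = c("Belarus", "Moldova", "Lithuania"),
  Consumption = c(17.5, 16.8, 15.4),
  stringsAsFactors = FALSE
)
europe <- data.frame(europe = c("Albania", "Austria", "Belarus"),
                     stringsAsFactors = FALSE)
asia <- data.frame(asia = c("Afghanistan", "Bahrain"),
                   stringsAsFactors = FALSE)

# One long lookup table: one row per country, continent in the second column
continent_map <- stack(c(europe, asia))
names(continent_map) <- c("Country", "Continent")

# match() gives the row of continent_map for each country, or NA if absent
idx <- match(dat$Country, continent_map$Country)
dat$Continent <- as.character(continent_map$Continent[idx])

# Countries that are missing from every continent list (illustrative helper)
unmatched <- dat$Country[is.na(idx)]
print(dat)
print(unmatched)
```

If `unmatched` lists countries you expected to find, the continent data frames are probably incomplete or the spellings differ between the two sources.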


Data used:

dat = structure(list(Country = c("Belarus", "Moldova", "Lithuania", 
"Russia", "Romania", "Ukraine"), Consumption = c(17.5, 16.8, 
15.4, 15.1, 14.4, 13.9), Rank = 1:6), .Names = c("Country", "Consumption", 
"Rank"), row.names = c(NA, -6L), class = "data.frame")
europe = structure(list(europe = c("Albania", "Andorra", "Armenia", "Austria", 
"Azerbaijan", "Belarus")), .Names = "europe", row.names = c(NA, 
-6L), class = "data.frame")
asia = structure(list(asia = c("Afghanistan", "Bahrain", "Bangladesh", 
"Bhutan", "Brunei")), .Names = "asia", row.names = c(NA, -5L), class = "data.frame")
Frank
  • Just like every other method, this returns only NAs in the continent column for all the countries, for some reason! I even made sure both columns are of class character, but neither match nor ifelse returns anything other than NAs. – hanif May 27 '16 at 10:35
  • As lmo suggested, try running the code in each of our answers (including the part where the data is read in) to better investigate your problem. This sort of issue (where you and answerers have different results simply because your example data is ambiguous) is why it's recommended to post minimal reproducible examples http://stackoverflow.com/a/28481250/ – Frank May 27 '16 at 12:48

Here is one method using ifelse. I modified your data slightly so you can see that it works for both Asia and Europe:

# get your data; as.is=TRUE keeps Country as character rather than factor
df <- read.table(text="Country Consumption Rank
Belarus        17.5    1
Brunei         16.8    2
Lithuania      15.4    3
Austria        15.1    4
Romania        14.4    5
Ukraine        13.9    6
Bangladesh     24.2    5", header=TRUE, as.is=TRUE)

# one lookup data frame per continent
df.europe <- read.table(text="europe
Albania
Andorra
Armenia
Austria
Azerbaijan
Belarus", header=TRUE, as.is=TRUE)

df.asia <- read.table(text="asia
Afghanistan
Bahrain
Bangladesh
Bhutan
Brunei", header=TRUE, as.is=TRUE)

# use ifelse to get categories; countries in neither list get NA
df$con <- ifelse(df$Country %in% df.europe$europe, "europe",
                 ifelse(df$Country %in% df.asia$asia, "asia", NA))

It is generally a good idea to keep nested ifelse calls to a minimum, but for a dataset of a couple of thousand observations it will be fine.
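
For context on why ifelse is needed here: `if` expects a single TRUE or FALSE, whereas `%in%` returns one value per country. Older versions of R used only the first element (with a warning), and R 4.2.0 and later raise an error. A small sketch of the vectorised behaviour, using hypothetical miniature vectors rather than the full data frames:

```r
# Hypothetical miniature versions of the vectors used above
country <- c("Belarus", "Brunei", "Romania")
europe  <- c("Albania", "Austria", "Belarus")
asia    <- c("Afghanistan", "Bahrain", "Brunei")

# %in% returns one TRUE/FALSE per country, not a single value
in_europe <- country %in% europe
in_asia   <- country %in% asia

# ifelse() works through those logical vectors element by element
con <- ifelse(in_europe, "europe", ifelse(in_asia, "asia", NA))
print(data.frame(country, con))
```

Each country gets its own label here, which is what the original if/else chain could not do.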

lmo
  • Fwiw, they said `asia` and `europe` were data.frames, not vectors. – Frank May 25 '16 at 14:05
  • Yeah, I had it like that originally, but changed it for aesthetics, probably better to keep it more true to the original. Thanks. – lmo May 25 '16 at 14:08
  • Copy and paste all of the code above. You will see that it produces non-NA values. I just tried it with a fresh version of R. Your problem is probably that you are storing the Country variable as a factor. It is easier to work with character variables. This is why I use the as.is=TRUE argument in my `read.table` functions. – lmo May 27 '16 at 11:29
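
A quick way to check the suspicion raised in the last comment is to normalise the Country column before matching. The sketch below is only an illustration on hypothetical data: it converts a factor to character and, as a further assumption not confirmed anywhere in this thread, strips stray spaces around the names with `trimws`.

```r
# Hypothetical data: Country stored as a factor, with stray spaces in two names
dat <- data.frame(Country = factor(c(" Belarus", "Moldova ", "Brunei")))
europe <- c("Albania", "Belarus", "Moldova")

# Raw comparison: the padded names do not match anything
raw_hit <- dat$Country %in% europe

# Convert to character and strip surrounding whitespace before matching
clean <- trimws(as.character(dat$Country))
clean_hit <- clean %in% europe

print(data.frame(raw = raw_hit, cleaned = clean_hit))
```

Running `str(dat)` on the real data shows whether a column is a factor or character, and `setdiff(clean, europe)` lists the names that still fail to match after cleaning.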