I have two dataframes in R, one of them quite big (say 150000 observations with 160 variables) and one smaller (76 observations of 5 variables).
One of the variables in the big dataframe is the country, stored as a string, while the smaller dataframe lists countries together with specific characteristics of each. I want to add a column to the big dataframe for each of those characteristics, so that every observation is linked to the characteristics of its country. However, I have a few problems:
- Not all countries are represented in the smaller dataframe. I want to drop observations in the big dataframe whose country does not appear in the smaller one.
- It seems that I can't use the standard merge function, because the second dataframe is formatted as follows:
Country Var1 Var2 Var3 Var4 Var5
NIC -0.61252 -0.54723 -0.41597 -0.38825 -0.17819
RWA -0.60603 -0.28969 -0.57998 -0.05933 -0.14199
GEO -0.48543 -0.08132 0.56275 -0.25436 0.62782
My first dataframe, on the other hand, is formatted as follows:
CNTRY Var1 Var2 etc
Will I need to write a function for this myself?
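For reference, here is a minimal sketch of the situation using base R's `merge`, which accepts differently named key columns through `by.x` and `by.y` and by default keeps only rows whose key appears in both frames. The frame names `big` and `small`, the toy rows (including the `USA` row), and the suffixes are assumptions made up for illustration; only the column names and the first values are taken from the example above.

```r
# Hypothetical stand-in for the big frame (the real one has ~150000 rows).
big <- data.frame(
  CNTRY = c("NIC", "RWA", "USA", "GEO", "NIC"),
  Var1  = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the small frame, using values from the post.
small <- data.frame(
  Country = c("NIC", "RWA", "GEO"),
  Var1    = c(-0.61252, -0.60603, -0.48543),
  Var2    = c(-0.54723, -0.28969, -0.08132),
  stringsAsFactors = FALSE
)

# Join on CNTRY (big) = Country (small). With the default all = FALSE,
# rows of `big` whose country is missing from `small` are dropped.
# Both frames have a column called Var1, so suffixes tell them apart.
merged <- merge(big, small,
                by.x = "CNTRY", by.y = "Country",
                suffixes = c(".obs", ".country"))

print(merged)
```

In this sketch the `USA` observation disappears from `merged` because it has no match in `small`, and the clashing `Var1` columns come out as `Var1.obs` and `Var1.country`. Whether this matches the real data depends on the country codes being spelled identically in both frames.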