I have a fairly large CSV file with data on exchange-rate volatility. It contains a number of columns: the country (i), the year to which the volatility refers, the volatility (measured as the standard deviation of the logs of the exchange rate), the logarithms of the exchange rate, and the exchange rate itself. My sample includes 152 countries (accordingly, there are 152 columns of the measured volatility, the logs of the exchange rate and the exchange rate). The column headers look like this:
"i" "year" "vol0" "vol1" "vol2" "vol3" "vol4" "lER0" "lER1" "lER2" "lER3" "lER4" "ER0" "ER1" "ER2" "ER3" "ER4"
Now I am faced with the task of computing summary statistics on this data. To present them in a comparable and neat way, I want to group the countries (i), or rather their volatilities, into different "region groups". The regions are defined in another file (for the United States, say, an entry would look like this: country id: 67, region: NAm).
Now to my question: how do I evaluate the data by group? That is, how do I assign the countries to the groups, and then how do I compute the summary statistics per group?
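To make the two steps concrete, here is a minimal sketch of the kind of thing I have in mind, written in Python with pandas purely as an illustration (the tool choice is an assumption, not a requirement). The tiny tables stand in for the two files: the values are made up, only `vol0` is shown, and the region file's column names `i` and `region` are hypothetical. The idea is to join the region onto each country-year row and then aggregate per region.

```python
import pandas as pd

# Toy stand-in for the volatility file (made-up values; the real file
# also has vol1..vol4, lER0..lER4 and ER0..ER4).
vol = pd.DataFrame({
    "i":    [67, 67, 12, 12, 30, 30],
    "year": [2000, 2001, 2000, 2001, 2000, 2001],
    "vol0": [0.10, 0.20, 0.30, 0.50, 0.40, 0.60],
})

# Toy stand-in for the region file. The column names "i" and "region"
# are assumptions; only the pairing 67 -> NAm comes from the question.
regions = pd.DataFrame({
    "i":      [67, 12, 30],
    "region": ["NAm", "Eur", "Eur"],
})

# Step 1: assign each country-year row to its region via a left join on
# the country id. validate= raises an error if a country id appears more
# than once in the region table.
merged = vol.merge(regions, on="i", how="left", validate="many_to_one")

# Step 2: summary statistics of the volatility per region.
summary = merged.groupby("region")["vol0"].agg(
    ["count", "mean", "std", "min", "max"]
)
print(summary)
```

With the real data, the two frames would presumably be read from the files (for example with `pd.read_csv`), and the same `groupby` could be applied to a list of columns such as `["vol0", "vol1"]` instead of a single one.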