
I have 2 questions regarding groups in a dataframe in R.

Suppose I have a dataframe (`df`) like this:

| CONT | COUNTRY | GDP | AVG_GDP |
|------|---------|-----|---------|
| AF   | EGYPT   | 3   | 2       |
| AF   | SUDAN   | 2   | 2       |
| AF   | ZAMBIA  | 1   | 2       |
| AM   | CANADA  | 4   | 5       |
| AM   | MEXICO  | 2   | 5       |
| AM   | USA     | 9   | 5       |
| EU   | FRANCE  | 5   | 4       |
| EU   | ITALY   | 4   | 4       |
| EU   | SPAIN   | 3   | 4       |

First, how can I calculate the average GDP for each continent and store it in the `AVG_GDP` column, so that every row carries its continent's average as in the table above?
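For reference, a minimal base R sketch of this first step, assuming the data are rebuilt by hand from the table above (the `data.frame()` call is only an illustration of the input, not the original data source). It uses `ave()`, which returns one value per row and defaults to the group mean:

```r
# Rebuild the example data from the table, without the AVG_GDP column.
df <- data.frame(
  CONT    = c("AF", "AF", "AF", "AM", "AM", "AM", "EU", "EU", "EU"),
  COUNTRY = c("EGYPT", "SUDAN", "ZAMBIA", "CANADA", "MEXICO", "USA",
              "FRANCE", "ITALY", "SPAIN"),
  GDP     = c(3, 2, 1, 4, 2, 9, 5, 4, 3)
)

# ave() returns a vector as long as GDP in which each element is the
# mean of its CONT group (mean is the default FUN), so it can be
# assigned straight into a new column.
df$AVG_GDP <- ave(df$GDP, df$CONT)

print(df)
```

Because the result has the same length as the input, no merge back onto `df` is needed.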

Second, how can I sum the GDP by continent so that the result looks like this:

| CONT | SUM_GDP |
|------|---------|
| AF   | 6       |
| AM   | 15      |
| EU   | 12      |

For this second question, I think that in base R the second column could be obtained with something like `df$SUM_GDP <- aggregate(df$GDP, by=list(df$CONT), FUN=sum)`, but maybe there is another way that puts the result in a new dataframe.
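A minimal sketch of the new-dataframe route, again assuming the data are rebuilt by hand from the table; the name `sum_df` is hypothetical. `aggregate()` returns a dataframe with one row per group, so its result is stored as its own object rather than as a column of the nine-row `df`:

```r
# Example data rebuilt from the table (only the columns needed here).
df <- data.frame(
  CONT = c("AF", "AF", "AF", "AM", "AM", "AM", "EU", "EU", "EU"),
  GDP  = c(3, 2, 1, 4, 2, 9, 5, 4, 3)
)

# aggregate() with the formula interface: sum GDP within each CONT.
# The result is a new dataframe with one row per continent.
sum_df <- aggregate(GDP ~ CONT, data = df, FUN = sum)

# Rename the summed column to match the desired output.
names(sum_df)[names(sum_df) == "GDP"] <- "SUM_GDP"

print(sum_df)
```

The formula interface keeps the original column name `CONT`, whereas the `by = list(...)` form names the grouping column `Group.1` unless the list elements are named.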

Thank you in advance

Enrique
For the first question: `df$AVG_GDP <- with(df, ave(GDP, CONT))`. For the second, `aggregate` returns a dataframe, so assign its result directly to a new object rather than to a specific column: `df1 <- aggregate(df$GDP, by=list(df$CONT), FUN=sum)`, or, using formula syntax, `df1 <- aggregate(GDP~CONT, df, sum)`. – Ronak Shah May 01 '20 at 01:07
