1

I have a data frame wherein I have a vector with (typically) repeated variables and another column with values. I would like to add a "Total" column to my data frame with the total of values for each variable, e.g. below:

Variable     Value
Chicken      5
Fish         3
Beef         2
Chicken      4
Beef         1
Fish         2

And after running the code that I need help with, get this:

Variable     Value   Total
Chicken      5       9
Fish         3       5
Beef         2       3
Chicken      4       9
Beef         1       3
Fish         2       5

I started with the below, but it is not working... I am just getting a bunch of "NA"s down the column.

df$Total <- sum(df$Value[which(df$Variable == df$Variable)])

Help?

stevenjoe
  • 329
  • 1
  • 4
  • 16

2 Answers2

4

A base-R (ave()) solution (tapply() could be useful too).

dd <- read.table(header=TRUE,text="
Variable     Value
Chicken      5
Fish         3
Beef         2
Chicken      4
Beef         1
Fish         2")

dd$Total <- with(dd,ave(Value,Variable,FUN=sum))

(or)

dd <- transform(dd,Total=ave(Value,Variable,FUN=sum))
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
2

Using dplyr:

library(dplyr)
df %>% group_by(Variable) %>% mutate(Total = sum(Value))
Source: local data frame [6 x 3]
Groups: Variable

  Variable Value Total
1  Chicken     5   9
2     Fish     3   5
3     Beef     2   3
4  Chicken     4   9
5     Beef     1   3
6     Fish     2   5
DatamineR
  • 10,428
  • 3
  • 25
  • 45