1

I have a data frame that looks something like this:

Game Team Value
1      A    0.6
1      B    0.5
2      C    1.2
2      D    0.3
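
For anyone who wants to reproduce this, the table above can be rebuilt roughly as follows. This is only a sketch: the name df is an assumption (the question never names the data frame), chosen because most of the answers use it.

```r
# Hypothetical reconstruction of the example data; the name `df` is assumed
df <- data.frame(
  Game  = c(1, 1, 2, 2),
  Team  = c("A", "B", "C", "D"),
  Value = c(0.6, 0.5, 1.2, 0.3)
)
df
```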

I want to create a new column that holds the difference in Value between the two teams in each game, so it would be:

difference
0.1
-0.1
0.9
-0.9

In other words, I want to group by Game, but I'm not quite sure how to do this.

ekad
WeakLearner
  • @csgillespie How is that a dupe of the linked question? – David Arenburg Apr 27 '16 at 09:42
  • @DavidArenburg The exact question isn't dupe, but the general principle is. Essentially the question is "I want to apply a function to data conditional on a variable". There are lots of questions in the past that have covered this (including timings). – csgillespie Apr 27 '16 at 09:57

4 Answers

6

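A simple base R solution using ave could be the one below. diff returns a single value per game, which ave recycles over both rows; multiplying by c(-1, 1) then flips the sign of the first row in each pair. This assumes every game occupies exactly two adjacent rows.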

with(df, ave(Value, Game, FUN = diff)) * c(-1, 1)
# [1]  0.1 -0.1  0.9 -0.9
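
If the rows of a game are not guaranteed to sit next to each other, a variant that does not depend on row order may be safer. The sketch below is an assumption-laden illustration, not part of the original answer: it rebuilds the example data under the assumed name df (deliberately shuffled), checks that every game has exactly two rows, and subtracts the other team's value with x - rev(x) inside ave.

```r
# Example data under the assumed name `df`, with rows deliberately interleaved
df <- data.frame(
  Game  = c(1, 2, 1, 2),
  Team  = c("A", "C", "B", "D"),
  Value = c(0.6, 1.2, 0.5, 0.3)
)

# The trick only makes sense with exactly two rows per game
stopifnot(all(table(df$Game) == 2))

# Within each game, x - rev(x) is (own value) - (other team's value)
df$difference <- with(df, ave(Value, Game, FUN = function(x) x - rev(x)))
df
```

Because ave writes results back in the original row order, no sorting is needed and the sign pattern does not rely on recycling c(-1, 1) along the rows.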
David Arenburg
3

Provided each group has exactly two elements:

library(data.table)

# uses @DavidArenburg's sign-flip trick;
# c(diff(rev(Value)), diff(Value)) gives the same result

setDT(df)[, difference := diff(Value) * c(-1, 1), by = Game][]
   Game Team Value difference
1:    1    A   0.6        0.1
2:    1    B   0.5       -0.1
3:    2    C   1.2        0.9
4:    2    D   0.3       -0.9
David Arenburg
Colonel Beauvel
1

You can also use a base R solution with aggregate:

c(t(aggregate(Value~Game, df, function(x) c(diff(rev(x)), diff(x)))[, -1]))
# [1]  0.1 -0.1  0.9 -0.9
Roman
1

With dplyr (again assuming two rows per game):

library(dplyr)
df %>%
    group_by(Game) %>%
    mutate(difference = diff(Value)*c(-1,1))
#   Game  Team Value difference
#  (int) (chr) (dbl)      (dbl)
#1     1     A   0.6        0.1
#2     1     B   0.5       -0.1
#3     2     C   1.2        0.9
#4     2     D   0.3       -0.9
akrun