6

My df looks something like this:

ID    Obs    Value
1     1      26
1     2      13
1     3      52
2     1      1,5
2     2      30

Using dplyr, I to add the additional column Col, which is the result of a division of all values in the column value by the group's first value in that column.

ID    Obs    Value    Col
1     1      26       1
1     2      13       0,5
1     3      52       2
2     1      1,5      1
2     2      30       20

How do I do that?

TIm Haus
  • 251
  • 3
  • 8
  • Related [Scale relative to a value in each group (via dplyr)](https://stackoverflow.com/questions/27117429/scale-relative-to-a-value-in-each-group-via-dplyr) – markus Jan 24 '19 at 10:43

2 Answers2

11

After grouping by 'ID', use mutate to create a new column by dividing the 'Value' by the first of 'Value'

library(dplyr)
df1 %>%
   group_by(ID) %>%
   mutate(Col = Value/first(Value))

If the first 'Value' is 0 and we don't want to use it, then subset the 'Value' with a logical expression and then take the first of that

df1 %>%
   group_by(ID) %>%
    mutate(Col = Value/first(Value[Value != 0]))

Or in base R

df1$Col <- with(df1, Value/ave(Value, ID, FUN = head, 1))

NOTE: The comma in 'Value' suggests it is a character column. In that case, it should be first changed to decimal (.) if that is the case, convert to nunmeric and then do the division. It can be done while reading the data

akrun
  • 874,273
  • 37
  • 540
  • 662
  • Is there a way in dplyr to define the value dynamically? There are some groups for which the first value is 0 and I'd want to avoid the INFs by dividing by the first value != 0. – TIm Haus Jan 24 '19 at 10:47
  • 1
    @TImHaus. IN that case, change `mutate` to `mutate(Col = Value/first(Value[Value != 0]))` – akrun Jan 24 '19 at 10:51
2

Or, without creating an additional column:

library(tidyverse)

df = data.frame(ID=c(1,1,1,2,2), Obs=c(1,2,3,1,2), Value=c(26, 13, 52, 1.5, 30))

df %>% 
  group_by(ID) %>% 
  mutate_at('Value', ~./first(.))
#> # A tibble: 5 x 3
#> # Groups:   ID [2]
#>      ID   Obs Value
#>   <dbl> <dbl> <dbl>
#> 1     1     1   1  
#> 2     1     2   0.5
#> 3     1     3   2  
#> 4     2     1   1  
#> 5     2     2  20

### OR ###
df %>% 
  group_by(ID) %>% 
  mutate_at('Value', function(x) x/first(x))
#> # A tibble: 5 x 3
#> # Groups:   ID [2]
#>      ID   Obs Value
#>   <dbl> <dbl> <dbl>
#> 1     1     1   1  
#> 2     1     2   0.5
#> 3     1     3   2  
#> 4     2     1   1  
#> 5     2     2  20

Created on 2020-01-04 by the reprex package (v0.3.0)

abalter
  • 9,663
  • 17
  • 90
  • 145