Subtraction between pairs of lines in R

Question

I have been trying to add a new column to my data frame that holds the difference between the values of one column on consecutive pairs of lines, computed separately within each "sub-data frame" (each "id_n").

My data frame looks like this:

dput(df[1:30,c(2,5,6,9,14,15)])

structure(list(gen_spe = c("holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads", "holo_ads"), ori = c("guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad", "guad"), spe = c("ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads", "ads" ), id_n = c("1_1", "1_1", "1_1", "1_1", "1_10", "1_10", "1_10", "1_10", "1_11", "1_11", "1_11", "1_11", "1_12", "1_12", "1_12", "1_12", "1_13", "1_13", "1_13", "1_13", "1_14", "1_14", "1_14", "1_15", "1_15", "1_15", "1_16", "1_16", "1_16", "1_16"), npu = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 1, 2, 3, 4), duper = c(0.00997, 0.01002, 0.01213, NA, 0.01049, 0.01024, 0.01292, NA, 0.01054, 0.01009, 0.01424, NA, 0.01088, 0.01027, 0.01444, NA, 0.0102, 0.00995, 0.01165, NA, 0.01079, 0.01047, NA, 0.01061, 0.01129, NA, 0.01038, 0.0102, 0.01317, NA)), row.names = c(NA, 30L), class = "data.frame")

So, we have something like:

id_n <- c("1_1","1_1","1_1","1_1","2_1","2_2","2_3","2_4","3_1","3_2")
duper <- c("0.00997","0.01002","0.01213", "NA", "0.01024", "0.01024", "0.01258", "NA", "0.01045", "0.01020")
npu <- c("1", "2", "3", "4", "1", "2", "3", "4", "1", "2")
x <- data.frame(id_n, duper, npu)

I would like R to give me a new column that corresponds to the subtraction of the values of column 'duper', two by two (each value minus the one on the previous line), for each id_n.

For example, for id_n = 1_1: 0.01002-0.00997; 0.01213-0.01002. For id_n = 1_2: 0.01024-0.01024; 0.01258-0.01024. For id_n = 1_3: 0.01061-0.01047. And so on.

I am able to make a list of the 'sub-data frames', to which I would then like to apply a function, but I do not know how to ask R to calculate this. The column 'npu' could be used for ordering, as its values go from 1 to ... for each id_n.

Do you have some ideas?

Thank you very much,

Marine.

Please do not post photos of data or code! If you do, people who are willing to help you would have to type out all that text. Instead provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) P.S. Here is [a good overview on how to ask a good question](https://stackoverflow.com/help/how-to-ask) — dario, Oct 20 '21 at 11:59
Please just provide the dput(head(name of your dataset)) in order to help you — 12666727b9, Oct 20 '21 at 12:05
Sorry, I added some information and simple code to explain what I try to do. I hope this is fine! — Marine, Oct 20 '21 at 12:25

denisafonin · Answer 1 · 2021-10-20T12:21:06.400

Something like this (using tidyverse library):

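# Run once if the package is not installed yet:
# install.packages("tidyverse")
library(tidyverse)

# duper was created as character in the example, so convert it to numeric
# (the string "NA" becomes a real missing value)
x$duper <- as.numeric(x$duper)

# Within each id_n, subtract the previous line's value from the current one;
# the first line of each group has no predecessor and gets NA
x %>%
  group_by(id_n) %>%
  mutate(new_col = duper - lag(duper))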

Returns this result:

# A tibble: 10 x 4
# Groups:   id_n [7]
   id_n     duper npu      new_col
   <chr>    <dbl> <chr>      <dbl>
 1 1_1    0.00997 1     NA        
 2 1_1    0.0100  2      0.0000500
 3 1_1    0.0121  3      0.00211  
 4 1_1   NA       4     NA        
 5 2_1    0.0102  1     NA        
 6 2_2    0.0102  2     NA        
 7 2_3    0.0126  3     NA        
 8 2_4   NA       4     NA        
 9 3_1    0.0104  1     NA        
10 3_2    0.0102  2     NA
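
If you would rather not load the tidyverse, the same idea can probably be sketched in base R with ave() and diff(). This is not from the answer above, just an alternative under a few assumptions: the sample values are the first eight rows of the dput() output in the question, duper is already numeric, and the rows are sorted by npu within each id_n (the order() call enforces that).

```r
# First eight rows of the question's dput() output (id_n, npu, duper only)
x <- data.frame(
  id_n  = c("1_1", "1_1", "1_1", "1_1", "1_10", "1_10", "1_10", "1_10"),
  npu   = c(1, 2, 3, 4, 1, 2, 3, 4),
  duper = c(0.00997, 0.01002, 0.01213, NA, 0.01049, 0.01024, 0.01292, NA)
)

# Make sure lines are in measurement order inside each id_n
x <- x[order(x$id_n, x$npu), ]

# For each id_n, diff() gives value[i] - value[i - 1];
# pad with NA so the result has one entry per line
x$new_col <- ave(
  x$duper, x$id_n,
  FUN = function(v) c(NA_real_, diff(v))
)

print(x)
```

As with lag(), the first line of each id_n gets NA, and any difference involving a missing duper value is NA as well.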

Thank you denisafonin ! For the whole data frame, I have to write it like this : `df <- df %>% group_by(id_n) %>% mutate(new_col = duper - lag(duper))` — Marine, Oct 20 '21 at 13:11
