I am new to R, and so would greatly appreciate more explanation for any code you might have to help solve my issue.
I have a data.frame with groups of columns related to each other and I want to perform a calculation on each of those many groups to get new output columns. For example, many biological replicates in an experiment where I want to perform the calculations on each replicate independently before collapsing them.
I know I could use mutate in dplyr to create new columns, but I am not sure how to do this in a loop or how to use lapply type strategy to avoid re-tying the columns names each time. My biggest issue is understanding how to convert the column names into something usable by one of these strategies.
For example:
> A.1 <- c(11,12,13,4,15,6,17,18)
> A.2 <- c(2,4,5,5,19,7,5,1)
>
> B.1 <- c (3,4,5,1,31,76,13,70)
> B.2 <- c (10,9,8,15,31,12,13,12)
>
> C.1 <- c(1,2,3,4,5,6,7,8)
> C.2 <- c(2,4,5,8,10,12,15,18)
>
> df <- data.frame(A.1, A.2, B.1, B.2, C.1, C.2)
>
> df
A.1 A.2 B.1 B.2 C.1 C.2
1 11 2 3 10 1 2
2 12 4 4 9 2 4
3 13 5 5 8 3 5
4 4 5 1 15 4 8
5 15 19 31 31 5 10
6 6 7 76 12 6 12
7 17 5 13 13 7 15
8 18 1 70 12 8 18
>
Where I want to create new columns where A.new = A.1/A.2, B.new = B.1/B.2 etc. without typing out each column name explicitly. Also note the "A" and "B" are really character strings, so typing all of them would be very messy and time consuming.
Something like this, but a general case for many column groups:
> df <- df %>% mutate(A.new = A.1/A.2)
> df <- df %>% mutate(B.new = B.1/B.2)
> df <- df %>% mutate(C.new = C.1/C.2)
>
> df
A.1 A.2 B.1 B.2 C.1 C.2 A.new B.new C.new
1 11 2 3 10 1 2 5.5000000 0.30000000 0.5000000
2 12 4 4 9 2 4 3.0000000 0.44444444 0.5000000
3 13 5 5 8 3 5 2.6000000 0.62500000 0.6000000
4 4 5 1 15 4 8 0.8000000 0.06666667 0.5000000
5 15 19 31 31 5 10 0.7894737 1.00000000 0.5000000
6 6 7 76 12 6 12 0.8571429 6.33333333 0.5000000
7 17 5 13 13 7 15 3.4000000 1.00000000 0.4666667
8 18 1 70 12 8 18 18.0000000 5.83333333 0.4444444
>
I don't see the answer to my question on here already, but if you could point me to existing answers that would be much appreciated! I am currently thinking about the column names as containing the variable, but maybe that is not the correct way to approach this (R is also the first programming language I am learning) and so my searches for answers haven't yielded much yet.
Thank you for your guidance in advance!