I have a dataframe, df
:
df <- structure(list(ID = c("ID1", "ID2", "ID3", "ID4", "ID5", "ID6"
), val1 = c(0, 1, 0, 0, 0, 0), sig1 = c(0.76, 0.15, 0.35, 0.66,
0.7, 0.72), val2 = c(0.25, 0.81, 0.21, 0.06, 0.18, NA), sig2 = c(0.65,
0.09, 0.55, 0.88, 0.09, NA), val3 = c(0.26, 0.24, 0.16, -0.1,
0.22, NA), sig3 = c(0.64, 0.62, 0.65, 0.8, 0.04, NA)), row.names = c(NA,
6L), class = "data.frame")
> df
ID val1 sig1 val2 sig2 val3 sig3
1 ID1 0 0.76 0.25 0.65 0.26 0.64
2 ID2 1 0.15 0.81 0.09 0.24 0.62
3 ID3 0 0.35 0.21 0.55 0.16 0.65
4 ID4 0 0.66 0.06 0.88 -0.10 0.80
5 ID5 0 0.70 0.18 0.09 0.22 0.04
6 ID6 0 0.72 NA NA NA NA
This data frame contains 3 pairs of data for each ID number - val1
and sig1
, val2
and sig2
and val3
and sig3
.
I want to add an additional 2 columns. I want the first column (first_val
) to state the name of the first "val
" column where the associated "sig
" column is less than 0.5. In the second column (first_val_value
), I want to paste the associated val1
or val2
value for that ID.
So for example -
For ID1
, none of the sig
values are less than 0.5, so the first_val
column entry should be "-".
For ID2
, sig1
is less than 0.5, so the first_val
column entry should be "val1".
For ID3
, sig1
is less than 0.5, so the first_val
column entry should be "val1".
For ID4
, none of the sig
values are less than 0.5, so the first_val
column entry should be "-".
For ID5
, sig2
is less than 0.5, so the first_val
column entry should be "val2".
For ID6
, none of the sig
values are less than 0.5, so the first_val
column entry should be "-".
So my desired dataframe would look like this:
> df
ID val1 sig1 val2 sig2 val3 sig3 first_val first_val_value
1 ID1 0 0.76 0.25 0.65 0.26 0.64 - -
2 ID2 1 0.15 0.81 0.09 0.24 0.62 val1 1
3 ID3 0 0.35 0.21 0.55 0.16 0.65 val1 0
4 ID4 0 0.66 0.06 0.88 -0.10 0.80 - -
5 ID5 0 0.70 0.18 0.09 0.22 0.04 val2 0.18
6 ID6 0 0.72 NA NA NA NA - -
What is the best way to construct this dataframe?