I have a dataset with an ID variable and thousands of columns of averages. A reproducible example is below. For each ID, I would like to find the name of the column whose value is closest to 0.50. If two columns are equally close, I want the one with the lower value. Is there an efficient way to do this (preferably using dplyr or data.table)?
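# 10 IDs and 2,000 value columns: rnorm(20000) fills a 10 x 2000 matrix,
# so the ID vector must also have 10 entries for data.frame() to accept it
df = data.frame(ID = paste("ID", 1:10, sep = ""),
                matrix(rnorm(20000), nrow = 10))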
> df[1:5, 1:5]
ID X1 X2 X3 X4
1 ID1 -0.5532944 -1.20671805 0.75142048 0.56022595
2 ID2 -1.0083010 -0.01534611 1.53546691 -0.08762588
3 ID3 -0.1606776 -0.96947669 -0.38631278 -1.15647134
4 ID4 -0.5957471 -0.20918120 -0.05246698 -0.84235789
5 ID5 0.1569595 -0.62460245 -0.39454014 0.91089249
My goal is a data frame with the ID variable, the name of the column whose value is closest to 0.5 (T), and that value (P). For example (illustrative values):
ID T P
1 ID1 X10 0.5671
2 ID2 X100 0.4999
3 ID3 X34 0.5877
4 ID4 X21 0.5055
5 ID5 X15 0.4987
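
To pin down the rule, including the tie-break, here is a minimal base R sketch. It is not the dplyr or data.table solution asked for and is not tuned for speed. The names `closest_to` and `toy` are hypothetical, introduced only for illustration, and the sketch assumes every non-ID column is numeric.

```r
# Hypothetical helper: for each row, pick the column closest to `target`.
# Ties in distance are broken by taking the lower value.
closest_to <- function(df, target = 0.5, id_col = "ID") {
  # Assumes all columns other than `id_col` are numeric
  m <- as.matrix(df[, setdiff(names(df), id_col), drop = FALSE])
  idx <- vapply(seq_len(nrow(m)), function(i) {
    # Sort by distance to target first, then by the value itself
    order(abs(m[i, ] - target), m[i, ])[1]
  }, integer(1))
  data.frame(ID = df[[id_col]],
             T  = colnames(m)[idx],
             P  = m[cbind(seq_len(nrow(m)), idx)])
}

# Toy data with an exact tie in the first row (0.75 and 0.25 are both 0.25 away)
toy <- data.frame(ID = c("ID1", "ID2"),
                  X1 = c(0.75, 0.5),
                  X2 = c(0.25, 1.0),
                  X3 = c(0.0, 0.0))
res <- closest_to(toy)
# ID1 -> X2 (0.25, the lower of the two tied values); ID2 -> X1 (0.5)
```

Calling `closest_to(df)` on the example data above would return one row per ID in the desired ID / T / P layout.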