I have a data frame containing Illumina 450K methylation beta values for approx. 450 probes in two samples. The data are stored in three columns (long format, one row per sample/probe pair) and look like this:
> head(ICGC)
  submitted_sample_id   probe_id methylation_value
1          X932-01-4D cg00000029               0.6
2          X932-01-6D cg00000029               0.4
3          X932-01-4D cg00000108               0.3
4          X932-01-6D cg00000108               0.7
5          X932-01-4D cg00000109               0.9
6          X932-01-6D cg00000109               0.1
I would like to rearrange this data frame so that the probe IDs become the row names and the sample IDs become the column names, like this:
> head(ICGC_2)
           X932-01-4D X932-01-6D
cg00000029        0.6        0.4
cg00000108        0.3        0.7
cg00000109        0.9        0.1
I have tried:
library(tidyverse)
ICGC_2 <- ICGC %>% remove_rownames %>% column_to_rownames(var = "probe_id")
But this didn't work: row names must be unique, and each probe ID appears twice in the probe_id column of ICGC (once per sample). I also tried:
hello <- data.frame(ICGC[,-2], row.names = ICGC[,2])
But this had the same problem. I want to rearrange the data this way because I would like to convert the beta values to M-values and pass the result as the object in cpg.annotate (available through the Bioconductor package DMRcate). cpg.annotate requires the object to have unique Illumina probe IDs as row names and unique sample IDs as column names.
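To make the goal concrete, here is a minimal sketch of the transformation being described, run on the six example rows above. It assumes that tidyr's pivot_wider() is an acceptable way to go from long to wide format, and that M-values are computed with the usual logit transform, M = log2(beta / (1 - beta)). The object names `wide`, `beta` and `m_values` are only illustrative, and the result has not been checked against cpg.annotate itself.

```r
library(tidyr)   # pivot_wider()
library(tibble)  # column_to_rownames()

# Rebuild the six example rows shown above (a stand-in for the real data).
ICGC <- data.frame(
  submitted_sample_id = rep(c("X932-01-4D", "X932-01-6D"), times = 3),
  probe_id            = rep(c("cg00000029", "cg00000108", "cg00000109"), each = 2),
  methylation_value   = c(0.6, 0.4, 0.3, 0.7, 0.9, 0.1),
  stringsAsFactors    = FALSE
)

# Long -> wide: one row per probe, one column per sample.
wide <- pivot_wider(
  ICGC,
  names_from  = submitted_sample_id,
  values_from = methylation_value
)

# Probe IDs are unique after widening, so they can now become row names.
wide <- column_to_rownames(as.data.frame(wide), var = "probe_id")

# Numeric matrix with probe IDs as row names and sample IDs as column names.
beta <- as.matrix(wide)

# Beta -> M-value (logit transform, base 2).
# Note: beta values of exactly 0 or 1 would give -Inf / Inf here.
m_values <- log2(beta / (1 - beta))
```

The reshaping has to come before column_to_rownames(), because only the wide table has one row per probe. If a sample/probe pair occurred more than once in the real data, pivot_wider() would warn and return list-columns, so duplicates would need to be resolved first.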
Thank you!