I am trying to reshape some data but have run into a problem, and I would greatly appreciate any advice or suggestions.
Background: I measured the values of three genes (FTH1, TFR1, VEGF) on a sample called A three times. Some of the gene measurements from the third run were not recorded, which is why some genes have two values while others have three. The data in long form are shown below:
  Sample Gene  Value
1 A      FTH1 19.287
2 A      FTH1 18.411
3 A      TFR1 21.536
4 A      TFR1 22.528
5 A      TFR1 20.255
6 A      VEGF 14.414
7 A      VEGF 14.009
I would like to reshape this data into the following wide format for easier downstream analysis:
Sample   FTH1   TFR1   VEGF
A      19.287 21.536 14.414
A      18.411 22.528 14.009
A         N/A 20.255    N/A
What would be the best way to go about reformatting the data into the form above?
I tried using dcast from reshape2, as below:
library(reshape2)
library(tidyverse)
data = read.csv("data.csv")
dcast(data, Sample ~ Gene, value = "Value")
but was met with the following error:
Aggregation function missing: defaulting to length
Error in .fun(.value[0], ...) :
2 arguments passed to 'length' which requires 1
I think this is happening because some genes (i.e. FTH1 and VEGF) have two entries whereas TFR1 has three, but I'm not 100% sure. Any advice on how to accomplish this reshape would be greatly appreciated!
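For reference, here is a minimal sketch of one way the target layout might be produced, using base R only. It rests on a few assumptions: the data are typed inline rather than read from data.csv, the rows within each gene are in run order, and the helper column `Rep` is a hypothetical name introduced here to number the replicates. The idea is that `Sample` alone does not identify a unique row in the wide table, so a per-gene replicate index is added first and used as part of the row identifier.

```r
# Inline copy of the long-form data (assumption: same columns as data.csv)
data <- data.frame(
  Sample = "A",
  Gene   = c("FTH1", "FTH1", "TFR1", "TFR1", "TFR1", "VEGF", "VEGF"),
  Value  = c(19.287, 18.411, 21.536, 22.528, 20.255, 14.414, 14.009)
)

# Number the measurements within each Sample/Gene group: 1, 2, (3).
# "Rep" is a hypothetical helper column, not part of the original data.
data$Rep <- ave(seq_along(data$Value), data$Sample, data$Gene, FUN = seq_along)

# Spread Gene into columns; each Sample/Rep pair becomes one row.
# Missing Sample/Rep/Gene combinations are filled with NA.
wide <- reshape(data,
                idvar     = c("Sample", "Rep"),
                timevar   = "Gene",
                direction = "wide")

# reshape() names the new columns "Value.FTH1" etc.; strip the prefix
names(wide) <- sub("^Value\\.", "", names(wide))

# Drop the helper column and reset the row names
wide$Rep <- NULL
rownames(wide) <- NULL

print(wide)
```

If reshape2 is preferred, the same replicate index should also work with dcast, along the lines of `dcast(data, Sample + Rep ~ Gene, value.var = "Value")`. Note that the argument is named `value.var`; as far as I can tell, an unrecognised `value = "Value"` is passed through to the aggregation function, which would explain the "2 arguments passed to 'length'" message.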