I have a data.table with a factor column that I would like to manipulate arithmetically. Each entry holds three ratios separated by "/". I would like to sum the three values on the left side of the ratios, sum the three values on the right side, and then return the two sums as a single ratio. It's tricky to explain, but if I have this as part of a data table:
FattyAcid
1 4:0/16:0/16:0
2 16:0/16:0/18:1
3 18:1/14:0/18:1
I would then like the data table to contain:
FattyAcid Assignment
1 4:0/16:0/16:0 36:0
2 16:0/16:0/18:1 50:1
3 18:1/14:0/18:1 50:2
i.e. for entry 1, (4 + 16 + 16):(0 + 0 + 0) = 36:0
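The arithmetic above can be sketched in base R as follows. This is only an illustration: `sum_ratio` is a hypothetical helper name, and it assumes every entry has exactly three "carbons:bonds" pairs separated by "/", optionally wrapped in brackets.

```r
# Hypothetical helper: sum the left and right sides of "a:b/c:d/e:f".
sum_ratio <- function(x) {
  # as.character() handles factors; gsub() drops optional brackets
  x <- gsub("[()]", "", as.character(x))
  # split into the three fatty acids, then split each one on ":"
  parts <- strsplit(strsplit(x, "/", fixed = TRUE)[[1]], ":", fixed = TRUE)
  carbons <- sum(as.numeric(vapply(parts, `[`, character(1), 1)))
  bonds   <- sum(as.numeric(vapply(parts, `[`, character(1), 2)))
  paste(carbons, bonds, sep = ":")
}

fatty_acids <- c("4:0/16:0/16:0", "16:0/16:0/18:1", "18:1/14:0/18:1")
assignment <- vapply(fatty_acids, sum_ratio, character(1), USE.NAMES = FALSE)
print(assignment)
```

Applied to the three example rows, this gives 36:0, 50:1 and 50:2, matching the table above.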
When I call str() on the dataset, it shows the relevant column as: "Factor w/ 179 levels "(10:0/10:0/12:0)",..: 112 104 114 33 61 115 106 30 60 66 ..."
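The integers in that str() output are the factor's internal level codes, not the text itself, so the column has to go through as.character() before any string handling. A small sketch with two made-up entries (the vector `f` is illustrative only, not taken from the real data):

```r
# Two illustrative entries stored as a factor, brackets included
f <- factor(c("(10:0/10:0/12:0)", "(4:0/16:0/16:0)"))

codes <- as.integer(f)      # internal level codes, one integer per entry
text  <- as.character(f)    # the original strings

# Strip the leading and trailing brackets from the strings
cleaned <- gsub("[()]", "", text)
print(cleaned)
```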
EDIT: I've found a solution, but it's not elegant. I separate the values using tstrsplit() and store the pieces in new columns, which eventually gives six columns (carbons and double bonds for each of the three fatty acids). I then convert these from character to numeric, sum the relevant columns, and paste the two sums together. Finally, I delete the intermediate columns. I'm sure there's a better way, but it works :)
library(data.table)
### split the fatty acid factor into three columns separated by "/", i.e. the individual fatty acids
### then remove the leading and trailing brackets
setDT(LipidDataShortest)[, paste0("FattyAcid", 1:3) := tstrsplit(FattyAcid, "/")]
### strip brackets only from the three new columns, so "FattyAcid" and all other columns keep their values and types
bracketCols <- paste0("FattyAcid", 1:3)
setDT(LipidDataShortest)[, (bracketCols) := lapply(.SD, gsub, pattern = "[()]", replacement = ""), .SDcols = bracketCols]
### split up the specific fatty acids into number of carbons and number of double bonds
setDT(LipidDataShortest)[, paste0("FattyAcidOne", 1:2) := tstrsplit(FattyAcid1, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidTwo", 1:2) := tstrsplit(FattyAcid2, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidThree", 1:2) := tstrsplit(FattyAcid3, ":")]
### convert from character to numeric
numCols <- c(paste0("FattyAcidOne", 1:2), paste0("FattyAcidTwo", 1:2), paste0("FattyAcidThree", 1:2))
setDT(LipidDataShortest)[, (numCols) := lapply(.SD, as.numeric), .SDcols = numCols]
### combine the columns to get total carbons and create new column for that, then repeat for alkenes
setDT(LipidDataShortest)[, Carbons1 := FattyAcidOne1 + FattyAcidTwo1 + FattyAcidThree1]
setDT(LipidDataShortest)[, DoubleBonds1 := FattyAcidOne2 + FattyAcidTwo2 + FattyAcidThree2]
### combine final assignments into new column and delete the unnecessary columns used to get to this point
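setDT(LipidDataShortest)[, Assignment := paste(Carbons1, DoubleBonds1, sep = ":")]
### drop the intermediate columns by name rather than by position (these are columns 10:20 in my table)
tmpCols <- c(paste0("FattyAcid", 1:3), paste0("FattyAcidOne", 1:2), paste0("FattyAcidTwo", 1:2), paste0("FattyAcidThree", 1:2), "Carbons1", "DoubleBonds1")
LipidDataShortest[, (tmpCols) := NULL]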
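A possibly more compact route avoids the intermediate columns by splitting on "/" and ":" in a single tstrsplit() call. This is a sketch rather than a tested drop-in: `dt` is a toy stand-in for LipidDataShortest, the names `pieces`, `carbons` and `bonds` are illustrative, and it assumes every entry has exactly three fatty acids.

```r
library(data.table)

# Toy stand-in for LipidDataShortest; the real table has more columns.
dt <- data.table(
  FattyAcid = factor(c("(4:0/16:0/16:0)", "(16:0/16:0/18:1)", "(18:1/14:0/18:1)"))
)

# Strip brackets, then split on both "/" and ":" in one pass.
# Each row yields six pieces: carbons1, bonds1, carbons2, bonds2, carbons3, bonds3.
pieces <- tstrsplit(gsub("[()]", "", as.character(dt$FattyAcid)),
                    "[/:]", type.convert = TRUE)

# Odd positions are carbon counts, even positions are double-bond counts.
carbons <- Reduce(`+`, pieces[c(1, 3, 5)])
bonds   <- Reduce(`+`, pieces[c(2, 4, 6)])

# Add the result by reference; FattyAcid itself is left untouched.
dt[, Assignment := paste(carbons, bonds, sep = ":")]
print(dt)
```

Because nothing but Assignment is ever added to the table, there are no temporary columns to delete afterwards.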