My data.table has a factor column that I would like to manipulate arithmetically. Each entry holds three ratios separated by "/". I would like to sum the three values on the left side of the ratios, sum the three values on the right side, and then return the two sums as a new ratio. It's tricky to explain, but if I have this as part of a data table:

     FattyAcid
1    4:0/16:0/16:0
2    16:0/16:0/18:1
3    18:1/14:0/18:1

I would then like to return this in the data table:

     FattyAcid        Assignment
1    4:0/16:0/16:0    36:0
2    16:0/16:0/18:1   50:1
3    18:1/14:0/18:1   50:2

i.e. for entry 1, (4 + 16 + 16):(0 + 0 + 0) = 36:0

When I call str() on the dataset, it shows the relevant column as: `Factor w/ 179 levels "(10:0/10:0/12:0)",..: 112 104 114 33 61 115 106 30 60 66 ...`

EDIT: I've found a solution, but it's not elegant. Basically I separate the values using tstrsplit() and put them into new columns, which eventually generates six columns. Then I convert them from character to numeric, sum the relevant columns, and paste the two sums together. Finally I delete the intermediate columns. I'm sure there's a better way, but I guess it works :)

### split up the fatty acid factors into three columns separated by "/"     i.e. individual ID'd fatty acids.
### also remove the starting and trailing brackets
setDT(LipidDataShortest)[, paste0("FattyAcid", 1:3) := tstrsplit(FattyAcid, "/")]
LipidDataShortest <- as.data.table(sapply(LipidDataShortest, gsub, pattern="[(]", replacement = ""))
LipidDataShortest <- as.data.table(sapply(LipidDataShortest, gsub, pattern="[)]", replacement = ""))

### small issue: this also removes the brackets from the "FattyAcid" column, and converts every column to character. Is there a way to remove them only from specific columns?

### split up the specific fatty acids into number of carbons and number of double bonds
setDT(LipidDataShortest)[, paste0("FattyAcidOne", 1:2) := tstrsplit(FattyAcid1, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidTwo", 1:2) := tstrsplit(FattyAcid2, ":")]
setDT(LipidDataShortest)[, paste0("FattyAcidThree", 1:2) := tstrsplit(FattyAcid3, ":")]

### convert from character to numeric
LipidDataShortest$FattyAcidOne1 <- as.numeric(LipidDataShortest$FattyAcidOne1)
LipidDataShortest$FattyAcidOne2 <- as.numeric(LipidDataShortest$FattyAcidOne2)
LipidDataShortest$FattyAcidTwo1 <- as.numeric(LipidDataShortest$FattyAcidTwo1)
LipidDataShortest$FattyAcidTwo2 <- as.numeric(LipidDataShortest$FattyAcidTwo2)
LipidDataShortest$FattyAcidThree1 <- as.numeric(LipidDataShortest$FattyAcidThree1)
LipidDataShortest$FattyAcidThree2 <- as.numeric(LipidDataShortest$FattyAcidThree2)

### combine the columns to get total carbons and create new column for that, then repeat for alkenes
setDT(LipidDataShortest)[, Carbons1 := FattyAcidOne1 + FattyAcidTwo1 + FattyAcidThree1]
setDT(LipidDataShortest)[, DoubleBonds1 := FattyAcidOne2 + FattyAcidTwo2 + FattyAcidThree2]

### combine final assignments into new column and delete the unnecessary columns used to get to this point
LipidDataShortest$Assignment <- paste(LipidDataShortest$Carbons1, LipidDataShortest$DoubleBonds1, sep = ":")
LipidDataShortest <- LipidDataShortest[, -c(10:20)]
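
For comparison, the same steps can probably be collapsed into a single `:=` call that never creates the intermediate columns. This is only a minimal sketch under some assumptions: the toy table `dt` is made up to mirror the example above (with brackets added to one entry, as in the `str` output), every entry is assumed to hold exactly three `carbons:doublebonds` pairs, and the brackets are stripped from a temporary copy so the `FattyAcid` column itself is left untouched.

```r
library(data.table)

# Toy data mirroring the example table (an assumption, not the real dataset).
# The third entry is wrapped in brackets, like the levels shown by str().
dt <- data.table(FattyAcid = factor(c("4:0/16:0/16:0",
                                      "16:0/16:0/18:1",
                                      "(18:1/14:0/18:1)")))

dt[, Assignment := {
  # Work on a character copy with the brackets removed; FattyAcid is unchanged.
  clean <- gsub("[()]", "", as.character(FattyAcid))
  # Split on both "/" and ":" to get six pieces per row, then make them numeric.
  # Pieces 1, 3, 5 are carbons; pieces 2, 4, 6 are double bonds.
  parts <- lapply(tstrsplit(clean, "[/:]"), as.numeric)
  carbons <- Reduce(`+`, parts[c(1, 3, 5)])
  double_bonds <- Reduce(`+`, parts[c(2, 4, 6)])
  paste(carbons, double_bonds, sep = ":")
}]

# dt$Assignment is now "36:0" "50:1" "50:2"
```

Because the bracket removal happens inside the `:=` expression, no other column is converted to character, and there is nothing to delete afterwards.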
heds1
  • Please provide a reproducible example. In general using `file.choose()` is not a good way to reproduce any problem. – Adam Quek May 08 '17 at 04:21
  • It seems like you may be confusing (or not clearly distinguishing) between an R data type `factor` and an explanatory variable in your data. This type of question really needs to see sample data in addition to your code, otherwise everything is speculation. You are close to a [reproducible question](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), just a little more effort and you'll be "close enough". – r2evans May 08 '17 at 05:43
  • Thanks guys. I've edited the original post to show what I hope is a reproducible example with a small data table. r2evans, you're right, I'm sure I am confusing R data type factors and explanatory variables! I've come up with a solution but I'm sure there's a better way, but hey, it works! – heds1 May 08 '17 at 21:10
