I have a key and a massive metadata table. There's a column in the metadata table which contains values such as the following:
body_site
Lung
Lung
Brain - Amygdala
Brain - Amygdala
Brain - Caudate (basal ganglia)
Brain - Caudate (basal ganglia)
Lung
Lung
Skin - Sun Exposed (Lower leg)
Skin - Sun Exposed (Lower leg)
Brain - Spinal cord (cervical c-1)
Brain - Spinal cord (cervical c-1)
with body_site as a header. The key looks like this:
Tissue,Key
Adipose - Subcutaneous,ADPSBQ
Adipose - Visceral (Omentum),ADPVSC
Adrenal Gland,ADRNLG
Artery - Aorta,ARTAORT
Artery - Coronary,ARTACRN
Artery - Tibial,ARTTBL
Bladder,BLDDER
Brain - Amygdala,BRNAMY
Brain - Anterior cingulate cortex (BA24),BRNACC
It's a CSV of the corresponding abbreviation for each tissue type. What I want to do is replace every entry in the metadata table's body_site column with the matching abbreviation from the key's second column (Key).
The problem is that when I follow the advice of the highly popular post that demonstrates how to do this, I end up with a table that only has values in the body_site column. The replacement itself works, but every other column is emptied: the headers survive, and all of the other data is gone.
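Here is a toy reproduction of the symptom, using base R data frames instead of my real files. The `Run` column and its values are placeholders I made up for illustration; only `body_site`, `Tissue` and `Key` come from my actual tables. As far as I can tell, `match()` returns NA for any value that is not a tissue name, so every non-tissue column ends up as NA.

```r
# Toy stand-ins for the real tables; the Run column and its values are hypothetical.
sratabl <- data.frame(
  Run = c("run1", "run2"),
  body_site = c("Brain - Amygdala", "Bladder"),
  stringsAsFactors = FALSE
)
tiskey <- data.frame(
  Tissue = c("Bladder", "Brain - Amygdala"),
  Key = c("BLDDER", "BRNAMY"),
  stringsAsFactors = FALSE
)

new <- sratabl
# The lookup is applied to every column. match() gives NA for values that are
# not in tiskey$Tissue, so the Run column is overwritten with NA.
new[] <- lapply(sratabl, function(x) tiskey$Key[match(x, tiskey$Tissue)])
print(new)
```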
Here's what my code looks like. I included both solutions offered by the top answerer, both of which I tried.
library("data.table")
args = commandArgs(trailingOnly=TRUE)
# SraRunTable.txt is args[1]
#sratabl <- fread(args[1])
sratabl <- fread("SraRunTable.txt")
tiskey <- fread("GTExTissueKey.csv")
# current directory is args [2]
# Attempt 1: loop over every column with lapply and match its values against the lookup table.
new <- sratabl # copy of the metadata table to hold the result
new[] <- lapply(sratabl, function(x) tiskey$Key[match(x, tiskey$Tissue)])
# Attempt 2: start again from the original table and match all values in one vectorised call.
new <- sratabl
new[] <- tiskey$Key[match(unlist(sratabl), tiskey$Tissue)]
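For comparison, this is a minimal sketch of the result I am actually after, again on toy base R data frames. The `Run` column and its values are invented for illustration, and keeping the original label when a tissue is missing from the key is my own assumption about what should happen, not something the linked post prescribes. The lookup is restricted to body_site, so the other columns are left alone.

```r
# Toy stand-ins for the real tables; the Run column and its values are hypothetical.
sratabl <- data.frame(
  Run = c("run1", "run2"),
  body_site = c("Brain - Amygdala", "Lung"),
  stringsAsFactors = FALSE
)
tiskey <- data.frame(
  Tissue = c("Bladder", "Brain - Amygdala"),
  Key = c("BLDDER", "BRNAMY"),
  stringsAsFactors = FALSE
)

# Look up only the body_site column; every other column is untouched.
idx <- match(sratabl$body_site, tiskey$Tissue)

# Assumption: tissues missing from the key keep their original label
# rather than becoming NA.
sratabl$body_site <- ifelse(is.na(idx), sratabl$body_site, tiskey$Key[idx])

# On a data.table (as returned by fread), the by-reference form would be:
# sratabl[, body_site := tiskey$Key[match(body_site, tiskey$Tissue)]]
print(sratabl)
```

Note that the commented data.table one-liner does not include the fallback, so unmatched tissues would become NA there.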