I have a matrix extracted with R that contains many NA values (values that were not found in the text during extraction).
I also have a reference CSV file that contains all possible combinations of the same data.
This is part of my matrix.
Data matrix:
  OMIM     GENES_SYMBOL GENES      CHROMOSOME
1 (NA)     (arlts1)     (perforin) (NA)
2 (NA)     (mtr)        (NA)       (NA)
3 (325410) (NA)         (NA)       (NA)
4 (NA)     (t341c)      (NA)       (5)
This is what the CSV (dictionary) matrix looks like.
Dictionary matrix:
OMIM GENES_SYMBOL GENES CHROMOSOME
"612367" "alpqtl2" anorexia nervosa,a 1" 1
"606788" "arlts1" basal cell carcinoma, susceptibility to, 3
"325410" "bcc1" bone mineral density qtl 3 10
I want to map the first matrix against the second one to fill in all the equivalent values and get rid of the NAs.
The problem is that the matrices do not have the same number of rows (the second is much larger than the first),
and the rows are not in the same order: the 1st row of the data matrix
can be row number 500 in the dictionary matrix.
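To make the goal concrete, here is a minimal sketch of the kind of fill I am after, on toy data that mirrors the excerpts above. The names `dict` and `fill_row` are hypothetical, the dictionary values marked as placeholders are invented for illustration, and the sketch assumes that values are only compared within the same column and that the first best-matching dictionary row is the right one.

```r
# Toy data mirroring the excerpts above (NOT the real files).
cols <- c("OMIM", "GENES_SYMBOL", "GENES", "CHROMOSOME")
datamatrix <- matrix(
  c(NA,       "arlts1", "perforin", NA,
    "325410", NA,       NA,         NA),
  ncol = 4, byrow = TRUE, dimnames = list(NULL, cols)
)
# Hypothetical dictionary: the last two columns are placeholder values.
dict <- matrix(
  c("606788", "arlts1", "placeholder name A", "chrA",
    "325410", "bcc1",   "placeholder name B", "chrB"),
  ncol = 4, byrow = TRUE, dimnames = list(NULL, cols)
)

# Fill the NAs of one data row from the dictionary row that agrees with it
# in the largest number of columns (assumption: same-column comparison).
fill_row <- function(rowi, dict) {
  # Each column of t(dict) is one dictionary row; NA comparisons are ignored.
  hits <- colSums(t(dict) == rowi, na.rm = TRUE)
  if (max(hits) == 0) return(rowi)   # no match: leave the row unchanged
  best <- which.max(hits)            # first dictionary row with most matches
  rowi[is.na(rowi)] <- dict[best, is.na(rowi)]
  rowi
}

# apply() returns one column per input row, hence the transpose.
filled <- t(apply(datamatrix, 1, fill_row, dict = dict))
print(filled)
```

Because every data row is looked up independently, the two matrices do not need the same number of rows or the same row order. Values that are already present (such as `perforin`) are kept and only the NAs are overwritten.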
I wrote the code below, but it only works when the 2 matrices have the same number of rows. Otherwise it returns only 2 columns of the data matrix, as in the output shown after the code:
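genemap <- data.table::fread("GeneMap - Copy.csv", sep = "\t")
# Fill the NAs of one data-matrix row from the best-matching dictionary row
fun <- function(rowi, genemap) {
  # number of non-NA values of rowi found in each dictionary row
  res <- apply(as.data.frame(genemap), 1, function(x) {length(na.omit(match(na.omit(rowi), x)))})
  # index of the first dictionary row with the most matches
  IND <- which(max(res) == res)[1]
  rowi[is.na(rowi)] <- unlist(genemap[IND, ])[is.na(rowi)]
  return(rowi)
}
as.data.frame(t(apply(datamatrix, 1, fun, genemap)))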
  OMIM     GENES_SYMBOL
1 (NA)     (arlts1)
2 (NA)     (mtr)
3 (325410) (NA)
4 (NA)     (t341c)
Any suggestions for modifying the code?