I have a large list which contains expressed genes from many cell lines. Ensembl genes often come with version suffixes, but I need to remove them. I've found several references that describe this here or here, but they will not work for me, likely because of my data structure (I think its a nested array within a list?). Can someone help me with the particulars of the code and with my understanding of my own data structures?
Here's some example data
>listOfGenes_version <- list("cellLine1" = c("ENSG001.1", "ENSG002.1", "ENSG003.1"), "cellLine2" = c("ENSG003.1", "ENSG004.1"))
>listOfGenes_version
$cellLine1
[1] "ENSG001.1" "ENSG002.1" "ENSG003.1"
$cellLine2
[1] "ENSG003.1" "ENSG004.1"
And what I would like to see is
>listOfGenes_trimmed
$cellLine1
[1] "ENSG001" "ENSG002" "ENSG003"
$cellLine2
[1] "ENSG003" "ENSG004"
Here are some things I tried, but did not work
>listOfGenes_trimmed <- str_replace(listOfGenes_version, pattern = ".[0-9]+$", replacement = "")
Warning message:
In stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
argument is not an atomic vector; coercing
>listOfGenes_trimmed <- lapply(listOfGenes_version, gsub('\\..*', '', listOfGenes_version))
Error in match.fun(FUN) :
'gsub("\\..*", "", listOfGenes_version)' is not a function, character or symbol
Thanks so much!