I need to create 24 variables, each holding the data frame for one of 24 chromosomes, and writing 24 nearly identical lines feels inefficient. I still can't figure out how to write the correct for loop to do this.

Here is what I've tried:

for (i in c(1:22,'X','Y')){
    Chr[i] <- merge(split[['chr[i]']],split2[['chr[i]']],by='Gene_Name')
}

or

 for (i in c(1:22,'X','Y')){
    Chri <- merge(split[['chri']],split2[['chri']],by='Gene_Name')
}

Can anyone help me correct my code to generate data frames/variables Chr1, Chr2,...ChrY?

Here is the snapshot of part of the data frame I hope to get.

[Image: snapshot of part of the merged data frame]

Helena
  • What do you need? This or something else: `paste0("Chr", c(1:22, "X", "Y"))`? Please make your question [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – NelsonGon Jan 03 '20 at 04:50
  • You should think of variables as entities created *by the programmer*, not by the program. Don’t try to create them dynamically. Use vectors and lists instead — your first code already does something similar. – Konrad Rudolph Jan 03 '20 at 10:26

2 Answers

It’s not entirely clear what you need the data for, but in general the solution to this problem in R is to use lists (which you already seem to be using in split and split2!).

For instance, you can create a list of dataframes for your chromosomes:

chr = list()

for (i in paste0('chr', c(1 : 22, 'X', 'Y'))) {
    chr[[i]] <- merge(split[[i]], split2[[i]], by = 'Gene_Name')
}

Or, in a more R-like way (avoiding the for loop):

chrnames = paste0('chr', c(1 : 22, 'X', 'Y'))
# lapply() returns an unnamed list here, so the names are set explicitly.
chr = lapply(chrnames, function (i) merge(split[[i]], split2[[i]], by = 'Gene_Name'))
names(chr) = chrnames
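
To make the list approach easy to try without the original data, here is a minimal sketch using toy stand-ins for split and split2. The data, the expr and score columns, and the restriction to two chromosomes are all assumptions for illustration. It uses Map() rather than lapply(), which is one possible variant: Map() walks both lists in parallel and keeps the names of its first argument.

```r
# Toy stand-ins for the asker's `split` and `split2` (hypothetical data;
# only two chromosomes to keep the example short).
split <- list(
  chr1 = data.frame(Gene_Name = c("A", "B"), expr = c(1.5, 2.0)),
  chrX = data.frame(Gene_Name = c("C", "D"), expr = c(0.3, 4.1))
)
split2 <- list(
  chr1 = data.frame(Gene_Name = c("A", "B"), score = c(10, 20)),
  chrX = data.frame(Gene_Name = c("D", "E"), score = c(30, 40))
)

chrnames <- names(split)

# Merge the matching elements of both lists; the result is a named list
# with one merged data frame per chromosome.
chr <- Map(function(a, b) merge(a, b, by = "Gene_Name"),
           split[chrnames], split2[chrnames])

names(chr)      # "chr1" "chrX"
chr[["chrX"]]   # one row: only gene "D" occurs in both inputs
```

Each merged data frame is then available as chr[["chr1"]], chr[["chrX"]], and so on, which plays the same role as separate Chr1, ..., ChrY variables while staying easy to loop over.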

Better yet, Bioconductor has extensive functionality for working with such data via the GenomicRanges package.

Konrad Rudolph
  • The first solution immediately solves my problem and I will look into the second solution you provide. Thanks a lot! – Helena Jan 03 '20 at 10:50

I am not sure whether this is what you are after. The following code creates empty variables in your global environment:

varNames <- paste0("Chr",c(1:22,"X","Y"))
list2env(setNames(vector("list",length(varNames)),varNames),envir = .GlobalEnv)
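
If separate variables really are wanted, the same list2env() call can be given a named list of already-merged data frames instead of empty placeholders. The sketch below assumes toy data; merged and env are hypothetical names, and a fresh environment is used in place of .GlobalEnv so the example has no side effects.

```r
# Hypothetical merged results, standing in for the output of merge().
merged <- list(
  Chr1 = data.frame(Gene_Name = c("A", "B"), score = c(10, 20)),
  ChrX = data.frame(Gene_Name = "D", score = 30)
)

# Copy each list element into an environment under its list name.
# Passing envir = .GlobalEnv instead would create top-level variables.
env <- new.env()
list2env(merged, envir = env)

ls(env)     # "Chr1" "ChrX"
env$ChrX    # the data frame stored under the name ChrX
```

Keeping the data frames in the list is usually easier to work with, since the list can be passed to lapply() directly, whereas dynamically created variables have to be looked up by name again.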
ThomasIsCoding
  • Sorry for not stating my question clearly. What I hope to get are 24 variables, each a different merged data frame with the info for chromosome i (the result of the function 'merge'). I am still new to R, so I am not sure what empty variables are, but I think this is probably not what I am looking for. Let me know if I can clarify my question further. Thank you! – Helena Jan 03 '20 at 10:04
  • @Helena can you provide some data via `dput()`? Otherwise it is tough for others to understand your objective. – ThomasIsCoding Jan 03 '20 at 10:07
  • Since this is generally a really bad idea, please don’t give such advice without explaining the caveats. – Konrad Rudolph Jan 03 '20 at 10:25
  • I edited my question by adding a snapshot of part of my merged data frame for chromosome 1 only. Let's ignore the for loop I wrote in the question so that it does not mislead you. My question is: how do I make 24 data frames (stored in variables named Chr1, Chr2, ..., ChrY, according to the chromosome shown in the column 'Chromosome_Name')? – Helena Jan 03 '20 at 10:26