
I have several .Rdata files that each contain an object with the same name. I want to combine them into one object and save it to a new combined .Rdata file in a directory, similar to this thread, but I get an error when I try. Can anyone suggest a simple way to do this in R? I am new to R and have not been able to figure it out.

all.files = c("data1.Rdata", "data1.Rdata", "data1.Rdata")

mylist<- lapply(all.files, function(x) {
  load(file = x)
  get(ls()[ls()!= "filename"])
})

names(mylist) <- all.files

teunbrand
Shrilaxmi M S
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. What does your code look like now? What errors are you getting? – MrFlick Aug 26 '21 at 05:53
  • @MrFlick Okay I will include – Shrilaxmi M S Aug 26 '21 at 05:57

2 Answers


If I understand correctly, you want to combine all of those .RData files into one single data.frame.

One option is to list all files in your working directory that have the extension .RData, load them, and combine them using rbind:

# pattern is a regular expression: escape the dot and anchor at the end
ll <- list.files(pattern = "\\.RData$")

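res <- do.call(rbind,
               lapply(ll, function(x) {
                 # load() restores the objects into this function's environment
                 load(file = x)
                 # note: ls() here lists the loaded object(s) and the argument `x`,
                 # so get() receives more than one name
                 get(ls())
               }))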

Now we can check the top 6 rows:

head(res)

#     chrom     start       end                gid         gname                tid strand
#32590   chr7  45574608  45574777 ENSMUSG00000085214 0610005C13Rik ENSMUST00000130094      -
#109006  chr4 154023688 154023891 ENSMUSG00000078350 1190007F08Rik ENSMUST00000143047      -
#475764 chr15  83365029  83365513 ENSMUSG00000075511 1700001L05Rik ENSMUST00000178628      -
#448806 chr13  31567474  31567610 ENSMUSG00000038408 1700018A04Rik ENSMUST00000150418      -
#11159   chr6 147694981 147695041 ENSMUSG00000085077 1700049E15Rik ENSMUST00000152737      +
#339243 chr12  22958352  22960254 ENSMUSG00000073164 2410018L13Rik ENSMUST00000149246      -
#             class biotype byname.uniq bygid.uniq
#32590  altAcceptor lincRNA        TRUE       TRUE
#109006 altAcceptor lincRNA        TRUE       TRUE
#475764 altAcceptor lincRNA        TRUE       TRUE
#448806 altAcceptor lincRNA        TRUE       TRUE
#11159  altAcceptor lincRNA        TRUE       TRUE
#339243 altAcceptor lincRNA        TRUE       TRUE

And the bottom 6 too:

tail(res)
#       chrom     start       end                gid gname                tid strand
#189235  chr6  90373711  90373841 ENSMUSG00000034430  Zxdc ENSMUST00000113539      +
#563026 chr11  72916473  72916587 ENSMUSG00000055670 Zzef1 ENSMUST00000069395      +
#563046 chr11  72916473  72916587 ENSMUSG00000055670 Zzef1 ENSMUST00000152481      +
#158407  chr3 152449013 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000106101      +
#158450  chr3 152449013 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000106103      +
#158465  chr3 152449016 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000089982      +
#             class        biotype byname.uniq bygid.uniq
#189235 altAcceptor protein_coding       FALSE      FALSE
#563026 altAcceptor protein_coding       FALSE      FALSE
#563046 altAcceptor protein_coding       FALSE      FALSE
#158407 altAcceptor protein_coding       FALSE      FALSE
#158450 altAcceptor protein_coding       FALSE      FALSE
#158465 altAcceptor protein_coding       FALSE      FALSE

You can also check the dimensions:

dim(res)
#24279    11

Edit

This worked on R 4.0.3. It seems that it fails on R 4.1.1. I'll edit the answer with a new solution.
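One way to avoid depending on `ls()` is to use the value that `load()` itself returns, which is a character vector with the names of the objects it restored. The following is only a minimal sketch, not the edit promised above. The helper name `load_one`, the temporary directory, and the toy data frames are made up for illustration, and it assumes each file contains exactly one data.frame.

```r
# Toy setup (hypothetical): three .RData files, each storing a data.frame
# under the same object name `filename`.
tmp_dir <- file.path(tempdir(), "rdata_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:3) {
  filename <- data.frame(id = i, value = i * 10)
  save(filename, file = file.path(tmp_dir, paste0("data", i, ".RData")))
}
rm(filename)

# Hypothetical helper: load one file into its own environment and
# return the single object stored in it.
load_one <- function(path) {
  env <- new.env()
  loaded <- load(file = path, envir = env)  # names of the restored objects
  stopifnot(length(loaded) == 1)            # assumption: one object per file
  get(loaded, envir = env)
}

ll <- list.files(tmp_dir, pattern = "\\.RData$", full.names = TRUE)
res <- do.call(rbind, lapply(ll, load_one))
dim(res)
```

Loading into a fresh environment also means nothing in the global workspace is overwritten, so there is no need to clean the workspace first.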

patL
  • Using the code above I get an error. I ran it in the directory containing my files, but I did not understand it properly. – Shrilaxmi M S Aug 27 '21 at 07:33
  • Which is the error you're getting? It worked for me. – patL Aug 27 '21 at 07:54
  • `Error in lapply(ll, function(x) { : object 'll' not found` This is the error I got. I also did not understand why `ll` is there; I am a beginner in R. – Shrilaxmi M S Aug 27 '21 at 09:22
  • @ShrilaxmiMS I've edited my answer. One part of the code was missing. Please check it now. – patL Aug 27 '21 at 09:27
  • Still getting an error: 'Error in get(ls()[ls() != "filename"]) : first argument has length > 1 Called from: get(ls()[ls() != "filename"])' – Shrilaxmi M S Aug 27 '21 at 09:32
  • Clean your environment with `remove(list = ls())`, keep only the `.RData` files you want to read in the working directory, and copy and paste the code I've posted. I've just run that and I have all files loaded into one `data.frame` – patL Aug 27 '21 at 09:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/236479/discussion-between-patl-and-shrilaxmi-m-s). – patL Aug 27 '21 at 09:42

I typically do this with a loop:

FileVector <- c("data1.Rdata", "data1.Rdata", "data1.Rdata")
Res <- vector(mode = "list",
              length = length(FileVector))

for (m1 in seq_along(FileVector)) {
  FilesLoaded <- load(file = FileVector[m1],
                      verbose = FALSE)
  if ("filename" %in% FilesLoaded) {
    Res[[m1]] <- get("filename")
  }
  rm(list = FilesLoaded)
}

This gives us a list. We can add other checks to the loop, for example to skip data that has a particular value in some column, or to make sure that certain column names are present in each new piece of data. You can also wrap your load() call in a try() call if you have real-world concerns, such as some data files not having been generated properly. Then we just combine everything with do.call():

# Null positions will be dropped
Res <- do.call(rbind,
               Res)
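The extra checks mentioned above could look roughly like the following. This is only a sketch under assumptions: the toy files, the deliberately corrupt third file, and the `RequiredCols` vector are hypothetical, and loading into a separate environment (`Env`) is swapped in here in place of the `rm()` cleanup.

```r
# Toy setup (hypothetical): two valid files and one corrupt file.
tmp_dir <- file.path(tempdir(), "rdata_checks_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:2) {
  filename <- data.frame(chrom = paste0("chr", i), start = i * 100)
  save(filename, file = file.path(tmp_dir, paste0("data", i, ".Rdata")))
}
writeLines("not an R data file", file.path(tmp_dir, "data3.Rdata"))
rm(filename)

FileVector <- list.files(tmp_dir, full.names = TRUE)
RequiredCols <- c("chrom", "start")  # hypothetical required columns
Res <- vector(mode = "list", length = length(FileVector))

for (m1 in seq_along(FileVector)) {
  Env <- new.env()
  # try() keeps the loop going when a file cannot be read
  FilesLoaded <- try(suppressWarnings(load(file = FileVector[m1], envir = Env)),
                     silent = TRUE)
  if (inherits(FilesLoaded, "try-error")) next   # unreadable file
  if (!"filename" %in% FilesLoaded) next         # expected object is missing
  Dat <- get("filename", envir = Env)
  # only keep data that has all required columns
  if (all(RequiredCols %in% colnames(Dat))) Res[[m1]] <- Dat
}

# Positions that were skipped stay NULL and are dropped by rbind
Res <- do.call(rbind, Res)
nrow(Res)
```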

It's usually advantageous to build your vector of file names with something like list.files(), specifying the pattern = argument.

Generally that might look something like:

FileVector <- list.files(path = "~/my/directory",
                         full.names = TRUE,
                         pattern = "mypattern")
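The question also asks about saving the combined result to a new file, which can be done with `save()`. Below is a rough sketch, assuming the object in every file is called `filename`. The temporary directory, the `Combined` object, and the output file name `combined.Rdata` are placeholders chosen for illustration. Since `pattern` is a case-sensitive regular expression by default, `ignore.case = TRUE` is used so that both `.Rdata` and `.RData` match.

```r
# Toy setup (hypothetical): three files with the same object name.
tmp_dir <- file.path(tempdir(), "rdata_save_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (i in 1:3) {
  filename <- data.frame(id = i)
  save(filename, file = file.path(tmp_dir, paste0("data", i, ".Rdata")))
}
rm(filename)

FileVector <- list.files(path = tmp_dir,
                         full.names = TRUE,
                         pattern = "\\.rdata$",
                         ignore.case = TRUE)

# Load each file into its own environment and pull out `filename`
Combined <- do.call(rbind, lapply(FileVector, function(f) {
  Env <- new.env()
  load(file = f, envir = Env)
  get("filename", envir = Env)
}))

# Write the result to a separate folder so that it is not picked up
# as an input file the next time the inputs are listed.
out_dir <- file.path(tmp_dir, "combined")
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "combined.Rdata")
save(Combined, file = out_file)
```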
Nick
  • `Res <- vector(mode = "list", length = length(FileVector))` is giving me NULL output. Can you please tell me why? – Shrilaxmi M S Aug 27 '21 at 07:37
  • If there is no `FileVector` object in your workspace, your constructor will return a list of length 1 with a `NULL`. Is it in your workspace? How was it constructed? – Nick Aug 27 '21 at 14:32