I want to create a data.frame by merging all the files in a folder. Each individual file in the folder has this format:

sample.1 =
    gene_id         normalized_count
    ABCB7|22          536.0631
    ABCB8|11194       504.5299
    ABCB9|23457       147.6550
    ABCC10|89845      458.8775
    ABCC11|85320      5.6477



sample.n =
    gene_id         normalized_count
    ABCB7|22          122.3673
    ABCB8|11194       849.9824
    ABCB9|23457       169.9023
    ABCC10|89845      0.0000
    ABCC11|85320      2.8239


When creating the new data.frame, the normalized_count column from each file should be added as a new column wherever the gene_id matches. Each new column should be named after the file it came from.


desired output = 

     gene_id             sample.1     sample.n
        ABCB7|22          536.0631     122.3673 
        ABCB8|11194       504.5299     849.9824
        ABCB9|23457       147.6550     169.9023
        ABCC10|89845      458.8775     0.0000
        ABCC11|85320      5.6477       2.8239

I have tried this for creating a new data.frame.

file_list <- list.files("./")
dataset <- do.call("cbind", lapply(file_list, FUN = function(files) {
  read.table(files, header = TRUE, sep = "\t")
}))
Kryo

1 Answer

I generated some ".txt" files from your example:

file_list <- list.files("./")[15:16]
> file_list
[1] "sample.1.txt" "sample.n.txt"
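
The `[15:16]` index above only fits my own folder. A more portable way to pick the files is to filter by name, for example with the `pattern` argument of `list.files`. This is a minimal sketch, assuming the sample files all end in ".txt"; the temporary folder `samples_demo` and the extra file `notes.csv` are made up for the demonstration.

```r
# Hypothetical setup: a temporary folder holding two sample files
# and one unrelated file that should be ignored.
data_dir <- file.path(tempdir(), "samples_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("sample.1.txt", "sample.n.txt", "notes.csv")))

# Keep only names ending in ".txt"; full.names = TRUE returns paths
# that can be passed straight to read.table().
file_list <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
print(basename(file_list))
```

With `full.names = TRUE`, use `basename(file_list)` when building the column names.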

then:

dataset <- Reduce(function(x, y) merge(x, y, by="gene_id"), 
                  lapply(file_list,FUN=function(files){
                    read.table(files,header=TRUE, sep="")
                    }))
names(dataset)[-1] <- gsub("[.]txt", "", file_list)

> dataset
       gene_id sample.1 sample.n
1     ABCB7|22 536.0631 122.3673
2  ABCB8|11194 504.5299 849.9824
3  ABCB9|23457 147.6550 169.9023
4 ABCC10|89845 458.8775   0.0000
5 ABCC11|85320   5.6477   2.8239
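
With more than two files, every input has a column called normalized_count, so `merge` has to add suffixes to keep them apart before the names are overwritten at the end. One way around this, as a sketch only, is to rename the count column when each file is read and merge afterwards. The helper `read_sample` and the temporary folder `merge_demo` are hypothetical names; the data is copied from the question.

```r
# Hypothetical helper: read one file and name its count column after the file.
read_sample <- function(path) {
  df <- read.table(path, header = TRUE, sep = "", stringsAsFactors = FALSE)
  names(df)[names(df) == "normalized_count"] <- sub("[.]txt$", "", basename(path))
  df
}

# Demo data from the question, written to a temporary folder.
data_dir <- file.path(tempdir(), "merge_demo")
dir.create(data_dir, showWarnings = FALSE)
genes <- c("ABCB7|22", "ABCB8|11194", "ABCB9|23457", "ABCC10|89845", "ABCC11|85320")
counts <- list(
  "sample.1.txt" = c(536.0631, 504.5299, 147.6550, 458.8775, 5.6477),
  "sample.n.txt" = c(122.3673, 849.9824, 169.9023, 0.0000, 2.8239)
)
for (f in names(counts)) {
  write.table(data.frame(gene_id = genes, normalized_count = counts[[f]]),
              file.path(data_dir, f), quote = FALSE, row.names = FALSE)
}

# Read every file, then merge the data frames pairwise on gene_id.
file_list <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
dataset <- Reduce(function(x, y) merge(x, y, by = "gene_id"),
                  lapply(file_list, read_sample))
print(dataset)
```

By default `merge` keeps only the gene_id values present in every file. Passing `all = TRUE` to `merge` keeps genes that are missing from some files and fills the gaps with NA.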

This approach is based on Roland's answer.

Andriy T.