
I have a list of character vectors, and I would like to write all of them to a single file. Here is what the list looks like:

>str(mylist)
List of 19
$ geneset1    : chr [1:140] "ASGR2" "ATXN7L3" "BCL6B" "C6orf211" ...
$ geneset2    : chr [1:174] "CKS1B" "CREBL2" "CTNNB1" "CTTN" ...
$ geneset3    : chr [1:346] "AGTR1" "C6" "C6orf211" "CCNK" ...
$ geneset4    : chr [1:259] "ASGR2" "ATF7IP" "ATXN7L3" "CKS1B" ...

My desired output would be a file like this:

#myfile
>geneset1
ASGR2 ATXN7L3 BCL6B C6orf211
>geneset2
CKS1B CREBL2 CTNNB1 CTTN
>geneset3
AGTR1 C6 C6orf211 CCNK

My approach is this:

writeLines(unlist(lapply(mylist, FUN=function(x)paste(x, collapse=" "))), con="test.txt")

However, this only writes the gene lines. I don't know how to add a header line with ">" and the element's name before each of them.

Thanks

user2380782

4 Answers


You could just use a for loop:

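out <- stdout()  # prints to the console; pass a file connection to write to disk

X <- list(geneset1= c("ASGR2","ATXN7L3","BCL6B","C6orf211"),
          geneset2= c("CKS1B","CREBL2","CTNNB1","CTTN"),
          geneset3= c("AGTR1","C6","C6orf211","CCNK" ),
          geneset4= c("ASGR2","ATF7IP","ATXN7L3","CKS1B" ) )

# Note: cat's default sep = " " puts a space between all of its arguments
for (i in names(X)) {
    cat(">", i, "\n", file=out)    # header line with the set name
    cat(X[[i]], "\n", file=out)    # the genes of that set on one line
}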
Neal Fultz
  • Nice and simple! In this case, it might even be better not to vectorize because you avoid creating some intermediate objects. – cbare Sep 21 '15 at 17:57
  • Beware of the trailing whitespace this solution adds to the lines (via `cat`)! I would not recommend doing this, and, besides aesthetics, it will mess with some tools. – Konrad Rudolph Sep 22 '15 at 09:40
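The whitespace mentioned in the comment comes from `cat`'s default `sep = " "`, which is placed between every pair of arguments, including before `"\n"` and after `">"`. A minimal sketch of a variant without it, assuming two of the gene sets from the answer (repeated here so the block runs on its own) and a temporary file as an illustrative target:

```r
# Two of the gene sets from the answer, repeated so this block is self-contained
X <- list(geneset1 = c("ASGR2", "ATXN7L3", "BCL6B", "C6orf211"),
          geneset2 = c("CKS1B", "CREBL2", "CTNNB1", "CTTN"))

out_path <- tempfile(fileext = ".txt")  # illustrative target; any path works
out <- file(out_path, open = "w")

for (i in names(X)) {
    # sep = "" stops cat from inserting spaces between its arguments
    cat(">", i, "\n", sep = "", file = out)
    # join the genes with single spaces first, then append the newline
    cat(paste(X[[i]], collapse = " "), "\n", sep = "", file = out)
}
close(out)

# Read the file back to inspect what was written
written <- readLines(out_path)
print(written)
```

Collapsing the genes with `paste` before calling `cat` keeps the single spaces between genes while leaving none at the end of the line.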

Close enough, as a one-liner:

writeLines(sapply(names(X), function(x) paste0(">", x, "\n", paste(X[[x]], collapse=" "))), con="test.txt")
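Another way to get the header and the gene line onto separate lines, without pasting `"\n"` into the strings, is to interleave two vectors. This is a minimal sketch rather than part of the answer: it relies on `rbind` stacking the two vectors as rows and `c()` reading the resulting matrix column by column. The names `headers`, `genes`, `file_lines` and the temporary output path are illustrative assumptions.

```r
# Two of the gene sets from the thread, repeated so this block is self-contained
X <- list(geneset1 = c("ASGR2", "ATXN7L3", "BCL6B", "C6orf211"),
          geneset2 = c("CKS1B", "CREBL2", "CTNNB1", "CTTN"))

# One header string per set: ">geneset1", ">geneset2", ...
headers <- paste0(">", names(X))

# One space-separated gene string per set
genes <- vapply(X, paste, character(1), collapse = " ")

# rbind gives a 2 x n matrix; c() flattens it column by column,
# so each header is immediately followed by its gene line
file_lines <- c(rbind(headers, genes))

out_path <- tempfile(fileext = ".txt")  # illustrative target; any path works
writeLines(file_lines, con = out_path)
```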
Ananta

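One way is to keep the list names when unlisting (the use.names argument of unlist, which is TRUE by default), prefix each name with ">" using paste0, and let write.table write every name followed by its value: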

mylist <- list(geneset1 = c("ASGR2","ATXN7L3","BCL6B","C6orf211"),
               geneset2 = c("CKS1B","CREBL2","CTNNB1","CTTN"),
               geneset3 = c("AGTR1","C6","C6orf211","CCNK" ),
               geneset4 = c("ASGR2","ATF7IP","ATXN7L3","CKS1B" ) )

unlisted <- unlist(lapply(mylist, function(x) { paste(x, collapse = " ") }), use.names = TRUE)
names(unlisted) <- paste0(">", names(unlisted))
names(unlisted)

write.table(unlisted, file = "test.txt", quote = FALSE, 
            sep = "\n", col.names = FALSE)
JasonAizkalns

Turning Neal's solution into a function:

writeGeneSets <- function(genesets, filename) {
    if (missing(filename)) {
        conn <- stdout()
    } else {
        conn <- file(filename, open="w")
        on.exit( close(conn) )
    }

    for (name in names(genesets)) {
        writeLines(paste0(">", name), conn)
        writeLines(paste(genesets[[name]], collapse=" "), conn)
    }
}

...which we can try out like so:

genesets <- list(geneset1= c("ASGR2","ATXN7L3","BCL6B","C6orf211"),
                 geneset2= c("CKS1B","CREBL2","CTNNB1","CTTN"),
                 geneset3= c("AGTR1","C6","C6orf211","CCNK" ),
                 geneset4= c("ASGR2","ATF7IP","ATXN7L3","CKS1B" ) )
writeGeneSets(genesets)
writeGeneSets(genesets, "test1.txt")
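To check a written file, it can be read back into a list and compared with the original. The following is only a sketch under stated assumptions: every set occupies exactly one header line starting with ">" followed by one line of space-separated genes, and `read_gene_sets` is a hypothetical helper written for this check, not a function from any package. The sample file is produced with `writeLines` directly so the block runs on its own.

```r
# Hypothetical helper: reads the header-plus-gene-line format back into a list
read_gene_sets <- function(path) {
    file_lines <- readLines(path)
    is_header <- startsWith(file_lines, ">")
    # drop ">" (and any space after it) to recover the set names
    set_names <- sub("^>\\s*", "", file_lines[is_header])
    # every non-header line holds the genes of one set
    sets <- strsplit(file_lines[!is_header], " ", fixed = TRUE)
    names(sets) <- set_names
    sets
}

genesets <- list(geneset1 = c("ASGR2", "ATXN7L3", "BCL6B", "C6orf211"),
                 geneset2 = c("CKS1B", "CREBL2", "CTNNB1", "CTTN"))

# Write a sample file in the target format (temporary path, for illustration)
sample_path <- tempfile(fileext = ".txt")
conn <- file(sample_path, open = "w")
for (name in names(genesets)) {
    writeLines(paste0(">", name), conn)
    writeLines(paste(genesets[[name]], collapse = " "), conn)
}
close(conn)

# Round trip: the list read back should match the original
read_back <- read_gene_sets(sample_path)
print(identical(read_back, genesets))
```

Because the header pattern tolerates a space after ">", the helper also reads files whose headers look like "> geneset1".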
cbare
  • I like the function solution!! – user2380782 Sep 21 '15 at 19:08
  • Beware of the trailing whitespace this solution adds to the lines (via `cat`)! I would not recommend doing this, and, besides aesthetics, it will mess with some tools. – Konrad Rudolph Sep 22 '15 at 09:40
  • @KonradRudolph, Does cat(..., sep="") fix the issue you raise? ...I'll edit the answer to do that and revise again if you know of a better idiom. Thanks – cbare Oct 05 '15 at 23:57
  • On third thought, using writeLines is a bit cleaner. See also: http://stackoverflow.com/questions/2470248/write-lines-of-text-to-a-file-in-r – cbare Oct 06 '15 at 00:19