Apologies in advance if this question has been asked before, but I couldn't find anything on it.

Right now, I'm trying to pull certain columns from a set of files and name each column after the file it came from. I've done this before and know it shouldn't be difficult, but I'm running into a lot of trouble. My code is as follows (allfiles is defined earlier in the script as all of the files in that directory):

    makelist<-function(list_text){
      if (list_text == "squared_median " || list_text == "squared_median_ranked"
          || list_text == "value_median " || list_text == "value_median_ranked")
        metric = "median"
      else
        metric = "avg"
      currfiles=allfiles[grepl(list_text,allfiles)]
      currfile=currfiles[1]
      currtable=read.table(currfile, header=T, sep='\t',stringsAsFactors = F)
      a<-cbind(gene=currtable[,1],paste0(currfile)=currtable[,metric])
      #col.name(a[,ncol(a)])<-currfile
      #names(a)[ncol(a)]<-as.character(currfile)
      for(currfile in currfiles[2:length(currfiles)])
      {
        currtable=read.table(currfile, header=T, sep='\t', stringsAsFactors=F)
        if (length(currtable[,metric]) > length(a[,1]))
          apply(a,2, function(x) length(x) = length(currtable[,metric]))

        a=cbind(a, "gene"=currtable[,1],currfile=currtable[,metric])
        #names(a)[ncol(a)]<-paste(currfile)
      }
      #names(a)=c("gene", currfiles[1], "gene", currfiles[2],"gene", currfiles[3],"gene", currfiles[4])
      write.table(a, paste(output_folder, list_text,".txt"),sep='\t',quote=F,row.names=F)
    }

Essentially, I'm passing in a string that is used to select certain files from a directory. From there, the code grabs the median or average column from each file and names that column after the file it came from. I've tried loads of different approaches with no success. The commented-out lines are attempts that did not work: either they left the column name blank, or they named it the literal variable name "currfile" instead of the file name it contains. I've gone as far as individually renaming all of the columns with

    names(a)=c("gene", currfiles[1], "gene", currfiles[2]...currfiles[n])

And that just names every other column currfiles.

Can you help me identify what's wrong? I've tried setting the name as get(currfile) too and that won't let me run the script.

These lines

    #col.name(a[,ncol(a)])<-currfile
    #names(a)[ncol(a)]<-as.character(currfile)

Have left me with blank column names.

** As an aside, the lines with the if statement concerning length are supposed to extend each column to the length of the longest column seen so far, but that doesn't seem to be working. That could be a separate issue that I'll read up on a bit more.

Thanks for your help, Mike

12345mike

1 Answer

To set column names of a table, you use colnames(table) (ref).

In your case, I'd expect colnames(a)[ncol(a)] <- currfile to do the trick, if I am understanding correctly that you want to name the last column of the table a with the string in variable currfile.
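A minimal sketch of how indexing into colnames() behaves, using made-up data (the file name and values below are hypothetical stand-ins, not taken from the question). It assumes the goal is to rename only the last column, and it contrasts ncol(a) with a negative index, which in R drops a position rather than counting from the end:

```r
# Hypothetical stand-ins for the question's data.
currfile <- "sample_A.txt"
a <- cbind(gene = c("g1", "g2"), placeholder = c("0.5", "0.7"))

# Rename only the last column, whatever the current width of the table is.
colnames(a)[ncol(a)] <- currfile
print(colnames(a))

# A negative index excludes that position, so [-1] selects every column
# except the first and the replacement value is recycled across all of them.
b <- cbind(gene = c("g1", "g2"), x = c("1", "2"), y = c("3", "4"))
colnames(b)[-1] <- currfile
print(colnames(b))
```

With only two columns both forms give the same result, which may hide the difference until the table grows inside a loop.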

blep
  • Thank you, this finally worked! Would you mind explaining the syntax for me? I'm new to R but have a background in C++. Or at least maybe help to understand why the other methods did not work or were leaving it blank? I was unsure if it was referencing an out of scope variable, and maybe that's why it was blank? Thank you – 12345mike Sep 03 '15 at 23:23
  • @12345mike 1) there's a good explanation of names vs. colnames here: http://stackoverflow.com/questions/24799153/what-is-the-difference-between-names-and-colnames/24799304#24799304 2) col.names is a property you specify when reading/writing tables, but isn't a function to get them. – blep Sep 04 '15 at 00:02
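As a rough illustration of the names vs. colnames point, here is a small sketch with made-up values (the file name and data are hypothetical). It assumes, as in the question, that the table is built with cbind() on plain vectors, which yields a matrix rather than a data frame. On a matrix, names() labels individual elements instead of columns, which is one plausible reason the headers stayed blank. It also shows that an argument tag such as currfile = ... is used literally rather than evaluated, so this is not a scoping problem.

```r
currfile <- "sample_A.txt"  # hypothetical file name

# cbind() of plain vectors gives a matrix, not a data frame.
m <- cbind(c("g1", "g2"), c("0.5", "0.7"))
names(m)[ncol(m)] <- currfile   # attaches an element name, not a column name
print(is.null(colnames(m)))     # shows whether column names are still unset

# colnames() is the setter that addresses the columns of a matrix.
colnames(m) <- c("gene", currfile)
print(colnames(m))

# The tag in cbind(currfile = ...) is taken verbatim as the column name.
lit <- cbind(gene = c("g1", "g2"), currfile = c("0.5", "0.7"))
print(colnames(lit))

# On a data frame, names() and colnames() both refer to the columns.
d <- data.frame(gene = c("g1", "g2"), value = c(0.5, 0.7))
names(d)[ncol(d)] <- currfile
print(names(d))
```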