Apologies in advance if this question has been asked before, but I couldn't find anything on it.
Right now, I'm attempting to take certain columns from files and to name each column after the file it came from. I've done it before, and I know it's not too difficult, but I am running into a lot of trouble. My code is as follows (allfiles is declared earlier in the code as all of the files in that directory):
makelist<-function(list_text){
if (list_text == "squared_median " || list_text == "squared_median_ranked"
|| list_text == "value_median " || list_text == "value_median_ranked")
metric = "median"
else
metric = "avg"
currfiles=allfiles[grepl(list_text,allfiles)]
currfile=currfiles[1]
currtable=read.table(currfile, header=T, sep='\t',stringsAsFactors = F)
a<-cbind(gene=currtable[,1],paste0(currfile)=currtable[,metric])
#col.name(a[,ncol(a)])<-currfile
#names(a)[ncol(a)]<-as.character(currfile)
for(currfile in currfiles[2:length(currfiles)])
{
currtable=read.table(currfile, header=T, sep='\t', stringsAsFactors=F)
if (length(currtable[,metric]) > length(a[,1]))
apply(a,2, function(x) length(x) = length(currtable[,metric]))
a=cbind(a, "gene"=currtable[,1],currfile=currtable[,metric])
#names(a)[ncol(a)]<-paste(currfile)
}
#names(a)=c("gene", currfiles[1], "gene", currfiles[2],"gene", currfiles[3],"gene", currfiles[4])
write.table(a, paste(output_folder, list_text,".txt"),sep='\t',quote=F,row.names=F)
}
Essentially, I'm passing in a string that is used to gather certain files from a directory. From there, the code grabs the median or average column from each file and should name that column after the file it came from. I've tried loads of different ways with no success. The commented-out lines are attempts that did not work: they either left the column name blank or named it the literal variable name "currfile" instead of the file name it contains. I've gone as far as individually renaming all of the columns with
names(a)=c("gene", currfiles[1], "gene", currfiles[2]...currfiles[n])
And that just names every other column currfiles.
Can you help me identify what's wrong? I've tried setting the name as get(currfile) too and that won't let me run the script.
These lines
#col.name(a[,ncol(a)])<-currfile
#names(a)[ncol(a)]<-as.character(currfile)
have left me with blank column names.
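For reference, here is a minimal sketch of what seems to be going on, not a confirmed diagnosis. As far as I can tell, cbind() on plain vectors returns a matrix, a tag such as currfile= inside cbind() is used literally as the column name, and a matrix keeps its column names in colnames() rather than names(). The file name "sample_A.txt" and the toy currtable below are made up for illustration.

```r
# Hypothetical stand-ins for one file name and the table read from it.
currfile <- "sample_A.txt"
currtable <- data.frame(gene = c("g1", "g2", "g3"),
                        median = c(1.5, 2.0, 0.7),
                        stringsAsFactors = FALSE)
metric <- "median"

# cbind() of vectors gives a matrix; the tag `currfile=` is taken literally.
a <- cbind(gene = currtable[, 1], currfile = currtable[, metric])
literal_names <- colnames(a)   # "gene" "currfile"

# A matrix stores column names in colnames(), so assign there.
colnames(a)[ncol(a)] <- currfile
fixed_names <- colnames(a)     # "gene" "sample_A.txt"

# Alternative: build a data frame and name its columns with setNames().
b <- setNames(data.frame(currtable[, 1], currtable[, metric],
                         stringsAsFactors = FALSE),
              c("gene", currfile))
```

Under these assumptions, colnames(a)[ncol(a)] <- currfile would be the working counterpart of the two commented-out attempts.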
** As an aside, the lines with the if statement concerning length are supposed to extend every existing column to the length of the longest column read so far, but that doesn't seem to be working. That could be a separate issue, and I'll read up on it a bit more.
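On the length aside, a rough sketch of one way the padding could be done, assuming the goal is to fill the shorter columns with NA. The apply() call in my code returns a new object that is never assigned, so a itself is not changed. The helper name pad_rows and the sample values below are hypothetical.

```r
# Hypothetical helper: pad a matrix with NA rows until it has n rows.
pad_rows <- function(m, n) {
  if (nrow(m) >= n) return(m)
  filler <- matrix(NA, nrow = n - nrow(m), ncol = ncol(m),
                   dimnames = list(NULL, colnames(m)))
  rbind(m, filler)
}

# Toy data: two rows collected so far, three rows in the next file.
a <- cbind(gene = c("g1", "g2"), sample_A.txt = c("1.5", "2.0"))
new_gene <- c("g1", "g2", "g3")
new_vals <- c(0.4, 0.9, 1.1)

# Bring both sides to the same number of rows before binding.
n <- max(nrow(a), length(new_vals))
a <- pad_rows(a, n)
length(new_gene) <- n   # pads with NA only when the new file is shorter
length(new_vals) <- n
a <- cbind(a, gene = new_gene, sample_B.txt = new_vals)
```

Padding both sides covers the case where the new file is shorter than the existing columns as well as the case where it is longer.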
Thanks for your help, Mike