How to extract data by matching file names with the column names

Question

It should be easy, but I am not able to do it. I have many files, each named after a species. I also have a data frame in which each column is named after a species. I want to extract each species column from the data frame, rename it to, say, 'Common', and combine it with the corresponding species file, probably in a loop, so that I can later compare all the species.

Df:

ID  Tilia_americana Fraxinus_americana  Ulmus_americana
1   23  32  32
2   21  34  35
3   20  33  32
4   19  33  36
5   23  23  34
6   22  34  37

Sorry for not being specific earlier. As you can see, the column names are species names. In addition, I have three separate files named after the species. The first few rows of the first file look like this:

Tilia_americana:

ID  Wie Rei Wee
1   2   4   3
2   4   3   4
3   3   2   5
4   5   5   2
5   6   3   4
6   7   4   3

After extracting the Tilia_americana column from Df, renaming it to 'Common', and combining it with the Tilia_americana file, the output should look like this:

ID  Wie Rei Wee Common
1   2   4   3   23
2   4   3   4   21
3   3   2   5   20
4   5   5   2   19
5   6   3   4   23
6   7   4   3   22

At the end I want to save each file separately. Thanks

[please read this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) and revise your question :) — Anthony Damico, Dec 19 '12 at 10:31
I would use `lapply` for looping over the species names. If you improve your question, I could show you the specifics. — Roland, Dec 19 '12 at 10:31
Echoing @AnthonyDamico's sentiments, the more specific you can be about the nature of your problem, highlighting what you've tried and where you get stuck, the more likely you're going to have people offering *helpful* answers. — A5C1D2H2I1M1N2O1R2T1, Dec 19 '12 at 10:33
When you say "file", I assume you mean a file in some folder on your computer, not an *object* in your R workspace. Is that correct? What is the current form of the file? .RData, CSV, Excel? plain text? What form do you want the saved file to be? — A5C1D2H2I1M1N2O1R2T1, Dec 19 '12 at 10:52
@Ananda you are right, the files are saved in a folder on my computer's drive and they are text files. I also want to save them in text format. Thanks — Gongon, Dec 19 '12 at 10:57
@mariariaz, FYI, if you use `@` correctly in your comments -- in other words, no space between and correct handle :) -- the person you're responding to will be notified of your comment. — A5C1D2H2I1M1N2O1R2T1, Dec 19 '12 at 11:51
@AnandaMahto Ahaa.. I did not realize that... thanks for pointing it out, I will take care of this next time. — Gongon, Dec 19 '12 at 11:54
@AnandaMahto yes, you are right. I am learning these things. I am still struggling to post questions in the correct form, as every time I ask a question someone edits it for me, and it feels bad. Thanks for your suggestions and help. — Gongon, Dec 19 '12 at 11:59

Roman Luštrik · Answer 1 · 2012-12-19T11:57:18.213

4

Without knowing the specifics (such as whether the file names and the column names in the data.frame match exactly), it's hard to give specific advice, but perhaps something along these lines:

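importMyData <- function(x, my.df) {
    # read one species file; adjust the import function and its parameters as needed
    data.from.file <- read.table(x, header = TRUE)
    # species name = file name without directory and extension
    sp.name <- tools::file_path_sans_ext(basename(x))
    # append the matching column of my.df under the name 'Common'
    out <- cbind(data.from.file, Common = my.df[, sp.name])
    out
}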

You can use this function inside sapply.

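my.file <- list.files(pattern = "\\.txt$")  # only names ending in .txt
# simplify = FALSE returns a named list with one data.frame per file
sapply(my.file, FUN = importMyData, my.df = my.df, simplify = FALSE)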
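To try the idea without touching real data, here is a minimal self-contained sketch. It is an illustration, not part of the original answer: the temporary folder `species_demo`, the two-species subset, and the reuse of the same measurements for both files are assumptions made for the demo. It uses `lapply`, as suggested in the comments on the question, and derives the species name with `tools::file_path_sans_ext`.

```r
# Assumption: a throwaway folder stands in for the folder of species files.
tmp <- file.path(tempdir(), "species_demo")
dir.create(tmp, showWarnings = FALSE)

# Data frame from the question (first two species only, to keep it short).
my.df <- data.frame(
  ID                 = 1:6,
  Tilia_americana    = c(23, 21, 20, 19, 23, 22),
  Fraxinus_americana = c(32, 34, 33, 33, 23, 34)
)

# Measurements copied from the question's Tilia_americana file; the same
# values are reused for the second file purely as placeholder content.
species.data <- data.frame(ID  = 1:6,
                           Wie = c(2, 4, 3, 5, 6, 7),
                           Rei = c(4, 3, 2, 5, 3, 4),
                           Wee = c(3, 4, 5, 2, 4, 3))
for (sp in c("Tilia_americana", "Fraxinus_americana")) {
  write.table(species.data, file.path(tmp, paste0(sp, ".txt")),
              row.names = FALSE, quote = FALSE)
}

# Read one file and append the matching column of my.df as 'Common'.
importMyData <- function(x, my.df) {
  data.from.file <- read.table(x, header = TRUE)
  sp.name <- tools::file_path_sans_ext(basename(x))
  cbind(data.from.file, Common = my.df[[sp.name]])
}

# full.names = TRUE so the files can be read from outside the folder.
my.file  <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
combined <- lapply(my.file, importMyData, my.df = my.df)
names(combined) <- tools::file_path_sans_ext(basename(my.file))

combined$Tilia_americana
```

`combined` is a list with one data frame per species, so `combined$Tilia_americana` should have the columns ID, Wie, Rei, Wee and Common, as in the table requested in the question. Note that `cbind` only works here because the file and the data frame have the same number of rows.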

edited Dec 19 '12 at 11:57

answered Dec 19 '12 at 11:42


Yes, the file names and column names match exactly. Thanks for your suggestion, I will try it now. – Gongon Dec 19 '12 at 11:51
You will need to remove the extension. I've added something to remove the ".txt" part. – Roman Luštrik Dec 19 '12 at 11:56
I tried this code and realized that the columns are of different lengths. What should I do? I tried the merge command, but it does not seem to be working. – Gongon Dec 19 '12 at 15:07
@mariariaz That depends on your data. I think you should open a new question of how to merge two columns of different lengths. I think this has been discussed on SO before, so make sure you employ the search button. – Roman Luštrik Dec 19 '12 at 17:43
Thank you, yes, I sorted it out. You are right about the merge command. I did search and found the solution. – Gongon Dec 19 '12 at 20:45

vaettchen · Accepted Answer · 2012-12-19T12:30:41.953

1

You can get the list of species files with something like

files <- list.files( pattern = "\\.txt$" )

assuming that these text files have the extension .txt and that there are no other text files in that folder.

With

species <- sub( "\\.txt$", "", files )

you can remove the extension; what remains are the column names in the Common data.frame.

You now can build a loop (there may be better ways, like lapply...):

for( i in seq_along( files ) )
{
    x <- read.table( files[i], header = TRUE )
    # add the matching species column under the name 'Common'
    x$Common <- Common[[ species[i] ]]
    # note: this overwrites the original species file
    write.table( x, files[i], row.names = FALSE )
}
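Because the loop writes back to `files[i]`, it overwrites the input files. A variant that writes to a separate folder could look roughly like the sketch below. The helper name `addCommonColumn`, its arguments, and the folder names are hypothetical, chosen only for this illustration; files without a matching column are skipped.

```r
# Hypothetical helper: adds the matching column of `df` as 'Common' to every
# species file in `in.dir` and writes the result to `out.dir`.
addCommonColumn <- function(in.dir, out.dir, df) {
  dir.create(out.dir, showWarnings = FALSE, recursive = TRUE)
  files   <- list.files(in.dir, pattern = "\\.txt$")
  species <- sub("\\.txt$", "", files)

  # Skip files that have no column of the same name in `df`.
  keep <- which(species %in% colnames(df))

  invisible(lapply(keep, function(i) {
    x <- read.table(file.path(in.dir, files[i]), header = TRUE)
    x$Common <- df[[species[i]]]  # stops with an error if row counts differ
    write.table(x, file.path(out.dir, files[i]),
                row.names = FALSE, quote = FALSE)
    files[i]                      # report which file was written
  }))
}

# Small demo in temporary folders (made-up subset of the question's data).
in.dir  <- file.path(tempdir(), "species_in")
out.dir <- file.path(tempdir(), "species_out")
dir.create(in.dir, showWarnings = FALSE)
write.table(data.frame(ID = 1:3, Wie = c(2, 4, 3)),
            file.path(in.dir, "Tilia_americana.txt"),
            row.names = FALSE, quote = FALSE)

Common  <- data.frame(ID = 1:3, Tilia_americana = c(23, 21, 20))
written <- addCommonColumn(in.dir, out.dir, Common)
result  <- read.table(file.path(out.dir, "Tilia_americana.txt"), header = TRUE)
```

Keeping the originals untouched makes it safe to re-run the script; running the overwriting loop twice would add the column to files that already contain it.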

Hope this gets you started!

edited Dec 19 '12 at 12:30

answered Dec 19 '12 at 11:36


Thank you for your suggestion. I will try it now and let you know. – Gongon Dec 19 '12 at 11:52
I tried this code, but my columns have different lengths, so the cbind command is not working. Any other suggestions? – Gongon Dec 19 '12 at 16:04
How meaningful is the data row if the added column doesn't match the length of the other columns? It would be helpful to better understand what the result is you actually want to achieve – vaettchen Dec 19 '12 at 20:31
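For the different-lengths problem raised in the comments, one possible approach is `merge` on the ID column. This is only a sketch under the assumption that rows in the species file and in the data frame are meant to be matched by `ID`, which the thread does not confirm; the four-row species file is a made-up subset of the question's data.

```r
# Species file with rows for IDs 1 to 4 only (made-up subset for illustration).
species.file <- data.frame(ID  = 1:4,
                           Wie = c(2, 4, 3, 5),
                           Rei = c(4, 3, 2, 5),
                           Wee = c(3, 4, 5, 2))

# Main data frame with all six IDs, as in the question.
Df <- data.frame(ID = 1:6, Tilia_americana = c(23, 21, 20, 19, 23, 22))

# Keep ID plus the species column and rename the latter to 'Common'.
common <- setNames(Df[, c("ID", "Tilia_americana")], c("ID", "Common"))

# all = TRUE keeps every ID from both sides; cells without a match become NA.
merged <- merge(species.file, common, by = "ID", all = TRUE)
merged
```

With `all = TRUE` the result has one row per ID found in either table. Use `all.x = TRUE` instead to keep only the IDs present in the species file.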
