How to extract a name from the contents of a text file?

Question

I have a text file that can be read as:

    file=read.table("C:\\data.txt", sep="")
    > class(file)
    [1] "data.frame"
    > head(file)
      name bat cat co ro
    1 face   2  16 25 96

I have many text files in a directory that can be listed as:

    dir <- list.files("C:\\datasets", "*.txt", full.names = TRUE)

The files are named like this:

    ds_ds_df_2011_ 25_96.txt

which corresponds to:

    ds_ds_df_2011_ co_ro.txt # co and ro change; the rest is the same in all files

where co is file$co and ro is file$ro.

What I need is to add the corresponding name from file$name to each file name, so that it becomes: ds_ds_df_2011_ co_ro_name.txt

Is this possible?

Possible duplicate of [How do I rename files using R?](http://stackoverflow.com/questions/10758965/how-do-i-rename-files-using-r) – Tensibai, Jul 22 '15 at 09:23
What I understood is that you want to rename `ds_ds_df_2011_ 25_96.txt` into `ds_ds_df_2011_ 25_96_face.txt`. – Tensibai, Jul 22 '15 at 09:38
Remove `.txt` from `dir`, append `_co_ro_name`, append `.txt`. – Roman Luštrik, Jul 22 '15 at 09:53
Not that much, it gives you clues on how to do it, using `file.rename`. You just have to get the subset of your df with the corresponding values ... – Tensibai, Jul 22 '15 at 09:54
It would appear I do not understand part of your question. Can you make it clearer what your inputs and desired output are? – Roman Luštrik, Jul 22 '15 at 09:57
I want to rename `ds_ds_df_2011_ 25_96.txt` into `ds_ds_df_2011_ 25_96_face.txt` based on `file$name`. – temor, Jul 22 '15 at 10:00
Read the first file, `ds_ds_df_2011_ 25_96`, find the name in `file$name` that corresponds to `25` (`file$co`), and add it to the file name. – temor, Jul 22 '15 at 10:06

Tensibai · Accepted Answer · 2015-07-22T11:39:17.160

Since you seem too lazy to try it yourself:

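library(stringr)

# dir holds the full paths from list.files(); df is the data frame read from data.txt
sapply(dir, function(x) {
  # Capture co and ro from the file name: val[2] is co, val[3] is ro
  val <- str_match(x, "ds_ds_df_2011_ (\\d+)_(\\d+)\\.txt$")
  # Strip the extension, append "_<name>", then put the extension back
  dest <- paste0(sub("\\.txt$", "", x), "_", df$name[df$co == val[2] & df$ro == val[3]], ".txt")
  file.rename(x, dest)
})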

What the dest line does:

The sub() call removes the .txt extension from the file name.
df$name[df$co==val[2] & df$ro==val[3]] gets the name from the data frame row where co and ro equal the values extracted from the file name just before.
paste0(...) glues together the start of the file name, an underscore, the name extracted from the df, and the .txt extension.
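
To make these steps concrete, here is a minimal sketch that walks a single file name through them. It swaps str_match() for base R's regexec()/regmatches() so that it runs without extra packages, and the two-row df and the path are made-up sample data, not the asker's real files.

```r
# Hypothetical sample data standing in for the data frame read from data.txt
df <- data.frame(name = c("face", "head"), co = c(25, 35), ro = c(96, 106),
                 stringsAsFactors = FALSE)
x <- "C:/datasets/ds_ds_df_2011_ 25_96.txt"

# Step 1: capture co and ro; val[1] is the full match, val[2] is co, val[3] is ro
val <- regmatches(x, regexec("ds_ds_df_2011_ ([0-9]+)_([0-9]+)\\.txt$", x))[[1]]

# Step 2: drop the extension
stem <- sub("\\.txt$", "", x)

# Step 3: look up the name; the captures are strings, so convert them explicitly
nm <- df$name[df$co == as.numeric(val[2]) & df$ro == as.numeric(val[3])]

# Step 4: glue stem, underscore, name and extension back together
dest <- paste0(stem, "_", nm, ".txt")
print(dest)
```

If no row matches, nm is empty and dest becomes an empty vector, so it is worth checking length(nm) == 1 before renaming anything.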

I used df instead of your original file. Generic advice: NEVER use a keyword as a variable name; it leads to problems.

Back up your files before running it.
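
One way to take that backup from within R is sketched below. The folder names are assumptions for illustration: a scratch folder under tempdir() stands in for C:\datasets, and "backup" is an arbitrary subfolder name.

```r
# Scratch folder standing in for C:\datasets (hypothetical sample data only)
src <- file.path(tempdir(), "datasets_demo")
dir.create(src, showWarnings = FALSE)
files <- file.path(src, c("ds_ds_df_2011_ 25_96.txt", "ds_ds_df_2011_ 35_106.txt"))
file.create(files)

# Copy every file into a backup folder before any renaming
backup <- file.path(src, "backup")
dir.create(backup, showWarnings = FALSE)
copied <- file.copy(files, backup, overwrite = TRUE)

# Stop here unless each file now exists in the backup folder
stopifnot(all(copied), all(file.exists(file.path(backup, basename(files)))))
```

file.copy() returns one logical value per file, so a failed copy halts the script before file.rename() is ever reached.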

ulfelder · Answer 2 · 2015-07-22T11:02:41.837

Here's a version that uses match() to look up the right name in the data frame (here named df) and doesn't require any packages. Note, though, that it assumes the order of file names in dir matches the row ordering in df.

df <- data.frame(name = c("face", "head"), bat = seq(2), cat = c(16, 26), co = c(25, 35), ro = c(96, 106))
dir <- c("ds_ds_df_2011_ 25_96.txt", "ds_ds_df_2011_ 35_106.txt")

sapply(dir, function(x) {
  sub("\\.", paste0("_", df$name[match(x, dir)], "."), x)
})

And here's the output from that:

        ds_ds_df_2011_ 25_96.txt        ds_ds_df_2011_ 35_106.txt 
 "ds_ds_df_2011_ 25_96_face.txt" "ds_ds_df_2011_ 35_106_head.txt"

edited Jul 22 '15 at 11:02

answered Jul 22 '15 at 10:25

The order thing is a problem because the order of file names in dir DOES match the row ordering in df. – temor Jul 22 '15 at 11:26
If it does match, then this should work. The problem would arise if the order did not match. Have you tried it? – ulfelder Jul 22 '15 at 11:42
Sorry. I wanted to say DOES NOT match the row ordering in df. – temor Jul 22 '15 at 11:57
Yeah, this is probably too brittle to be useful. The previous answer is more flexible and therefore better. – ulfelder Jul 22 '15 at 12:41
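
For the case raised in these comments, where the order of dir does not match the rows of df, one possible variation (not part of the answer above) is to keep match() but look rows up by a key built from co and ro rather than by position. This sketch assumes every file name ends in `_ <co>_<ro>.txt`; the sample data reuses the two rows from the answer in reversed order.

```r
# Same sample rows as in the answer, deliberately in a different order than dir
df <- data.frame(name = c("head", "face"), bat = c(1, 2), cat = c(26, 16),
                 co = c(35, 25), ro = c(106, 96), stringsAsFactors = FALSE)
dir <- c("ds_ds_df_2011_ 25_96.txt", "ds_ds_df_2011_ 35_106.txt")

# Key taken from each file name, e.g. "25_96"
file_key <- sub("^.*_ ([0-9]+_[0-9]+)\\.txt$", "\\1", dir)

# Matching key built from each row of the data frame
row_key <- paste(df$co, df$ro, sep = "_")

# Row index for every file, independent of ordering
idx <- match(file_key, row_key)
stopifnot(!anyNA(idx))  # refuse to continue if any file has no matching row

new_names <- paste0(sub("\\.txt$", "", dir), "_", df$name[idx], ".txt")
print(new_names)
```

The result can then be passed to file.rename(dir, new_names), which accepts vectors of equal length.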
