
I have a folder containing about 2000 CSV files whose names contain the characters '[' and ']', e.g. [Residential]20151001_0000_1.csv

I want to:

  • Remove '[' and ']' from the names so that a file name becomes:

    Residential_20151001_0000_1.csv

    and place the renamed files in a new folder.

  • Then read all the files from that new folder into one data frame (without a header), skipping the first row of each file.

  • Also extract 20151001 as a date (e.g. 2015-10-01) into a new vector, so that the result looks like:

    File Name Date

    Residential_20151001_0000_1.csv 2015-10-01

David Arenburg
Manoj Kumar
  • Yes Akrun, because they are all related to one big issue: convert the file names, store the files in a separate folder, and read them all into one data frame that has no header and skips the first line of each CSV while reading it. – Manoj Kumar Feb 22 '16 at 07:10
  • This sounds a lot like you want the SO community to do your work for you... – Buggy Feb 22 '16 at 07:11
  • Well guys, do not take it otherwise... If you can't help, at least stop making comments that I am wanting others to do my job. If I could have done it myself, I would never have posted my issues here on stackoverflow. I need help, and thanks very much for your thoughts otherwise, you may well keep it with you. – Manoj Kumar Feb 22 '16 at 07:15
  • You are taking some flak @ManojKumar because the questions you ask are mildly easy to google. I added an answer with how I would find the info you need. – Paul Hiemstra Feb 22 '16 at 07:16

2 Answers


This code answers your first question, albeit with a small change in logic. First, let's create a backup of all the CSV files whose names contain "[]" by copying them to another folder. For example, if your CSVs are in the directory "/Users/xxxx/Desktop/Sub", we will copy them into a subfolder named Backup.

Therefore,

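library(tools)  # provides file_path_sans_ext()
setwd("/Users/xxxx/Desktop/Sub")
dir.create("Backup")
# list.files() expects a regular expression, so match a literal ".csv" at the end of the name
files <- data.frame(file = list.files(path = ".", pattern = "\\.csv$"), stringsAsFactors = FALSE)
# file.copy() is vectorised, so no loop is needed
file.copy(from = file.path("/Users/xxxx/Desktop/Sub", files$file), to = "/Users/xxxx/Desktop/Sub/Backup")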

All the CSV files have now been copied to the folder Backup.

Now let's rename the files in your original working directory by removing the "[]". I have taken a slightly longer route by creating a data frame with the old names and the new names, to make things easier to follow.

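# Strip the ".csv" extension, clean up the brackets, then put the extension back
files$Name <- file_path_sans_ext(files$file)
files$Name <- gsub("\\[", "", files$Name)   # drop the opening bracket
files$Name <- gsub("\\]", "_", files$Name)  # closing bracket becomes an underscore
files$Name <- paste(files$Name, ".csv", sep = "")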

This data frame looks like:

files
  file                             Name
1 [Residential]20150928_0000_4.csv Residential_20150928_0000_4.csv
2 [Residential]20151001_0000_1.csv Residential_20151001_0000_1.csv
3 [Residential]20151101_0000_3.csv Residential_20151101_0000_3.csv
4 [Residential]20151121_0000_2.csv Residential_20151121_0000_2.csv
5 [Residential]20151231_0000_5.csv Residential_20151231_0000_5.csv
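
The bracket replacement used to build the Name column can be checked on a single name before any file is touched. A minimal sketch, assuming every name has the form [Label]rest.csv; the variable names old_name and new_name are illustrative and not part of the answer:

```r
old_name <- "[Residential]20151001_0000_1.csv"

# Drop the opening bracket, then turn the closing bracket into an underscore
new_name <- gsub("\\]", "_", gsub("\\[", "", old_name))

print(new_name)
```

For this input, new_name is "Residential_20151001_0000_1.csv". Because the brackets never touch the extension, removing and re-adding ".csv" is optional here.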

Now let's rename the files to remove the "[]". The idea here is to replace each name in file with the matching name in Name:

# file.rename() is vectorised: each old path is renamed to the new path in the same position
file.rename(from = file.path("/Users/xxxx/Desktop/Sub", files$file),
            to = file.path("/Users/xxxx/Desktop/Sub", files$Name))

Your files are now renamed. If you run list.files(path = ".", pattern = "\\.csv$") you will get the new names:

"Residential_20150928_0000_4.csv" 
"Residential_20151001_0000_1.csv" 
"Residential_20151101_0000_3.csv"
"Residential_20151121_0000_2.csv" 
"Residential_20151231_0000_5.csv"

Try it!
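
The question asked for the renamed files to be placed in a new folder rather than renamed in place. A minimal sketch of that variant, run in a throwaway directory under tempdir() so it is safe to try; the folder name "Renamed" and the stand-in file contents are assumptions made for illustration only:

```r
# Throwaway source folder and a hypothetical destination folder
src <- file.path(tempdir(), "Sub")
dst <- file.path(src, "Renamed")
dir.create(dst, recursive = TRUE, showWarnings = FALSE)

# Create two small stand-in files with bracketed names
old <- c("[Residential]20151001_0000_1.csv",
         "[Residential]20151101_0000_3.csv")
for (f in old) {
  writeLines(c("first line", "1,2,3"), file.path(src, f))
}

# Build the new names, then copy each file straight to its new name.
# With 'from' and 'to' of equal length, file.copy() pairs them up.
new <- gsub("\\]", "_", gsub("\\[", "", old))
ok <- file.copy(from = file.path(src, old),
                to = file.path(dst, new),
                overwrite = TRUE)

stopifnot(all(ok))
print(list.files(dst))
```

This leaves the originals untouched, so no separate backup step is needed.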

CuriousBeing

In order:

  • After googling "r replace part of string" I found: R - how to replace parts of variable strings within data frame. This should get you up and running for this issue.
  • For skipping the first line, read the documentation of read.csv. There you will find the skip argument.
  • For the dates, have a look at the strftime/strptime functions. Alternatively, have a look at lubridate.
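
The last two pointers can be sketched together. This is only an illustration under stated assumptions: the folder renamed_demo and the file contents are made up, every file is assumed to have the same number of columns, and as.Date() with a format string is used in place of calling strptime directly:

```r
# Throwaway folder with two stand-in files (hypothetical contents)
dir_new <- file.path(tempdir(), "renamed_demo")
dir.create(dir_new, showWarnings = FALSE)
writeLines(c("first line to skip", "1,10", "2,20"),
           file.path(dir_new, "Residential_20151001_0000_1.csv"))
writeLines(c("first line to skip", "3,30"),
           file.path(dir_new, "Residential_20151101_0000_3.csv"))

paths <- list.files(dir_new, pattern = "\\.csv$", full.names = TRUE)

# skip = 1 drops the first line of each file; header = FALSE treats the rest as data
all_data <- do.call(rbind, lapply(paths, read.csv, header = FALSE, skip = 1))

# Take the first run of eight digits in each name and parse it as a date
file_name <- basename(paths)
digits <- sub("^[^0-9]*([0-9]{8}).*$", "\\1", file_name)
dates <- data.frame(FileName = file_name,
                    Date = as.Date(digits, format = "%Y%m%d"),
                    stringsAsFactors = FALSE)

print(all_data)
print(dates)
```

With roughly 2000 files, building the list with lapply() and binding once at the end avoids growing the data frame inside a loop.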
Paul Hiemstra