I have .csv files in a directory (let's say C:/Dowloads). I can list all the files in that directory using list.files("path"), but I am unable to read a specified number of files using a for loop. That is, let's say I have 332 files and I want to read only files 1 to 10 or 5 to 10.

Here is an example:

files <- list.files("path")
files ## displays all the files.

Now for testing I did:

k <- files[1:10]
k
## here it displays the files from 1 to 10.

So I tried the same thing in a for loop, since I want to read the files one by one.

for(i in 1:length(k)){
  length(i) ## just tested the length 
}

But it gives NA, NULL, or 1.

Can anyone explain how I can read specified .csv files using a for loop or some other way?

gung - Reinstate Monica
Sumanth Sharma
  • `i` is just a single number, so the length of `i` should always be `1`. If you are getting something other than that (e.g., `NA`), you will need to post a [reproducible example](http://stackoverflow.com/q/5963269/1217536) for people to work with to figure out why. If you want to *read* files in a `for` loop, why don't you try `read.table(file=files[i])`? – gung - Reinstate Monica Jul 17 '16 at 15:27

3 Answers

list.files returns a character vector, that is, a vector of strings. Applied to the whole vector files, length returns the number of strings in it; applied to a range such as files[1:10], it returns the number of strings in that range; applied to a single element files[i], it returns 1. To get the number of characters in each element (each string) of the character vector, use nchar instead. So:

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)
print(files)  ## list all files in path

k<-files[1:10]
print(k)      ## list first 10 files in path

for(i in 1:length(k)) {  ## loop through the first 10 files
  print(k[i]) ## each file name
  print(nchar(k[i])) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",k[i]))  ## read each as a csv file
  ## process each df in turn here
}

Note that we have to prepend the path to the file name when calling read.csv, because list.files returns bare file names by default.
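As an aside on the path handling, here is a minimal sketch comparing three ways of building the full path. The temporary directory and the file names (`file_01.csv` and so on) are made up for illustration and are not from the question.

```r
# Hypothetical setup: three small csv files in a temporary directory
path.to.csv <- file.path(tempdir(), "csv_path_demo")
dir.create(path.to.csv, showWarnings = FALSE)
for (n in 1:3) {
  write.csv(data.frame(id = n, value = n * 10),
            file.path(path.to.csv, sprintf("file_%02d.csv", n)),
            row.names = FALSE)
}

files <- list.files(path.to.csv, pattern = "\\.csv$")  # bare file names only

pasted <- paste0(path.to.csv, "/", files)  # joining by hand
joined <- file.path(path.to.csv, files)    # file.path inserts the "/" for us
full   <- list.files(path.to.csv, pattern = "\\.csv$",
                     full.names = TRUE)    # let list.files prepend the directory

df <- read.csv(joined[1])                  # read the first file via its full path
```

Any of the three vectors can be passed to read.csv; file.path and full.names = TRUE simply save the manual paste.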

EDIT: I thought I'd add this as an alternative:

path.to.csv <- "/path/to/your/csv/files"
files<-list.files(path.to.csv)

for(iFile in files) {  ## loop through the files
  print(iFile) ## each file name
  print(nchar(iFile)) ## the number of characters in each file name
  df <- read.csv(paste0(path.to.csv,"/",iFile))  ## read each as a csv file
  ## process each df in turn here
}

Here, the for loop is over the collection (vector) of files so that iFile is the i-th file name.

Hope this helps.

aichao
  • `df` will get overwritten with each iteration – user20650 Jul 17 '16 at 17:04
  • @user20650: yes, indeed. I'm assuming that he is to process each file in turn when he said "read files one by one." I edited the post to reflect this. The answer is not meant to be an "end-all," just enough to hopefully answer his question. I have no idea how he wants to process the data in each of those files. – aichao Jul 17 '16 at 17:23
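If the data frames need to be kept rather than processed and discarded, one option is to store each one in a list slot. This is only a sketch under assumed names: the temporary directory, the file names, and the `results` list are hypothetical.

```r
# Hypothetical setup: four small csv files in a temporary directory
path.to.csv <- file.path(tempdir(), "csv_keep_demo")
dir.create(path.to.csv, showWarnings = FALSE)
for (n in 1:4) {
  write.csv(data.frame(id = n, value = n * 10),
            file.path(path.to.csv, sprintf("file_%02d.csv", n)),
            row.names = FALSE)
}

files <- list.files(path.to.csv, pattern = "\\.csv$")
k <- files[2:3]                        # any subset of the file names

results <- vector("list", length(k))   # one empty slot per selected file
names(results) <- k                    # label each slot with its file name
for (i in seq_along(k)) {
  # store each data frame instead of overwriting a single df
  results[[i]] <- read.csv(file.path(path.to.csv, k[i]))
}
```

Each file is then available afterwards as results[[1]], results[[2]], or by name, for example results[["file_02.csv"]].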

To read a specific number of files at a time you can subset your vector of files. First create a vector of your files, including a path:

f <- list.files("/dir/dir", full.names = TRUE, pattern = "\\.csv$")
# nb full.names = TRUE returns the full path to each file;
# the pattern is a regular expression matching names that end in .csv

Then, read each file to a separate list item (in this case, the first 10):

dl = lapply(f[1:10], read.csv)

Finally, have a look at list item 1:

head(dl[[1]])
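
The same subsetting covers the "5 to 10" case from the question. A minimal sketch, assuming a temporary directory of made-up, zero-padded file names (list.files sorts names alphabetically, so without padding a name like 10.csv would sort before 2.csv):

```r
# Hypothetical setup: twelve small csv files in a temporary directory
demo.dir <- file.path(tempdir(), "csv_subset_demo")
dir.create(demo.dir, showWarnings = FALSE)
for (n in 1:12) {
  write.csv(data.frame(id = n),
            file.path(demo.dir, sprintf("file_%02d.csv", n)),
            row.names = FALSE)
}

f <- list.files(demo.dir, full.names = TRUE, pattern = "\\.csv$")

dl <- lapply(f[5:10], read.csv)   # read files 5 to 10 only
names(dl) <- basename(f[5:10])    # label each list item with its file name

head(dl[["file_05.csv"]])         # look up a data frame by file name
```

Naming the list items is optional, but it makes it easier to tell which data frame came from which file.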
MikeRSpencer

Unfortunately there's no reproducible example to work with. Usually, when I have to do similar tasks, I do this:

files <- list.files(pattern = "\\.csv$")  # all .csv files in the current working directory
data <- vector("list", length(files))     # one slot per file, so results are kept
for (i in seq_along(files)) {
    data[[i]] <- read.csv(files[i], stringsAsFactors = FALSE)
}

Your code is not working because you're testing the length of the index i, not of the vector k. Hope this helps.
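
To make the index-versus-vector point concrete, here is a small sketch with made-up file names. It also prints explicitly, because a bare expression such as length(i) inside a for loop is not auto-printed.

```r
k <- c("a.csv", "bb.csv", "ccc.csv")   # hypothetical file names

length(k)            # number of elements in the vector: 3

lens <- integer(0)
for (i in seq_along(k)) {
  # i is a single integer, so length(i) is always 1
  lens <- c(lens, length(i))
  print(length(k[i]))                  # a single string also has length 1
}

nchar(k)             # characters in each file name: 5 6 7
```

seq_along(k) is used here in place of 1:length(k) because the latter yields c(1, 0) when k is empty, which would run the loop body twice with invalid indices.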

Eugen