
I have 500 .csv files with data that looks like:

sample data

I want to extract one cell (e.g. B4, which holds 0.477) from each csv file and combine those values into a single csv. What are some recommendations on how to do this easily?

  • What about reading one file at a time into a `data.frame`, accessing the required cell and storing it somewhere? – Bruno Zamengo Oct 02 '17 at 14:55
  • @BrunoZamengo there's no point reading the whole file – pogibas Oct 02 '17 at 14:55
  • see `?read.table`. The `skip` and `nrows` arguments will be useful. You could also use `scan`, which takes both of those arguments (`nlines` instead of `nrows`) and is a little more fine-grained. (A sketch of this approach follows these comments.) – lmo Oct 02 '17 at 14:58
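
For example, a minimal sketch of the `read.table` route from the last comment, assuming the target cell is always B4 (row 4, column 2); the directory and output paths are hypothetical:

# Sketch only: skip/nrows let read.table parse just the 4th row of each file,
# so the whole file never has to be loaded. Paths and cell position are assumptions.
files <- list.files("/path/to/csvfiles", pattern="\\.csv$", full.names=TRUE)
vals <- vapply(files, function(f) {
    row4 <- read.table(f, sep=",", skip=3, nrows=1, header=FALSE)  # read only row 4
    as.character(row4[[2]])                                        # take column B
}, character(1))
write.csv(data.frame(value=vals), "/path/to/output.csv", row.names=FALSE)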

2 Answers


You can try something like this:

all.fi <- list.files("/path/to/csvfiles", pattern="\\.csv$", full.names=TRUE)  # store full paths of the csv files as a character vector
library(readr)  # package for read_lines and write_lines
ans <- sapply(all.fi, function(i) {
    eachline <- read_lines(i, skip=3, n_max=1)  # skip the first 3 lines and read only the 4th
    unlist(strsplit(eachline, ","))[2]  # split the line on commas, then take the 2nd field (column B)
})
write_lines(ans, "/path/to/output.csv")
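
Note that `sapply` returns a named character vector here (one element per file), and `write_lines` writes each element on its own line, so the output file ends up with one extracted value per line.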
– CPak

I cannot add a comment, so I will write it here.

Since you have many files and it is tedious to load them individually, try this: Importing multiple .csv files into R. It covers the first part of your problem. For the second part, try this:

You can read each file into a data.frame (as suggested in @Bruno Zamengo's comment) and then use the select and merge functions in R: select the values you need, then combine them into a single csv file. I used this idea in my project. Do not forget to use lapply; a rough sketch follows.
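
A rough sketch of that idea, assuming (as above) that the target cell is row 4, column 2, with hypothetical paths; simple indexing stands in for `select`/`merge`, since only one cell is taken from each file:

# Hypothetical sketch: read every file into a data.frame with lapply,
# pull cell B4 (row 4, column 2) from each, and bind the results together.
# The directory path and the cell position are assumptions for illustration.
files <- list.files("/path/to/csvfiles", pattern="\\.csv$", full.names=TRUE)
dfs   <- lapply(files, read.csv, header=FALSE)                         # one data.frame per file
vals  <- vapply(dfs, function(d) as.character(d[4, 2]), character(1))  # extract cell B4
out   <- data.frame(file=basename(files), value=vals)
write.csv(out, "/path/to/output.csv", row.names=FALSE)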