I have a folder "data" containing 36 text files. Each file has more than 3000 columns and rows. I want to get the value at a specific column and row from every file and collect the values in a vector.

For example, take column 10 and row 10. I want a loop that reads the value at that column and row from each of the 36 text files in the folder "data". I'm new to R.

This is my code in MATLAB:

function data = readImage

data = [];
listImage = ls('*.hdf');

for i = 1:size(listImage,1)
    name = strtrim(listImage(i,:));
    citra = hdfread(name,'PIXEL DATA');
    result = point(citra); 
    data = [data; result];
end

end

And

function p = point(image)

p = [];

for i = 3941  %column number
    for j = 1595  %row number
        p = image(i,j);  % index the input argument; citra is not in scope here
    end
end

end

I have successfully imported the files:

setwd("D:/data")
temp <- list.files(pattern = "\\.txt$")
for (i in seq_along(temp)) assign(temp[i], read.table(temp[i]))
marc_s
  • Please share what you have tried so far. –  Nov 12 '15 at 08:03
  • I have edited my post – public_html Nov 12 '15 at 08:21
  • Does your R code successfully import your text files at each iteration? If not, check here for that part of your question: http://stackoverflow.com/questions/11433432/importing-multiple-csv-files-into-r . If you manage to do that, then you need a simple loop that iterates over all files and picks the element at position (10,10). – AntoniosK Nov 12 '15 at 11:16
  • Thank you, it works. Now I need a loop to get the value at a given column and row of each file as a vector, for example at position [10,10]. – public_html Nov 13 '15 at 03:07

1 Answer

If you want to grab a specific row and column from a collection of files, I recommend data.table::fread(). Its select argument picks the column, and skip together with nrows picks the rows, so only the part you need is read. Try the following to read only row 10, column 10 from each file -

library(data.table)
datalist <- lapply(temp, fread, select = 10, skip = 9, nrows = 1)

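If each file has a header row, change skip = 9 to skip = 10 so the header line is skipped too (alternatively, keep skip = 9 and add header = TRUE, which uses the line just before the target row as column names). Then you can name each element with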

names(datalist) <- paste0("temp", seq_along(datalist)) 

Now you've got a list with named elements that can be accessed with the $ operator by name. This is usually better than assigning them all to the global environment.
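
As a small illustration of why a named list is convenient, here is a minimal sketch that names the elements after their source files instead of temp1, temp2, and so on. The file names and values are invented for the example; they stand in for whatever list.files() and fread() return on your data.

```r
# Invented file names and values, standing in for list.files() and fread() results
temp <- c("jan.txt", "feb.txt", "mar.txt")
datalist <- list(3.2, 4.1, 5.0)

# Name each element after its file, minus the extension
names(datalist) <- tools::file_path_sans_ext(temp)

datalist$feb        # access one element by name
datalist[["mar"]]   # same idea, and also works with a name held in a variable
```

Naming by file makes it easier to trace a value back to where it came from than a positional name would.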

The list elements in datalist will be data tables. If you need single atomic vector elements then the following may be better -

datalist <- lapply(temp, function(x) fread(x, select = 10, skip = 9, nrows = 1)[[1L]])

With this you could use unlist(datalist) to flatten the list into a single atomic vector holding all the values (named, if you have set names on the list), should you not want them in a list.

Another thing to take into consideration is that if the files contain a column of row names, you'll need to compensate for that too, for example by increasing select by one. If you play around with the select and skip arguments it won't take long to get it right.
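
One way to keep those adjustments in one place is a small wrapper. The sketch below is only an illustration: read_cell() and its has_header and has_rownames arguments are hypothetical names, not part of data.table, and it assumes at most one header line and one leading row-name column.

```r
library(data.table)

# Hypothetical helper: read a single cell, adjusting for a header line
# and a leading row-name column when the file has them.
read_cell <- function(file, row, col, has_header = FALSE, has_rownames = FALSE) {
  fread(
    file,
    select = col + has_rownames,    # shift right past the row-name column
    skip   = row - 1 + has_header,  # lines to skip before the wanted row
    nrows  = 1,
    header = FALSE                  # treat the line we land on as data
  )[[1L]]
}

# Try it on a file that has both a header line and a row-name column
tmp <- tempfile(fileext = ".csv")
write.csv(iris, tmp, quote = FALSE)
x <- read_cell(tmp, row = 3, col = 2, has_header = TRUE, has_rownames = TRUE)
stopifnot(isTRUE(all.equal(x, iris[3, 2])))
```

For the files in the question it could then be applied with something like sapply(temp, read_cell, row = 10, col = 10), adjusting the two flags to match the file layout.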

For a full example of these methods, we can look at the following. Here we are grabbing row 3, column 2 from the iris dataset, three times.

## write iris to a csv file
write.csv(iris, file = "iris.csv", quote = FALSE, row.names = FALSE)

temp <- rep("iris.csv", 3)
datalist <- lapply(temp, function(x) fread(x, select = 2, skip = 3, nrows = 1)[[1L]])
names(datalist) <- paste0("temp", seq_along(datalist))

## results
datalist
# $temp1
# [1] 3.2
#
# $temp2
# [1] 3.2
#
# $temp3
# [1] 3.2
unlist(datalist)
# temp1 temp2 temp3 
#   3.2   3.2   3.2 

## compare to
iris[3, 2]
# [1] 3.2
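
If you would rather not depend on data.table, roughly the same idea can be sketched in base R with scan(), which also accepts skip and nlines. This is a swapped-in alternative, not the fread() approach above. It assumes whitespace-separated numeric files with no header and no row names, and the three small files are generated here only so the example runs on its own.

```r
# Build three small whitespace-separated files standing in for the real data
dir <- tempfile("data")
dir.create(dir)
for (k in 1:3) {
  m <- matrix(k * 100 + 1:20, nrow = 4, byrow = TRUE)
  write.table(m, file.path(dir, sprintf("file%02d.txt", k)),
              row.names = FALSE, col.names = FALSE)
}

files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
row <- 3
col <- 2

# Read only the wanted line of each file, then pick the wanted column
values <- vapply(files, function(f) {
  scan(f, skip = row - 1, nlines = 1, quiet = TRUE)[col]
}, numeric(1))
names(values) <- basename(files)
values
```

vapply() returns a numeric vector directly, so no unlist() step is needed, and it raises an error if any file yields something other than a single number.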
Rich Scriven