Extract data from text files using for loop

Question

I have 40 text files with names:

[1] "2006-03-31.txt" "2006-06-30.txt" "2006-09-30.txt" "2006-12-31.txt" "2007-03-31.txt"
[6] "2007-06-30.txt" "2007-09-30.txt" "2007-12-31.txt" "2008-03-31.txt" etc...

I need to extract one specific value from each file. I know how to do it individually, but this takes a while:

m_value1 <- `2006-03-31.txt`$Marknadsvarde_tot[1]
m_value2 <- `2006-06-30.txt`$Marknadsvarde_tot[1]
m_value3 <- `2006-09-30.txt`$Marknadsvarde_tot[1]
m_value4 <- `2006-12-31.txt`$Marknadsvarde_tot[1]

Can someone help me with a for loop which would extract the data from a specific column and row through all the different text files please?

`gsub("\\..*","", yourstring)`? [See this](https://stackoverflow.com/questions/10617702/remove-part-of-string-after) – simone, May 27 '17 at 15:57
Well, I need to get the data from the variable `Marknadsvarde_tot` within several text files, but I do not know how to loop through the different text files and then get the value. `Posttyp=2006-03-31 Kvartalsslut =58052 Institutnr_fondbolag=Nordea Fonder Marknadsvarde_tot=7896558077` – Lorenna Van Munnecom, May 27 '17 at 16:05

Bea · Accepted Answer · 2017-05-27T16:59:06.237

2

Assuming your files are all in the same folder, you can use list.files to get the names of all the files, then loop through them and get the value you need. So something like this?

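m_value <- character()  # or whatever the type of your variable is
# full.names = TRUE returns paths that read.table can open directly
filelist <- list.files(path = "...", pattern = "\\.txt$", full.names = TRUE)
for (i in seq_along(filelist)) {
  # index filelist here; an earlier version used myfile[i], which fails
  # with "object 'myfile' not found"
  df <- read.table(filelist[i], header = TRUE)
  m_value[i] <- df$Marknadsvarde_tot[1]
}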
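
The loop above can also be written without an explicit counter. This is only a minimal sketch: it assumes every file is whitespace-delimited with a header row containing Marknadsvarde_tot, and the temporary directory, the two sample files and the helper name first_value are made up for illustration.

```r
# Hypothetical sample data: two small files in a temporary directory
demo_dir <- file.path(tempdir(), "m_value_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Posttyp Marknadsvarde_tot", "2006-03-31 100", "2006-03-31 5"),
           file.path(demo_dir, "2006-03-31.txt"))
writeLines(c("Posttyp Marknadsvarde_tot", "2006-06-30 200"),
           file.path(demo_dir, "2006-06-30.txt"))

# full.names = TRUE returns paths that can be passed straight to read.table
filelist <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and keep row 1 of the column of interest
first_value <- function(path) {
  df <- read.table(path, header = TRUE)
  as.numeric(df$Marknadsvarde_tot[1])
}

# vapply() checks that every file yields exactly one numeric value
m_value <- vapply(filelist, first_value, numeric(1))

# Name the results after the files, without directory or extension
names(m_value) <- sub("\\.txt$", "", basename(filelist))
print(m_value)
```

Because the result is a named vector, a single quarter can then be looked up with, for example, m_value["2006-06-30"].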

EDIT:

If you have already imported all the data, you can use get:

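# "\\.txt$" is a regular expression matching names that end in .txt
txt_files <- list.files(pattern = "\\.txt$")
for (i in txt_files) {
  x <- read.delim(i, header = TRUE)
  assign(i, x)  # store each data frame under its file name
}

m_value <- character()
for (i in seq_along(txt_files)) {
  m_value[i] <- get(txt_files[i])$Marknadsvarde_tot[1]
}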
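
The same lookup can be done in one step with mget, which fetches several objects by name at once. This is a minimal sketch in which two made-up data frames stand in for the imported files:

```r
# Hypothetical stand-ins for data frames created earlier with assign()
txt_files <- c("2006-03-31.txt", "2006-06-30.txt")
assign("2006-03-31.txt", data.frame(Marknadsvarde_tot = c(100, 5)))
assign("2006-06-30.txt", data.frame(Marknadsvarde_tot = c(200, 7)))

# mget() returns a named list holding the objects with these names
frames <- mget(txt_files)

# Take row 1 of the column from each data frame; the names are kept
m_value <- sapply(frames, function(d) d$Marknadsvarde_tot[1])
print(m_value)
```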

edited May 27 '17 at 16:59

answered May 27 '17 at 16:07


Thank you @GyB. I have tried the for loop provided, but I get an error message: `Error in read.table(myfile[i], h = T) : object 'myfile' not found`. I understand it should be the name of the file the data are read from, but all txt files have already been imported into data frames and assigned their names using this code: `txt_files <- list.files(pattern = "*.txt"); for(i in txt_files) { x <- read.delim(i, header=TRUE); assign(i,x) }`. What should I replace `read.table(myFile[i], h=T)` with? – Lorenna Van Munnecom May 27 '17 at 16:44
Cool, I am glad :) – Bea May 27 '17 at 17:13
@GyB I realised there is a slight issue: the data are extracted correctly from the 1st to the 7th data frame, but not afterwards: `[1] 7896558077 6983744285 7306744576 8428045883 9298350108 10169081810 450 [8] 428 404 380 339 312 291 343 [15] 386 404 490 399 409 446 440 [22] 434 377 381 413 366 393 etc... ` Maybe it is because of the type: the first 7 are "numeric" and the rest "factor". Do you know how I can solve that? – Lorenna Van Munnecom May 27 '17 at 17:57
It looks like you are dealing with factors in some of the data frames. What type did you declare `m_value` as? – Bea May 27 '17 at 18:02
If it is not the case yet, you can declare m_value as character, and then force the new value into character: `m_value[i] <- as.character(get(txt_files[i])$Marknadsvarde_tot[1])` – Bea May 27 '17 at 18:06
Cool, I understand, so the values are correct: `[1] "7896558077" "6983744285" "7306744576" "8428045883" "9298350108" [6] "10169081810" "9815819673,00" "8655151786,00" "7376521992,00" "6892083842,00"` How should I proceed if I want to convert them to numeric in the for loop? From the 7th value onwards they are presented like `8655151786,00` – Lorenna Van Munnecom May 27 '17 at 18:25
To convert a factor into a numeric value you need to convert it to character first, so it would be `m_value[i] <- as.numeric(as.character(get(txt_files[i])$Marknadsvarde_tot[1]))`. And since it's probably the commas that caused the variable to be read as a factor, you will need to change the `,` into `.` using `gsub`, so it will be: `m_value[i] <- as.numeric(gsub(",",".",as.character(get(txt_files[i])$Marknadsvarde_tot[1])))` – Bea May 27 '17 at 19:41
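
The conversion described in the last comment can be tried in isolation. This is a minimal sketch using three of the values quoted in the comments above; the helper name to_number is made up for illustration.

```r
# Values as quoted in the comment thread: one plain, two with a decimal comma
raw <- factor(c("7896558077", "9815819673,00", "8655151786,00"))

# Hypothetical helper: factor -> character -> replace "," with "." -> numeric
to_number <- function(x) {
  as.numeric(sub(",", ".", as.character(x), fixed = TRUE))
}

m_value <- to_number(raw)
print(format(m_value, scientific = FALSE))

# For comparison: as.numeric() applied directly to a factor returns the
# internal level codes, not the values the labels spell out
print(as.numeric(raw))
```

Alternatively, read.delim accepts a dec argument, so passing dec = "," at import time may make the conversion unnecessary.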

score 1 · Answer 2 · answered May 27 '17 at 16:33

You could use the select parameter of fread from the data.table package for this:

library(data.table)
file.list <- list.files(pattern = '\\.txt$')
# header = TRUE so the column can be selected by name; nrows = 1 reads only the first data row
lapply(file.list, fread, select = 'Marknadsvarde_tot', nrows = 1, header = TRUE)

This will result in a list of data.tables (which are also data frames). If you just want a vector with all the values:

sapply(file.list, function(x) fread(x, select = 'Marknadsvarde_tot', nrows = 1, header = TRUE)[[1]])

Ajay Ohri · Answer 3 · 2017-05-27T17:35:11.917

0

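library(data.table)
temp <- list.files(pattern = "\\.txt$")
# read every file with fread and create one data.table per file in the global environment;
# make.names turns the file name (minus ".txt") into a syntactically valid object name
list2env(
  lapply(setNames(temp, make.names(sub("\\.txt$", "", temp))),
         fread), envir = .GlobalEnv)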

This adds data.table to an existing answer to "Importing multiple .csv files into R".

After you have read in all your files, you can get data from the data.tables using DT[i, j, by], where i is your row condition.

edited May 27 '17 at 17:35

answered May 27 '17 at 16:48

