Hi, somehow my loop is not working: it only keeps the result for the last file. Here's the code:

library(readxl)
library(readr)
library(plyr)
library(dplyr)

path = "C:/Users/benja/OneDrive/Studium/Bachelorarbeit/Ressourcen/Conference Calls/"
Enterprise = "ABB Ltd"

#Import Dictionary
Dictionary <- read_excel("C:/Users/benja/OneDrive/Studium/Bachelorarbeit/Ressourcen/LoughranMcDonald_MasterDictionary_2014.xlsx", 
                     sheet = "Tabelle1")
for (File in c("2016 Q1.xml","2016 Q2.xml","2016 Q3.xml","2016 Q4.txt"))
  {

  #Import Text
  ABB_2016_Q4 <- read_delim(paste0(path,Enterprise,"/",File), 
                        " ", escape_double = FALSE, col_names = FALSE, 
                        trim_ws = TRUE)

  #Reformat -> first transpose, then vector, lowercase, data frame
  ABB_2016_Q4 = data.frame(tolower(c(t(ABB_2016_Q4))))
  colnames(ABB_2016_Q4) = "Word"

  #Join text with dictionary
  Analyze_2016_Q4 = inner_join(Dictionary,ABB_2016_Q4)

  #Analysis
  Rating = sum(Analyze_2016_Q4$Rating)

}

If I try to test it with

 print(File)

it prints the expected file names, but the loop still does not work. Also, how can I save the result of each iteration? I want the Rating for each quarter displayed.

  • Difficult to say without a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) but it looks like your `for()` loop is overwriting the `Rating` object each iteration. Use the `for()` loop just to load the files; move everything else out – Phil May 15 '17 at 11:34
  • I tried to put examples in it, but you can imagine it as plain text files, which I change into a single column vector and then I inner join it with the Dictionary to "rate" it. I tried your solution, but it seems that it overwrites the data as well. – Benjamin Berger May 15 '17 at 12:29

2 Answers

Your loop is probably working, but at the moment it's not keeping anything: `Rating` is overwritten on every iteration, so only the value for the last file survives : )

You can, for instance, write each result to a list:

#initiate result list
allResults <- list()
#populate your filelist; depending on your directory, you can also use list.files()
files <- c("2016 Q1.xml","2016 Q2.xml","2016 Q3.xml","2016 Q4.txt")
#iterate through your files
for (i in seq_along(files))
  { #Import Text
    ABB_2016_Q4 <- read_delim(paste0(path,Enterprise,"/",files[i]), 
                        " ", escape_double = FALSE, col_names = FALSE, 
                        trim_ws = TRUE)

  #Reformat -> first transpose, then vector, lowercase, data frame
  ABB_2016_Q4 = data.frame(tolower(c(t(ABB_2016_Q4))))
  colnames(ABB_2016_Q4) = "Word"

  #Join text with dictionary
  Analyze_2016_Q4 = inner_join(Dictionary,ABB_2016_Q4)

  #Analyse & store results & add identifier:
  allResults[[i]] = data.frame(ID = paste0("Q",i), 
                               result =sum(Analyze_2016_Q4$Rating),
                               stringsAsFactors = FALSE)

}
 # flatten resultlist to a dataframe:
 allResultsDf <- do.call(rbind, allResults)
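
To see the store-per-iteration pattern without needing the original files, here is a minimal sketch on made-up data. The toy dictionary, the `texts` list and the use of base `merge()` in place of `inner_join()` are assumptions for illustration only, not the asker's actual data or setup.

```r
# Toy dictionary: one row per word with a numeric rating (made-up values)
Dictionary <- data.frame(Word = c("good", "bad", "risk"),
                         Rating = c(1, -1, -2),
                         stringsAsFactors = FALSE)

# Stand-ins for the four transcripts, already split into words
texts <- list(
  "2016 Q1" = c("Good", "good", "risk"),
  "2016 Q2" = c("bad", "weather"),
  "2016 Q3" = c("good"),
  "2016 Q4" = c("risk", "risk", "bad")
)

# One slot per text, filled inside the loop instead of overwriting one object
ratings <- numeric(length(texts))
names(ratings) <- names(texts)

for (i in seq_along(texts)) {
  words <- data.frame(Word = tolower(texts[[i]]), stringsAsFactors = FALSE)
  # merge() on the shared "Word" column acts as an inner join here
  matched <- merge(Dictionary, words, by = "Word")
  ratings[i] <- sum(matched$Rating)
}

print(ratings)
```

Because `ratings` is indexed by `i`, each iteration writes to its own slot, and the names carry over from `texts`.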
Janna Maas
  • Thanks a lot, Janna, with this it works really well! One last small follow-up question about the list: how can I show the name of each quarter next to its result? Right now the names are empty: allResults List of 4 : num 3 : num 3 : num 1 : num 9 – Benjamin Berger May 15 '17 at 15:28
  • See the edit. Of course, this hinges on whether your data files are always in the same order as your quarters. If you want a safer option, you could also extract the quarter name from the file name, for instance. – Janna Maas May 16 '17 at 08:44
  • Also do take @Phil's answer to heart because it's a more generalized way to deal with what you're probably doing and will save you time in the long run (because you won't have to hard-code all your filenames, for instance) – Janna Maas May 16 '17 at 08:44
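
The safer option mentioned in the comments, taking the label from the file name rather than from the loop index, could look like the sketch below. It assumes every file name follows the `<year> Q<n>.<extension>` pattern used in the question; the `results` vector reuses the four numbers quoted in the comment above purely as placeholder data.

```r
files <- c("2016 Q1.xml", "2016 Q2.xml", "2016 Q3.xml", "2016 Q4.txt")

# Drop the extension to get labels such as "2016 Q1"
labels <- tools::file_path_sans_ext(files)

# Or keep only the quarter part ("Q1" to "Q4")
quarters <- sub("^.*(Q[1-4])\\.[^.]+$", "\\1", files)

# Placeholder results, one per file, in the same order as `files`
results <- c(3, 3, 1, 9)

allResultsDf <- data.frame(ID = labels, Quarter = quarters, result = results,
                           stringsAsFactors = FALSE)
print(allResultsDf)
```

Inside the loop from the answer, the same idea would mean building `ID` from `files[i]` instead of from `i`, so the label stays correct even if the order of the files changes.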

It looks like you're loading one 'master' file, then loading lots of individual files and trying to join these to the master. If that's the case, I'd take a more functional approach rather than use a for() loop.

Some example data:

master <- data.frame(
  key = letters,
  stringsAsFactors = FALSE
)    

a <- data.frame(
  key = sample(letters, 13),
  dat = sample(1:100, 13),
  stringsAsFactors = FALSE
)

a$key
letters_reduced <- letters %in% a$key
letters_reduced <- letters[!letters_reduced]

b <- data.frame(
  key = sample(letters_reduced, 13),
  dat = sample(1:100, 13),
  stringsAsFactors = FALSE
)

readr::write_csv(a, "~/StackOverflow/BenjaminBerger/a.csv")
readr::write_csv(b, "~/StackOverflow/BenjaminBerger/b.csv")

So we have the master object in memory. To load in multiple files in R, assuming they're in the same directory, I'd use list.files() then iterate over the files with lapply() and read_csv():

files <- list.files("~/StackOverflow/BenjaminBerger", pattern = "\\.csv$",
                    full.names = TRUE)
df <- lapply(files, readr::read_csv)

You now have a list of data frames. There are many ways you could join these to your master object, but perhaps the simplest is to 'collapse' the list of data frames into one data frame, and do one join with this. This is as easy as:

df <- dplyr::bind_rows(df)
master <- dplyr::inner_join(master, df, by = "key")

Which gets you:

head(master)
#    key dat
#  1   a  38
#  2   b  52
#  3   c  59
#  4   d  77
#  5   e  34
#  6   f  93
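
Applied to the original question, the same list.files() idea might look like the sketch below, with one rating per file instead of one big join. It writes two made-up transcripts to a temporary directory so that it runs on its own. The toy dictionary, the file contents and the helper name `rate_file()` are hypothetical; base `scan()` and `merge()` stand in for `read_delim()` and `inner_join()`, and `vapply()` is used instead of `lapply()` so the result is a plain numeric vector.

```r
# Toy dictionary with made-up ratings
Dictionary <- data.frame(Word = c("growth", "loss", "strong"),
                         Rating = c(2, -3, 1),
                         stringsAsFactors = FALSE)

# Write two small example transcripts to a temporary directory
example_dir <- file.path(tempdir(), "calls_example")
dir.create(example_dir, showWarnings = FALSE)
writeLines("Strong growth and strong demand",
           file.path(example_dir, "2016 Q1.txt"))
writeLines("A loss despite growth",
           file.path(example_dir, "2016 Q2.txt"))

# Hypothetical helper: read one file, split on whitespace, sum matched ratings
rate_file <- function(file) {
  words <- tolower(scan(file, what = character(), quiet = TRUE))
  matched <- merge(Dictionary,
                   data.frame(Word = words, stringsAsFactors = FALSE),
                   by = "Word")
  sum(matched$Rating)
}

files <- list.files(example_dir, pattern = "\\.txt$", full.names = TRUE)

# One rating per file, named after the file without its extension
ratings <- vapply(files, rate_file, numeric(1))
names(ratings) <- tools::file_path_sans_ext(basename(files))
print(ratings)
```

Keeping the per-file work in one function means nothing is overwritten between files, and adding a new quarter only requires dropping another file into the directory.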
Phil
  • Thanks a lot for your help, Phil! With Janna's code it worked for my problem, so I did not have to try another one, but it is much appreciated!! – Benjamin Berger May 15 '17 at 15:35
  • No worries; do be sure to upvote/accept answers that are useful or solve your problem – Phil May 15 '17 at 15:45