I'm trying to replace every , (comma) with . (dot) in all of the text files in a specific folder using R. However, I don't want to enter the file path manually each time. Instead, I want to loop over all of the .TXT files in the folder, make the change in each one, and save it again under the same name in the same place.

Right now I have a problem with the writeLines function: I tried to set the path with a changeable variable, but this does not seem to work and results in the error message:

"Error in writeLines(tx2, path = listFiles[i]) : unused argument (path = listFiles[i])"

This is my current code draft:

folder_path <- "C:/Users/pathToMyFiles"
setwd(folder_path)
listFiles= list.files(path = "C:/Users/pathToMyFiles", pattern= "*.TXT",
           full.names = TRUE)

#print(listFiles)
#print(listFiles[1])

i=1
for (i in length(listFiles)) {
  tx  <- readLines(listFiles[i])
  tx2  <- gsub(pattern = ",", replace = ".", x = tx)
  writeLines(tx2, path = listFiles[i])
  i <- i + 1
}

Looking at the produced output, all of the steps in the code seem to work except the "writeLines" call.

I would be grateful if somebody knows a workaround for this.

All the best!

Norruas
  • Not R, but `sed -i -e "s/,/./g" path/to/files*` – r2evans Apr 02 '20 at 15:00
  • BTW: [`?writeLines`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/writeLines.html) shows that the argument is `con=`, not `path=`. – r2evans Apr 02 '20 at 15:01
  • @r2evans, yes you're right, thank you! I changed it to `con=` and there are no more errors, but it is still not changing anything in the files. – Norruas Apr 02 '20 at 16:18
  • @r2evans sed works great, but unfortunately I have to make this usable on Windows, without any extra installations of packages. I did read this on that topic: [https://stackoverflow.com/questions/127318/is-there-any-sed-like-utility-for-cmd-exe] – Norruas Apr 02 '20 at 16:29
  • Do you have RTools installed? `sed` is included with that. – r2evans Apr 02 '20 at 16:32
  • @r2evans, I installed RTools now, great to know about it for future uses, thank you! – Norruas Apr 02 '20 at 16:50
  • @oszkar Ah, good to know, thank you! – Norruas Apr 02 '20 at 16:50
  • @Norruas Sorry, I moved the comment (and added some more explanation) to the answer. – oszkar Apr 02 '20 at 16:52
  • I know it's not the R-based solution you were asking for, but if the files are large, then it will be *much* faster to use `system('sed -i -e "s/,/./g" path/to/*.txt')` than it will be to `readLines`, `gsub`, and `writeLines` in R. Just a suggestion. – r2evans Apr 02 '20 at 17:04

1 Answer

Cleaning up your solution to work:

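setwd('C:/Users/pathToMyFiles')

# pattern is a regular expression, not a glob; ignore.case also matches .TXT
text_file_list <- list.files(pattern='\\.txt$', ignore.case=TRUE)
for (text_file in text_file_list) {
  text_from_file <- readLines(text_file)
  # replace every comma with a dot (fixed=TRUE: literal match, no regex)
  modified_text <- gsub(',', '.', text_from_file, fixed=TRUE)
  writeLines(modified_text, text_file)
}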

And a loop-free solution using the pipe (without changing directory this time):

library(magrittr)

# pattern is a regular expression, not a glob; ignore.case also matches .TXT
{text_file_list <- list.files(path='C:/Users/pathToMyFiles',
                              pattern='\\.txt$',
                              ignore.case=TRUE,
                              full.names=TRUE)} %>%
  lapply(readLines) %>%
  lapply(function(x) gsub(',', '.', x, fixed=TRUE)) %>%
  {mapply(function(x, y) writeLines(x, y), ., text_file_list)}
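
Since the question asks for something that works without installing extra packages, the same loop-free idea can also be sketched in base R alone. This is only an illustrative sketch: it runs in a temporary folder with a made-up file (`values.txt`), and the names `demo_dir` and `files` are assumptions, not part of the original post.

```r
# Made-up demo file in a temporary folder, so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "lapply_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("0,5", "12,75"), file.path(demo_dir, "values.txt"))

# full.names = TRUE returns complete paths, so no setwd() is needed.
files <- list.files(demo_dir, pattern = "\\.txt$", ignore.case = TRUE,
                    full.names = TRUE)

# Read, replace and write back each file; invisible() hides the list of NULLs
# that lapply() would otherwise print.
invisible(lapply(files, function(f) {
  writeLines(gsub(",", ".", readLines(f), fixed = TRUE), con = f)
}))

print(readLines(file.path(demo_dir, "values.txt")))
```

To apply it to real data, `demo_dir` would be replaced by the actual folder and the line that creates the demo file dropped.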

Some comments on your code:

  1. After setwd() you do not need the path and full.names arguments in list.files(). By the way, it is good practice not to change directories from code; in that case you do, of course, have to use those arguments (as @r2evans pointed out).
  2. You don't need i = 1 and i <- i + 1 for the for loop; the loop sets i itself.
  3. And where you had the bug: you have to use i in 1:length(listFiles) (or, more safely, i in seq_along(listFiles)). Written as i in length(listFiles), the loop runs only once, so only the last file in the list is changed.
  4. All the other changes are just cosmetic.
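
The points about the loop index and the `con=` argument can be tried out in a self-contained way. The following is a minimal sketch, not the asker's actual setup: it works in a temporary folder with two made-up files (`a.TXT`, `b.TXT`), and `replace_commas` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical demo folder with made-up files, so the sketch runs anywhere.
demo_dir <- file.path(tempdir(), "comma_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("1,5;2,25", "3,0;4,75"), file.path(demo_dir, "a.TXT"))
writeLines("10,1;20,2", file.path(demo_dir, "b.TXT"))

# Hypothetical helper: replace every comma with a dot in one file, in place.
replace_commas <- function(path) {
  lines <- readLines(path)
  # fixed = TRUE treats "," as a literal string rather than a regex;
  # the target file is passed as con=, not path=.
  writeLines(gsub(",", ".", lines, fixed = TRUE), con = path)
}

# The pattern is a regular expression; ignore.case matches .txt and .TXT alike.
files <- list.files(demo_dir, pattern = "\\.txt$", ignore.case = TRUE,
                    full.names = TRUE)

# seq_along(files) visits every index. length(files) on its own is a single
# number, so looping over it would run the body once, for the last file only.
for (i in seq_along(files)) {
  replace_commas(files[i])
}

print(readLines(file.path(demo_dir, "a.TXT")))
print(readLines(file.path(demo_dir, "b.TXT")))
```

`seq_along()` is also safe when the folder contains no matching files: it yields an empty sequence, whereas `1:length(files)` would count down from 1 to 0.
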
oszkar
  • The use of `setwd` should make no difference, and it is not good practice. It's much better to use `list.files(..., full.names=TRUE)` and use the full path than force the script to change working directory whenever you want to do this. It's not that it doesn't work, but all too many times in my experience a script or function or something that has to do this either (a) does not change back to the original directory, breaking everything that operates after it, or (b) I want to do this in a different subdir and have to `setwd` to other places with different patterns. – r2evans Apr 02 '20 at 17:02
  • Good point, but `setwd()` was already there in the post. Another way to overcome this is to add `orig_wd <- getwd()` prior to the code chunk and `setwd(orig_wd)` after it. But yes, it is safer practice not to change directories (the code can fail before it reaches the `setwd(orig_wd)` part, causing mysterious errors later, etc.). – oszkar Apr 02 '20 at 17:06
  • Thank you for your additional comments! I implemented this bit of code in an already existing script, and there they change the working directory, which is why I wanted to keep that bit. But now I know that it isn't good practice, thanks again! – Norruas Apr 02 '20 at 17:58
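
If an existing script really has to change the working directory, one way to address the failure case mentioned in the comments (the code stopping before it reaches `setwd(orig_wd)`) is base R's `on.exit()`. The sketch below is an assumption-laden illustration: `run_in_dir` is a hypothetical helper name, and the failing task is made up to show the restore happening after an error.

```r
# Hypothetical helper: run `task` inside `dir`, then always return to the
# directory we started from.
run_in_dir <- function(dir, task) {
  orig_wd <- setwd(dir)                # setwd() returns the previous directory
  on.exit(setwd(orig_wd), add = TRUE)  # runs even if task() throws an error
  task()
}

start_dir <- getwd()

# The task fails on purpose; try() keeps the demo running past the error.
result <- try(run_in_dir(tempdir(), function() stop("something went wrong")),
              silent = TRUE)

# Check that the working directory is back where it started.
print(identical(getwd(), start_dir))
```

Using full paths from `list.files(..., full.names=TRUE)` still avoids the issue entirely; this pattern is only a fallback for code that must call `setwd()`.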