My R file contains one data frame. The data frame has 12838 rows and 51 columns.

I am attempting to write each row in one of the columns to separate .txt files.

In addition, I am attempting to name each file using another column from the data frame.

The code I am using right now produces only 587 .txt files (with the right names) before it stops with two errors.

I am trying:

for (i in 1:nrow(r_Data)) {
  content <- as.character(r_Data$fulltext[i])
  filename <- as.character(r_Data$`Style of Cause`[i])
  write(content, file = paste0(filename, ".txt"))
}

I get two errors:

  1. First error:

    Error in file(file, ifelse(append, "a", "w")) : 
    cannot open the connection
    
  2. Second error:

    In addition: Warning message:
    In file(file, ifelse(append, "a", "w")) :
    cannot open file 'City of Toronto Economic Development Corporation v. Information and Privacy Commissioner/ Ontario.txt': No such file or directory
    

More Context:

I do not understand these errors, especially the second one: the file it names comes from the 12,205th row in my data frame, yet the code has only output 587 files.

Further, the error seems to be indicating the file/directory is not there ... but the code is able to access 587 rows already ...
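
One possible explanation for the gap between 587 files and row 12,205, which nothing in this thread confirms, is that repeated `Style of Cause` values overwrite the same file, so the file count tracks distinct names rather than rows processed. The slash matters because it is read as a path separator, so R looks for a directory that does not exist. A minimal diagnostic sketch, using a made-up vector `style` in place of the real column:

```r
# Toy stand-in for r_Data$`Style of Cause`; these values are invented.
style <- c("A v. B", "A v. B", "C v. D", "E v. F/ Ontario", "G v. H")

# Index of the first name containing a slash. write() would stop here,
# because "E v. F/" is treated as a directory that does not exist.
first_bad <- which(grepl("/", style, fixed = TRUE))[1]

# Number of distinct names before that point. Repeated names overwrite
# the same file, so this can be much smaller than first_bad - 1.
files_written <- length(unique(style[seq_len(first_bad - 1)]))

first_bad
files_written
```

Running the same two checks on the real column would show whether duplicates account for the difference.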

Any help would be greatly appreciated.

P.S. If my Stack Overflow question-asking etiquette/syntax is wrong, I kindly ask you to correct me rather than downvoting my question, and I will implement the corrections!

bmogil
  • (1) Instead of iterative `write`s, perhaps you can do a simpler `Map(writeLines, as.character(r_Data$fulltext), paste0(filenames, ".txt"))`, where `filenames` is the whole name column, no explicit `for` loop required. Otherwise, `cannot open the connection` can be many things: the file is opened and locked by another process, the parent directory does not exist, you don't have write permissions to the file or the directory, or perhaps other things. Having said that ... (2) it appears that your filename has a slash `/` in it. This can lead to problems on some OSes. Remove the slash and try again. – r2evans Jun 29 '23 at 18:58
  • `Error in file(con, "w") : cannot open the connection` In addition: `Warning message: In file(con, "w") : cannot open file 'City of Toronto Economic Development Corporation v. Information and Privacy Commissioner/ Ontario.txt': No such file or directory` – bmogil Jun 29 '23 at 19:03
  • I see the `/` is still there. Do you intend to have a directory `City ... Commissioner` and a file named `Ontario.txt`? If so, did you `dir.create("City...Commissioner")` first? – r2evans Jun 29 '23 at 19:10
  • Yes, sorry. I am working on getting rid of that right now. As for your question, no I just intended to have a text file named "City of Toronto Economic Development Corporation v. Information and Privacy Commissioner/ Ontario.txt" ... all of the text files are supposed to go to the same directory. – bmogil Jun 29 '23 at 19:15
  • What happens when you remove the `/` from that filename? `gsub("/", "_", filename)` is one such way – r2evans Jun 29 '23 at 19:16
  • Thank you! Okay, now I am receiving: `Error in file(file, ifelse(append, "a", "w")) : cannot open the connection` In addition: `Warning message: In file(file, ifelse(append, "a", "w")) : cannot open file 'Réno-Dépt Inc. v. Wonderland Commercial Centre Inc..txt': Invalid argument` – bmogil Jun 29 '23 at 19:25
  • So, same error, for a different row? Maybe I just need to apply some mass regexes to fix all of the characters that are giving me problems? If so, your help with identifying which characters may give me problems would be very helpful. – bmogil Jun 29 '23 at 19:26
  • Oh, and by the way, now the code is producing 676 text files. Improvement :) – bmogil Jun 29 '23 at 19:26
  • Sounds like it's getting stuck when the file name contains a special character forbidden in your operating system. It's probably best to clean that column before running your loop. Something like `gsub("[^[:alnum:] ]", "", str)` – Dan Adams Jun 29 '23 at 19:37
  • Could you possibly explain what exactly that regex is doing? Only because the integrity of my data is vital, so any changes would have to be approved by me before I implement them. – bmogil Jun 29 '23 at 19:42
  • Yes, it might be useful to generalize that. https://stackoverflow.com/a/56595128/3358272 suggests the use of `stringi::stri_trans_general`. For instance, taking a portion of your next failing filename, we can see that `stringi::stri_trans_general("Dép\u00f4t Inc", "Latin-ASCII")` returns `"Depot Inc"` (no accents). Perhaps in total: `filename <- gsub("/", "_", stringi::stri_trans_general(as.character(r_Data$\`Style of Cause\`[i]), id="Latin-ASCII"))` works. – r2evans Jun 29 '23 at 19:44
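
To pull the suggestions in the comments together, here is a rough sketch rather than a tested answer. The regex `[^[:alnum:] ]` proposed above matches every character that is not a letter, digit, or space, so `gsub` with it would also delete periods, hyphens, and commas. The helper below is more conservative and only replaces characters that commonly fail in file names. `sanitize_filename`, the toy `r_Data`, and the `cases` output folder are all hypothetical names introduced for illustration, and the list of forbidden characters assumes Windows rules.

```r
# Hypothetical helper (not from the thread): make strings safer as file names.
sanitize_filename <- function(x) {
  x <- as.character(x)
  # Optionally strip accents, as suggested in the comments (needs stringi).
  if (requireNamespace("stringi", quietly = TRUE)) {
    x <- stringi::stri_trans_general(x, id = "Latin-ASCII")
  }
  # Replace characters Windows forbids in file names, plus control characters.
  x <- gsub('[<>:"/\\\\|?*[:cntrl:]]', "_", x)
  # Drop trailing dots and spaces, which can also be rejected.
  x <- sub("[. ]+$", "", x)
  # Add a numeric suffix to repeated names so rows do not overwrite each other.
  make.unique(x, sep = "_")
}

# Toy stand-in for the real data frame; values are invented.
r_Data <- data.frame(
  `Style of Cause` = c("A v. B/ Ontario", "A v. B/ Ontario", "C Inc. v. D Inc."),
  fulltext = c("first text", "second text", "third text"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write into a scratch folder so the example leaves the working directory alone.
out_dir <- file.path(tempdir(), "cases")
dir.create(out_dir, showWarnings = FALSE)

safe_names <- sanitize_filename(r_Data$`Style of Cause`)
for (i in seq_len(nrow(r_Data))) {
  writeLines(r_Data$fulltext[i], file.path(out_dir, paste0(safe_names[i], ".txt")))
}
```

Because the cleaned names live in a separate vector, the original column is left untouched, and `data.frame(original = r_Data$`Style of Cause`, file = safe_names)` gives a mapping that can be reviewed before any files are written.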
