I am reading county geojson files provided here into R Studio (R 3.1, Windows 8) for each of the states. I am listing the files with R's list.files() function.

The state PR has many counties with accented (Spanish) names, e.g. Bayamón.geo.json and Añasco.geo.json. For these, list.files() returns mangled file names such as An~asco.geo.json and Bayamo´n.geo.json.

When I then try to read the actual files using the names returned above, I get an error that these files don't exist.

I was using system default encoding ISO-8859-1 and also tried changing it to UTF-8, but no luck.

Please help me solve this issue. How can I read files with accented filenames?

Shekhar Sahu
  • What exactly does the code you tried look like? Does the code work on non-accented file names? Are you running windows? A [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) would be helpful. – MrFlick Jun 05 '17 at 14:02
  • @MrFlick I think that the OP listed his entire code. It is `list.files()`. To reproduce the problem, you need to create a file whose name contains an accent mark. I simply created a new text file and named it `Bayamón.txt`, and I get the poster's bad result. BTW `dir()` has the same problem. – G5W Jun 05 '17 at 14:10
  • @G5W what operating system and R version? What encoding did you use to get the accented character? – MrFlick Jun 05 '17 at 14:11
  • I am doing this under Windows, R version 3.2.2. I cut the name `Bayamón` from the post and pasted it as the file name. I think that means UTF-8 encoding. BTW, in the Windows Explorer window, the name shows up correctly. – G5W Jun 05 '17 at 14:15
  • Also, OP mentioned R Studio. I am using RGui. – G5W Jun 05 '17 at 14:17
  • I flippin' hate encodings. https://stackoverflow.com/questions/24354375/manipulating-files-with-non-english-names-in-r Try changing the locale. – Roman Luštrik Jun 05 '17 at 14:23
  • @RomanLuštrik I had already tried `Sys.setlocale(category = "LC_ALL", locale="Spanish")` but it did not solve the problem. – G5W Jun 05 '17 at 16:07
  • @MrFlick I am using R Studio with R 3.1 on Windows 8 machine. If you download the files from the link I mentioned, try to list the files of PR folder. It will reproduce the error. – Shekhar Sahu Jun 06 '17 at 09:54
  • I was using system default ISO-8859-1 and also tried changing it to UTF-8, but no luck. – Shekhar Sahu Jun 06 '17 at 09:56
  • This problem is still unsolved. – Shekhar Sahu Jun 19 '17 at 13:47

2 Answers


I had the same problem, and I guess it happened because the default system language on my computer was different from the language of the filenames I wanted to convert (e.g. system language = English, filenames written in French). In the end, the code below let me rename the files.

FILENAME_OLD is the full path for original files e.g. "C:/directory/file.wav"

FILENAME_NEW is the full path for new filenames e.g. "C:/directory/file_new.wav"

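######### change filenames with non-English characters
path <- "C:/directory"

# Sys.glob() returns the full path of every file in the folder
test_old <- Sys.glob(file.path(path, "*"))

# replace FILENAME_OLD and FILENAME_NEW with the actual old and new paths;
# fixed = TRUE matches the old path literally instead of as a regular expression
test_new <- gsub("FILENAME_OLD",
                 "FILENAME_NEW", test_old, fixed = TRUE)

file.rename(test_old, test_new)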
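
As a concrete run of the same pattern, here is a minimal sketch that can be tried without touching real data. The temporary folder and the names file.wav and file_new.wav are hypothetical stand-ins for your own files, and the ASCII names are used only so the sketch runs on any machine.

```r
# Throwaway folder standing in for "C:/directory" (hypothetical demo data).
path <- tempfile("rename_demo")
dir.create(path)
file.create(file.path(path, "file.wav"))

# Sys.glob() returns the full path of everything matching the wildcard.
old_paths <- Sys.glob(file.path(path, "*"))

# Build the new names from the old ones; fixed = TRUE treats "file.wav"
# as a literal string rather than a regular expression.
new_names <- gsub("file.wav", "file_new.wav", basename(old_paths), fixed = TRUE)
new_paths <- file.path(dirname(old_paths), new_names)

# file.rename() returns TRUE for each file it managed to rename.
renamed <- file.rename(old_paths, new_paths)
print(renamed)
```

Checking the logical vector returned by file.rename() is a cheap way to spot files that could not be renamed.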
Jay

Solution 1

Use Sys.glob() instead of list.files()

For your example, if USA is your working directory, you can run Sys.glob(paths="./PR/*") to obtain a complete list, with accents, of the files in the "PR" folder.

To list the files in every folder under the working directory, you can run:

Sys.glob(paths=paste0(list.dirs(),"/*"))

In this code, list.dirs() returns the working directory and all of its subfolders (it is recursive by default). paste0(list.dirs(),"/*") simply appends "/*" to every folder path, so Sys.glob() lists the contents of every folder and subfolder.
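
To tie this back to the reading step in the question, here is a minimal sketch of listing files with Sys.glob() and then opening exactly the paths it returned. The temporary USA/PR folder and the ASCII file names are hypothetical stand-ins so the sketch runs anywhere; on the affected machine the names would be the accented ones.

```r
# Hypothetical stand-in for the USA/PR folder from the question.
demo_root <- tempfile("USA")
dir.create(file.path(demo_root, "PR"), recursive = TRUE)
file.create(file.path(demo_root, "PR",
                      c("Anasco.geo.json", "Bayamon.geo.json", "readme.txt")))

# The wildcard keeps only the geojson files directly inside PR.
pr_files <- Sys.glob(file.path(demo_root, "PR", "*.geo.json"))

# Confirm that every returned path really exists before reading it.
print(file.exists(pr_files))

# Read each file using the path exactly as Sys.glob() returned it.
contents <- lapply(pr_files, readLines, warn = FALSE)
names(contents) <- basename(pr_files)
```

The file.exists() check is the quickest way to see whether a listing function is returning usable names: if it prints FALSE for any entry, reading that file will fail.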

Solution 2

If the folder names themselves contain accents, Solution 1 will NOT work. In that case I would recommend the fs package, whose dir_ls() function should work. Install it with install.packages("fs") and load it with library(fs); then the following code should work:

dir_ls(recurse=TRUE)

The recurse=TRUE option allows you to list files in subfolders.
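
A slightly fuller sketch of the same call is shown here, restricted to the geojson files. It assumes the fs package is installed (the guard skips the demo otherwise), and the temporary USA/PR folder with ASCII file names is a hypothetical stand-in for the real data.

```r
# Run the demo only if fs is available; nothing is installed automatically.
if (requireNamespace("fs", quietly = TRUE)) {
  # Hypothetical stand-in for the USA folder with one state subfolder.
  root <- tempfile("USA")
  dir.create(file.path(root, "PR"), recursive = TRUE)
  file.create(file.path(root, "PR",
                        c("Anasco.geo.json", "Bayamon.geo.json", "readme.txt")))

  # recurse = TRUE descends into subfolders; glob keeps only matching paths.
  geo_files <- fs::dir_ls(root, recurse = TRUE, glob = "*.geo.json")
  print(basename(geo_files))
}
```

dir_ls() returns full paths relative to the folder you pass in, so the result can be handed directly to whatever function reads the files.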

Documentation for the fs package :

https://cran.r-project.org/web/packages/fs/vignettes/function-comparisons.html

https://fs.r-lib.org/

Documentation on the dir_ls function : https://www.rdocumentation.org/packages/fs/versions/1.5.0/topics/dir_ls

Dr_Ruben