I wanted to read in a .dta file in R in order to convert it to a .csv file. First, I tried to do so by using the foreign package, but it reported:

Error in read.dta(file): not a Stata version 5-12 .dta file

So I tried the haven package instead, but that also failed and reported:

Error in df_parse_dta_file(spec, encoding, cols_skip, n_max, skip, name_repair = .name_repair) : Failed to parse C:/Users/folder/data.dta: This version of the file format is not supported

I also tried to convert it with the rio package: `install.packages("rio"); library(rio); install_formats(); convert("file.dta", "file.csv")`

but it reported:

Error in arg_reconcile(haven::read_dta, file = file, ..., .docall = TRUE, : Failed to parse C:/Users/folder/data.dta: This version of the file format is not supported. This error was generated by: haven::read_dta With the following arguments: "._costs.dta"

Does anyone know how to import such .dta files into R so that they can be converted to .csv files?

PS: The preamble of the .dta-file looks like this:

<stata_dta>118LSFM 23 Apr 2019 16:22

stats19
  • did you try with `rio`? https://cran.r-project.org/web/packages/rio/vignettes/rio.html – Dan Chaltiel Mar 03 '21 at 11:38
  • Yes, I tried and it did not work. I have edited my question to include the report that appeared. – stats19 Mar 03 '21 at 13:27
  • A Stata .dta file always starts with a preamble like `118LSF`. Telling us that you see something similar will confirm that it really is a .dta file. Telling us what release produced a .dta file would put some precision on the question.
    – Nick Cox Mar 03 '21 at 14:10
  • I have added the preamble to my question. It goes like this:
    118LSFM23 Apr 2019 16:22
    – stats19 Mar 03 '21 at 15:18
  • Good news: it's what you think it is. Bad news: it is so recent a format that the R routines you've tried are not up-to-date. So, where does it come from, and can't the provider supply an older version of the .dta file (use `saveold`) or a .csv (use `export delimited`)? – Nick Cox Mar 03 '21 at 15:38
  • Ok, so I'm a bit confused right now. I tried to open the files in Stata, but it reported that they are not in Stata format. But the preamble implies that they should be. – stats19 Mar 03 '21 at 19:03
  • That could happen if the Stata you used to try to read the files is not up-to-date. Stata 16.1 for example can read all versions from 1.0 on, but you wouldn't expect that Stata 1.0 can read any later versions. For more detail that you can read regardless of what Stata is accessible to you, see https://www.stata.com/help.cgi?dta – Nick Cox Mar 03 '21 at 19:16
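
For anyone who wants to check the preamble from R rather than in a text editor, here is a minimal sketch in base R. It assumes the XML-like header that newer .dta formats use (`<stata_dta><header><release>118</release>...`). The function name `dta_release` is made up for illustration and is not part of any package.

```r
# Hypothetical helper: report the release number found in a .dta header.
# Assumes the XML-like header of newer formats; older binary headers
# do not contain a <release> tag and give NA here.
dta_release <- function(path, n = 64L) {
  # Read only the first n bytes; the release tag sits at the very start.
  bytes <- readBin(path, what = "raw", n = n)
  # Keep printable ASCII only, so stray binary bytes cannot break the string.
  ints <- as.integer(bytes)
  header <- rawToChar(bytes[ints >= 32L & ints <= 126L])
  # Pull the digits out of <release>...</release>, if present.
  m <- regmatches(header, regexec("<release>([0-9]+)</release>", header))[[1]]
  if (length(m) == 2) as.integer(m[2]) else NA_integer_
}

# Example call (path taken from the error message in the question):
# dta_release("C:/Users/folder/data.dta")
```

An `NA` result means the first bytes did not contain a `<release>` tag, so the file is either an older .dta format or not a .dta file at all.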

1 Answer

Try adding `encoding = "UTF-8"` or `encoding = "Latin1"` inside the `read_dta()` call to tell R which character encoding to use when importing the data. It might take a little while to clean the data afterwards, though :(
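
A rough sketch of what that could look like, assuming the haven package is installed. The wrapper name `dta_to_csv` is made up for illustration, and the `as_factor()` step is my own addition for keeping value labels instead of numeric codes in the CSV. As far as I know, haven's documentation says `encoding` is generally only needed for Stata 13 files and earlier, so this may not be enough to get past a "format is not supported" error.

```r
library(haven)

# Hypothetical wrapper: read a .dta file with an explicit encoding,
# then write it out as a .csv file.
dta_to_csv <- function(dta_path, csv_path, encoding = NULL) {
  df <- read_dta(dta_path, encoding = encoding)
  # Assumption: labelled columns should appear as their labels in the CSV,
  # so convert them to factors first (columns without labels are unchanged).
  df <- as_factor(df)
  write.csv(df, csv_path, row.names = FALSE)
  invisible(df)
}

# Example call, using the file names from the question:
# dta_to_csv("file.dta", "file.csv", encoding = "UTF-8")
```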