56

Is there a way to read a Stata version 13 dataset file in R?

I have tried to do the following:

> library(foreign)
> data = read.dta("TEAdataSTATA.dta") 

However, I got an error:

Error in read.dta("TEAdataSTATA.dta") :
not a Stata version 5-12 .dta file

Could someone point out if there is a way to fix this?

kolonel
  • 5
    Not within the `foreign` package. `?read.dta`: " Frozen: will not support Stata formats after 12". – Ben Bolker May 27 '14 at 21:15
  • @BenBolker Thanks Ben, is there something that does it for a Stata 13 file? (By the way, I think your father is Ben Bolker, he taught me math a long time ago :) Sorry if this is strange.) – kolonel May 27 '14 at 21:18
  • you mean Ethan Bolker, right? Don't know, sorry -- probably someone would have to sit down and reverse-engineer the format. http://www.stata.com/statalist/archive/2013-10/msg00701.html comments that transfer to SPSS is hard now, too. – Ben Bolker May 27 '14 at 21:22
  • http://r.789695.n4.nabble.com/Stata-support-in-package-foreign-td4684022.html – Ben Bolker May 27 '14 at 21:24

6 Answers

96

There is a new package to import Stata 13 files into a data.frame in R.

Install the package and read a Stata 13 dataset with read.dta13():

install.packages("readstata13")

library(readstata13)
dat <- read.dta13("TEAdataSTATA.dta")

Update: as of version 0.8, readstata13 also imports files from Stata 6 to 14.

More about the package: https://github.com/sjewo/readstata13

sjewo
  • 1
    I get `Error in loadNamespace(name) : there is no package called ‘httr’` when attempting to execute the `devtools::install_github("sjewo/readstata13", ref="0.2")` line. – Dan Nissenbaum Oct 21 '14 at 09:51
  • 1
    Hi Dan! Your devtools is probably outdated and the httr package is missing. Try updating your packages with `install.packages("devtools", dependencies=T)` and check whether you can load devtools with `library(devtools)`. You could also check for httr with `library(httr)`. – sjewo Oct 21 '14 at 17:56
  • What does your package do with any `strL` variables in the dataset? – Nick Cox Oct 28 '14 at 14:25
  • 1
    strL variables hold a reference to a string in a list obtained by `attr(NameOfDataset, "strl")`. – sjewo Oct 28 '14 at 15:03
  • 1
    In version 0.4 the option `replace.strl=TRUE` can be set to replace the reference to a `strL` string in the data.frame with the actual value. – sjewo Nov 08 '14 at 16:15
  • Adding that by my benchmark, `readstata13` is faster, albeit on a very small data set (420 obs x 12 var); not sure how it scales up so I'm not adding it to the answer – MichaelChirico Jun 22 '15 at 19:00
29

There's a new package called Haven, by Hadley Wickham, which can load Stata 13 dta files (as well as SAS and SPSS files)

library(haven) # haven package now available on CRAN
df <- read_dta('c:/somefile.dta')

See: https://github.com/hadley/haven

cacti5
yoyoyoyosef
  • 2
    Just a heads-up that `haven` did not seem to be as accurate as `readstata13` as far as formatting goes: it marked many numerical variables as character variables. That may be something to consider for those using `haven`. – coip Apr 02 '15 at 17:13
  • +1 because `haven` is also the best-suited, I've found, for importing other proprietary formats like `.sas7bdat` from SAS, see [here](http://stackoverflow.com/questions/30006822/read-sas-sas7bdat-data-into-r), whereas `readstata13` is clearly a single-purpose package. – MichaelChirico Jun 22 '15 at 18:54
  • 1
    FYI, `haven` is now available on CRAN. No need to install from github. – Tom Nov 05 '15 at 04:05
13

If you have Stata 13, you can load the file there and save it in Stata 12 format using the command saveold (see help saveold). Afterwards, take it to R.

If you have Stata 10 to 12, you can use the user-written command use13 (by Sergiy Radyakin) to load the file and save it there, then take it to R. You can install use13 by running ssc install use13.

Details can be found at http://radyakin.org/transfer/use13/use13.htm

Other alternatives, still with Stata, involve exporting the Stata format to something else that R will read, e.g. text-based files. See help export within Stata.

Update

Starting with Stata 14, saveold has a version() option, which allows saving in .dta formats as old as Stata 11.

Roberto Ferrer
  • OK, thanks a million. But sorry, I still have one problem when I try to convert the data to an SPSS file. Suppose I have imported the data from the STATA file into a data frame called `data1`. I then write the data frame to a text file using `write.table(data , 'mydata.txt' , sep="\t")`, and to get the SPSS file I run `write.foreign(b , "mydata.txt" , "DerLeew.sps" , package="SPSS")`, but I get an error: `Error in writeForeignSPSS(df = list(studyid = c("P0008", "P0018", "P0031", : I cannot abbreviate the variable names to eight or fewer letters`. Thanks. – kolonel May 27 '14 at 23:12
  • 1
    I can't say why. My knowledge of R is more limited. I do wonder what's going on with all those conversions: Stata to R to txt to SPSS. If you want to convert Stata to SPSS, try the Stata command `savespss`, again by Sergiy Radyakin. Read the following to get started: http://www.radyakin.org/transfer/savespss/savespss.htm. – Roberto Ferrer May 28 '14 at 02:36
  • wouldn't I have to use Stat/Transfer to do it directly in STATA? – kolonel May 28 '14 at 02:38
  • 1
    Not at all. Sergiy has made a great effort on the issue of data transfer. All you need to do is install that command with `net from http://radyakin.org/transfer/savespss/beta` and read the web page in my previous comment to learn the (very simple) syntax. (By the way, the spelling is Stata, not STATA. It's not an acronym. ) – Roberto Ferrer May 28 '14 at 02:52
6

In the meantime, the savespss command has become part of the SSC archive and can be installed in Stata with: findit savespss

The homepage http://www.radyakin.org/transfer/savespss/savespss.htm continues to work, but the program should be installed from the SSC now, not from the beta location.

Nick Cox
3

I am not familiar with the current state of R programs regarding their ability to read other file formats, but if someone doesn't have Stata installed on their computer and R cannot read a specific version of Stata's dta files, Pandas in Python can now do the vast majority of such conversions.

Basically, the data from the dta file are first loaded using the pandas.read_stata function. As of version 0.23.0, the supported encoding and formats can be found in a related answer of mine.

Then one can either save the data as a csv file and import them using standard R functions, or instead use the pandas.DataFrame.to_feather function, which exports the data using a serialization format built on Apache Arrow. The latter has extensive support in R as it was conceived to promote interoperability with Pandas.
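The route described above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the file names, the temporary directory and the small sample data frame are hypothetical stand-ins (the sample is written out with `DataFrame.to_stata` only so that the snippet has a .dta file to read), and the Feather step assumes the optional `pyarrow` dependency is installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical paths; in practice dta_path points at the existing Stata file.
workdir = tempfile.mkdtemp()
dta_path = os.path.join(workdir, "example.dta")
csv_path = os.path.join(workdir, "example.csv")
feather_path = os.path.join(workdir, "example.feather")

# Stand-in data so the sketch is self-contained: write a small file in
# dta format 117, which is the format introduced with Stata 13.
sample = pd.DataFrame(
    {"studyid": ["P0008", "P0018", "P0031"], "score": [1.5, 2.0, 3.25]}
)
sample.to_stata(dta_path, write_index=False, version=117)

# Step 1: load the .dta file into a DataFrame.
df = pd.read_stata(dta_path)

# Step 2a: save as CSV, which R can import with read.csv().
df.to_csv(csv_path, index=False)

# Step 2b (optional): save as Feather; skipped if pyarrow is not installed.
try:
    df.to_feather(feather_path)
except ImportError:
    pass
```

On the R side, the CSV can then be loaded with `read.csv()`; for the Feather file, a reader such as `arrow::read_feather()` should work, assuming the arrow package is installed.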

1

I had the same problem. Tried read.dta13, read.dta but nothing worked. Then tried the easiest and least expected: MS Excel! It opened marvelously. I saved it as a .csv and used in R!!! Hope this helps!!!!

dpel