186

Is there a way to import data from a JSON file into R? More specifically, the file is an array of JSON objects with string fields, objects, and arrays. The RJSON Package isn't very clear on how to deal with this http://cran.r-project.org/web/packages/rjson/rjson.pdf.

user313967
  • 2,013
  • 2
  • 14
  • 9
  • Also similar to this question: http://stackoverflow.com/questions/2260147/transposing-json-list-of-dictionaries-for-analysis-in-r. – Shane Apr 11 '10 at 16:39
  • Hi Shane, tried using RJSON. I'm interested mostly in the necessary data manipulation. Here's an example of a JSON file I'm working with. example.json: [{"winner":"68694999","votes":[{"ts":"Thu Mar 25 03:13:01 UTC 2010","user":{"name":"Lamur","user_id":"68694999"}},{"ts":"Thu Mar 25 03:13:08 UTC 2010","user":{"name":"Lamur","user_id":"68694999"}}],"lastVote":{"timestamp":1269486788526,"user":{"name":"Lamur","user_id":"68694999"}},"startPrice":0},...] – user313967 Apr 11 '10 at 17:01
  • 1
    A caution: If the JSON file is truly large, apparently the .so or .dll libraries won't process it. A preferable format is NetCDF, but some organizations are not aware of this issue. –  Feb 07 '14 at 23:07

7 Answers7

208

First install the rjson package:

install.packages("rjson")

Then:

library("rjson")
json_file <- "http://api.worldbank.org/country?per_page=10&region=OED&lendingtype=LNX&format=json"
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

Update: since version 0.2.1

json_data <- fromJSON(file=json_file)
rcs
  • 67,191
  • 22
  • 172
  • 153
  • 3
    Note that the edit is referring to an update to the library, not to R. The update changes the last line of the previous example and you still need to load in the library as above. – Steven Waterman Apr 12 '19 at 16:21
108

jsonlite will import the JSON into a data frame. It can optionally flatten nested objects. Nested arrays will be data frames.

> library(jsonlite)
> winners <- fromJSON("winners.json", flatten=TRUE)
> colnames(winners)
[1] "winner" "votes" "startPrice" "lastVote.timestamp" "lastVote.user.name" "lastVote.user.user_id"
> winners[,c("winner","startPrice","lastVote.user.name")]
    winner startPrice lastVote.user.name
1 68694999          0              Lamur
> winners[,c("votes")]
[[1]]
                            ts user.name user.user_id
1 Thu Mar 25 03:13:01 UTC 2010     Lamur     68694999
2 Thu Mar 25 03:13:08 UTC 2010     Lamur     68694999
xn.
  • 15,776
  • 2
  • 30
  • 34
31

An alternative package is RJSONIO. To convert a nested list, lapply can help:

l <- fromJSON('[{"winner":"68694999",  "votes":[ 
   {"ts":"Thu Mar 25 03:13:01 UTC 2010", "user":{"name":"Lamur","user_id":"68694999"}},   
   {"ts":"Thu Mar 25 03:13:08 UTC 2010", "user":{"name":"Lamur","user_id":"68694999"}}],   
  "lastVote":{"timestamp":1269486788526,"user":
   {"name":"Lamur","user_id":"68694999"}},"startPrice":0}]'
)
m <- lapply(
    l[[1]]$votes, 
    function(x) c(x$user['name'], x$user['user_id'], x['ts'])
)
m <- do.call(rbind, m)

gives information on the votes in your example.

Karsten W.
  • 17,826
  • 11
  • 69
  • 103
  • 1
    `x$user$name, x$user$user_id` should now be `x$user['name'], x$user['user_id']`. Also, `m <- do.call(rbind, m)` might be a better way of converting the list to a matrix. – jbaums Oct 10 '13 at 01:16
  • is there something like convertToDataFrame function for JSON (like there is for XML package) ? – userJT Apr 30 '15 at 21:49
16

If the URL is https, like used for Amazon S3, then use getURL

json <- fromJSON(getURL('https://s3.amazonaws.com/bucket/my.json'))
Anthony
  • 1,513
  • 11
  • 17
4

First install the RJSONIO and RCurl package:

install.packages("RJSONIO")
install.packages("(RCurl")

Try below code using RJSONIO in console

library(RJSONIO)
library(RCurl)
json_file = getURL("https://raw.githubusercontent.com/isrini/SI_IS607/master/books.json")
json_file2 = RJSONIO::fromJSON(json_file)
head(json_file2)
Moby M
  • 910
  • 2
  • 7
  • 26
3

load the packages:

library(httr)
library(jsonlite)

I have had issues converting json to dataframe/csv. For my case I did:

Token <- "245432532532"
source <- "http://......."
header_type <- "applcation/json"
full_token <- paste0("Bearer ", Token)
response <- GET(n_source, add_headers(Authorization = full_token, Accept = h_type), timeout(120), verbose())
text_json <- content(response, type = 'text', encoding = "UTF-8")
jfile <- fromJSON(text_json)
df <- as.data.frame(jfile)

then from df to csv.

In this format it should be easy to convert it to multiple .csvs if needed.

The important part is content function should have type = 'text'.

sm925
  • 2,648
  • 1
  • 16
  • 28
Aaron C
  • 301
  • 1
  • 8
1

import httr package

library(httr)

Get the url

url <- "http://www.omdbapi.com/?apikey=72bc447a&t=Annie+Hall&y=&plot=short&r=json"
resp <- GET(url)

Print content of resp as text

content(resp, as = "text")

Print content of resp

content(resp)

Use content() to get the content of resp, but this time do not specify a second argument. R figures out automatically that you're dealing with a JSON, and converts the JSON to a named R list.

Adarsh Pawar
  • 682
  • 6
  • 15