0

I called an API from R using getURL that returns a JSON response.

When I check with typeof in R, it gives me [1] "character".

I am trying to have my data in JSON format as it should be, to be able to convert it to a DataTable. What could be the reason that it is a character list and how do I fix it?

This is what I am getting in the data returned from the API:

[1] "HTTP/1.1 200 OK\r\nDate: Thu, 04 Jan 2018 20:38:50 GMT\r\nContent-Type: application/json; charset=utf-8\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nSet-Cookie: __cfduid=d6bbf45645c3bd5332f83d25d06d8b8ca1515098329; expires=Fri, 04-Jan-19 20:38:49 GMT; path=/; domain=.onesignal.com; HttpOnly\r\nStatus: 200 OK\r\nCache-Control: public, max-age=7200\r\nAccess-Control-Allow-Origin: *\r\nX-XSS-Protection: 1; mode=block\r\nX-Request-Id: bd2552de-bf7d-4a0c-94d6-ff1b6856002a\r\nAccess-Control-Allow-Headers: SDK-Version\r\nETag: W/\"47580e0a23e806945b01f1237219175c\"\r\nX-Frame-Options: SAMEORIGIN\r\nX-Runtime: 0.112902\r\nX-Content-Type-Options: nosniff\r\nX-Powered-By: Phusion Passenger 5.1.4\r\nCF-Cache-Status: REVALIDATED\r\nExpires: Thu, 04 Jan 2018 22:38:50 GMT\r\nServer: cloudflare-nginx\r\nCF-RAY: 3d8100f109c6a23f-ICN\r\n\r\n{\"total_count\":2057,\"offset\":0,\"limit\":50,\"notifications\":[{\"adm_big_picture\":\"\",\"adm_group\":\"\",\"adm_group_message\":{\"en\":\"\... <truncated>

If I try to use fromJSON function with this data, I get:

Error in file(con, "r") : cannot open the connection
systemdebt
  • 4,589
  • 10
  • 55
  • 116
  • Error in file(con, "r") : cannot open the connection This is what I got when I tried that. Tried with fromJSON from all 3 packages, jsonlite, RJSONIO and rjson. From what it looks like, I need to parse my data in some way . what way I am not sure – systemdebt Jan 04 '18 at 20:42
  • I meant I tried with other packages as well including jsonlite. Yes, I am sure. I am specifying packagname::fromJSON. – systemdebt Jan 04 '18 at 20:45
  • I see. Looks like the string isn't all JSON. Looks like the JSON part starts at the first `{`. – Gregor Thomas Jan 04 '18 at 20:46
  • That's possible. But how do I get there? – systemdebt Jan 04 '18 at 20:47
  • 1
    Something like `sub('^[^\\{]*\\{', '{', x)` should remove everything up to the first `{`. – Gregor Thomas Jan 04 '18 at 20:52
  • awesome :D Could you please explain me this regex or point me to a relevant resource, please? I lose it at regex always :) – systemdebt Jan 04 '18 at 20:56

2 Answers2

4

jsonlite::fromJSON works great for parsing JSON. Your problem is that you have a bunch of stuff in front of your JSON. (Maybe after too, can't tell...)

I think the JSON starts at the first {, so we'll remove everything before that. Calling your data x:

x = sub('^[^\\{]*\\{', '{', x)
jsonlite::fromJSON(x)

Type the unescaped version of the patter into the Regex101 tool for an explanation. (Unescaped version uses single not double backslashes: ^[^\{]*\{ . In R strings we need to double the backslashes.)

Here's a working example based on your data:

x = 'HTTP/1.1 200 OK
Date: Thu, 04 Jan 2018 20:38:50 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d6bbf45645c3bd5332f83d25d06d8b8ca1515098329; expires=Fri, 04-Jan-19 20:38:49 GMT; path=/; domain=.onesignal.com; HttpOnly
Status: 200 OK
Cache-Control: public, max-age=7200
Access-Control-Allow-Origin: *
X-XSS-Protection: 1; mode=block
X-Request-Id: bd2552de-bf7d-4a0c-94d6-ff1b6856002a
Access-Control-Allow-Headers: SDK-Version
ETag: W/\"47580e0a23e806945b01f1237219175c\"
X-Frame-Options: SAMEORIGIN
X-Runtime: 0.112902
X-Content-Type-Options: nosniff
X-Powered-By: Phusion Passenger 5.1.4
CF-Cache-Status: REVALIDATED\r\nExpires: Thu, 04 Jan 2018 22:38:50 GMT
Server: cloudflare-nginx
CF-RAY: 3d8100f109c6a23f-ICN
{\"total_count\":2057,\"offset\":0,\"limit\":50,\"notifications\":[{\"adm_big_picture\":\"\",\"adm_group\":\"\"}]}'

y = gsub('^[^\\{]*\\{', '{', x)
jsonlite::fromJSON(sub('^(^\\{)*\\{', '{', y))
# $total_count
# [1] 2057
# 
# $offset
# [1] 0
# 
# $limit
# [1] 50
# 
# $notifications
#   adm_big_picture adm_group
# 1                      
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
0

You can use the rjson package to transform your input into a json. Using simplifyDataFrame parameter fromJSON should output a dataframe object.

Importing data from a JSON file into R

[edit]

Your data is returning with some header, you can overcome it removing it from the string and passing to fromJSON

library(stringr)
library(rjson)

json <- str_sub(str_extract(data, "ICN\\r\\n\\r\\n.*"), 8)
df <- as.data.frame(fromJSON(json))

> head(df)
   total_count
1        2057