
I have the following "stacked JSON" file, example1.json, that I want to read into R:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
  "Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
  "Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
  "Code":[{"event1":"B","result":"0"},…]}

The objects are not comma-separated. The goal is to parse certain fields (or all fields) into an R data.frame or data.table:

    Timestamp    Usefulness
 0   20140101      Yes
 1   20140102      No
 2   20140103      No

Normally, I would read in a JSON within R as follows:

library(jsonlite)

jsonfile = "example1.json"
foobar = fromJSON(jsonfile)

This however throws a parsing error:

Error: lexical error: invalid char in json text.
          [{"event1":"A","result":"1"},…]} {"ID":"1A35B","Timestamp"
                     (right here) ------^

This is a similar question to the following, but in R: multiple Json objects in one file extract by python

EDIT: This file format is called "newline-delimited JSON" (NDJSON).

ShanZhengYang
  • Are there really newlines before `"Code"` or did you do that for readability? I also assume the `...` is you and not the JSON. If they are files with one compact JSON record per-line, they are "ndjson" files and you can use `ndjson::stream_in()` which is faster than the `jsonlite` counterpart and always produces a "flat" data frame. – hrbrmstr May 20 '18 at 01:43
  • And, if it is that, this is a dup and we need to know that so it can be marked as such. – hrbrmstr May 20 '18 at 01:44
  • @hrbrmstr Yes, please mark as a duplicated question. – ShanZhengYang May 20 '18 at 11:03
  • Similar to: https://stackoverflow.com/questions/59921946/how-to-read-a-newline-delimited-json-file-from-r – mayrop Apr 08 '22 at 00:55

1 Answer

  1. The three dots ... invalidate your JSON, hence your lexical error.

  2. You can use jsonlite::stream_in() to 'stream in' lines of JSON.


library(jsonlite)

jsonlite::stream_in(file("~/Desktop/examples1.json"))
# opening file input connection.
# Imported 3 records. Simplifying...
# closing file input connection.
#      ID Timestamp Usefulness Code
# 1 12345  20140101        Yes A, 1
# 2 1A35B  20140102         No B, 1
# 3 AA356  20140103         No B, 0
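
To get only the columns asked for in the question, the streamed result can be subset like any other data frame. The following is a minimal sketch, not a definitive implementation: it assumes the cleaned three-record data, and the temporary file and the names `ndjson_lines`, `records`, and `wanted` are illustrative only.

```r
library(jsonlite)

# Assumption: the three cleaned records, inlined so the sketch runs without an external file.
ndjson_lines <- c(
  '{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
)

# Write one JSON object per line to a temporary file (hypothetical stand-in for the real file).
tmp <- tempfile(fileext = ".json")
writeLines(ndjson_lines, tmp)

# stream_in() parses line by line; verbose = FALSE suppresses the progress messages.
records <- stream_in(file(tmp), verbose = FALSE)
unlink(tmp)

# Keep only the fields of interest.
wanted <- records[, c("Timestamp", "Usefulness")]
print(wanted)
```

If a data.table is preferred, `data.table::as.data.table(wanted)` should convert the result, assuming that package is installed.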

Data

I've cleaned your example data to make it valid JSON and saved it to my desktop as ~/Desktop/examples1.json:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}
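
With these records, the `Code` field comes back from `stream_in()` as a list column holding one small data frame per record. If the nested events are needed as ordinary columns, one possible approach is to bind them together with the parent `ID`. This is only a sketch under the assumption that every `Code` entry is a non-empty array of objects with the same keys; `records` and `long` are illustrative names.

```r
library(jsonlite)

# Assumption: the same three cleaned records, written to a temporary file for the sketch.
tmp <- tempfile(fileext = ".json")
writeLines(c(
  '{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes","Code":[{"event1":"A","result":"1"}]}',
  '{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No","Code":[{"event1":"B","result":"1"}]}',
  '{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No","Code":[{"event1":"B","result":"0"}]}'
), tmp)
records <- stream_in(file(tmp), verbose = FALSE)
unlink(tmp)

# Pair each record's ID with its nested Code data frame, then stack the pieces row-wise.
pieces <- Map(function(id, code) cbind(ID = id, code), records$ID, records$Code)
long <- do.call(rbind, pieces)
rownames(long) <- NULL  # drop the ID-derived row names added by rbind
print(long)
```

The result has one row per event, so a record with several `Code` entries would contribute several rows sharing the same `ID`.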
SymbolixAU