8

I have a .json.rows file, instances.json.rows, with approximately 223k rows.

I tried using jsonlite and came up with

instancesfile <- fromJSON("instances.json.rows")

But I kept getting an error:

Error in parse_con(txt, bigint_as_char) : parse error: trailing garbage
      kcBy-cs", "time_type": "in"} {"cluster_ids": ["Bz4SOc6zZn0"]
                 (right here) ------^

Here is an image of the data from the first row of my file. Apologies if my question is not clear enough. Let me know in the comments and I will edit my question as required. Thank you in advance!

McGrady
ak95
  • I think you need to replace newlines with , and wrap your whole file in a pair of {}... I am guessing that the file you are working with is actually a bunch of json statements separated by newlines, rather than a single unified statement. – kpie Apr 09 '17 at 03:57
  • Hi, yes. Each line of the file is a single JSON document that describes a single event or entity. I'm new to working with JSON files. Could you please answer in detail on how I can load it into RStudio? Or point me to material I can read up on if you don't have the time? – ak95 Apr 09 '17 at 04:02

2 Answers

12
library(jsonlite)
# Read the file line by line and parse each line as its own JSON document
out <- lapply(readLines("instances.json.rows"), fromJSON)

Congrats, out is what you want it to be. lapply applies the fromJSON function to each line returned by readLines and returns the results as a list in out. I misspoke a bit in my comment: to make your file valid JSON you would have to replace the newlines with commas, then put the result where the * is in the example below. But that's all nonsense; just use the one-liner above.

{"data":[*]}
kpie
  • That definitely worked and helped me read the JSON file into RStudio. But I now have a really long list which I cannot analyze. Any tips on how I can go ahead with that? Would you mind if I sent you an email with screenshots of the list? Thank you! – ak95 Apr 10 '17 at 00:37
  • What if you cast your long list to a data frame, could you get it done? http://stackoverflow.com/questions/4227223/r-list-to-data-frame – kpie Apr 11 '17 at 22:26
5
library(jsonlite)
instancesfile <- stream_in(file("instances.json.rows"))

Advantages:

  • Formats automatically as a data frame
  • Gives verbose progress reports (unless you change the default)
  • Lets you adjust the pagesize
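
As a rough illustration of those points, the sketch below writes three made-up records to a temporary file (a stand-in for instances.json.rows, with hypothetical field names) and reads them back with stream_in. The pagesize and verbose arguments are the ones documented for jsonlite's stream_in; the values chosen here are arbitrary:

```r
library(jsonlite)

# Temporary stand-in for instances.json.rows (hypothetical sample data)
tmp <- tempfile(fileext = ".json.rows")
writeLines(c(
  '{"id": 1, "time_type": "in"}',
  '{"id": 2, "time_type": "out"}',
  '{"id": 3, "time_type": "in"}'
), tmp)

# pagesize sets how many lines are parsed per batch;
# verbose = FALSE silences the progress messages that are on by default
instances <- stream_in(file(tmp), pagesize = 2, verbose = FALSE)

# Clean up the temporary file
unlink(tmp)
```

With no handler supplied, stream_in combines the pages and returns a single data frame, so instances can be inspected directly with str() or head().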
Jeff Parker