
I'm reading Reddit data from a particular subreddit with the jsonlite package, and I've run into a parsing error. Using the old.reddit.com URL, reading the subreddit's landing page works, but the error occurs when I try to read the second and subsequent pages. Here's the original code:

library(jsonlite)
page <- "https://old.reddit.com/r/Landlord/?count=25&after=t3_yl00x9" #second page 
jsonlite::fromJSON(page)

Here's the subsequent error message:

Error in parse_con(txt, bigint_as_char) : 
  lexical error: invalid char in json text.
                                       <!doctype html><html xmlns="htt
                     (right here) ------^

Following another post from several years ago (link here), I've tried a few other approaches, but the original problem persists. Here's the code I tried:

library(ndjson)
library(curl)
jsonlite::fromJSON(page)
jsonlite::stream_in(url(page))
ndjson::stream_in(page)
jsonlite::stream_in(curl(page))

And lastly, here's some of my session information, for reference:

R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

I'm not very familiar with JSON or unstructured text data yet, and I wonder if it's a simple error on my part. Any thoughts? Thanks in advance.

Update:

As neilfws noted, the URL I had supplied returned HTML, not JSON. I had forgotten to include '.json' in the string. Here's the edited code that ran for me:

# string elements
base <- "https://old.reddit.com/r/Landlord/"
json <- ".json"
add <- "?count=25&after=t3_yn1aoo"
# concatenate strings: ".json" goes after the path and before the query string
page <- paste0(base, json, add)
jsonlite::fromJSON(page)
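
For anyone who wants to reuse this across pages, here is a minimal sketch of the same idea wrapped in two small helpers. The names listing_url and is_json_text are hypothetical (they are not part of jsonlite), and the sketch assumes the URL pattern shown above keeps working. The only jsonlite call it runs is validate(), which reports whether a string parses as JSON, so it can flag an HTML response before fromJSON() fails on it.

```r
library(jsonlite)

# Hypothetical helper: build the .json listing URL for a subreddit.
# `after` is the fullname of the last post on the previous page
# (e.g. "t3_yn1aoo"); leave it NULL for the landing page.
listing_url <- function(subreddit, after = NULL, count = 25) {
  url <- paste0("https://old.reddit.com/r/", subreddit, "/.json")
  if (!is.null(after)) {
    url <- paste0(url, "?count=", count, "&after=", after)
  }
  url
}

# Hypothetical helper: TRUE if the text parses as JSON,
# FALSE for anything else (for example an HTML page).
is_json_text <- function(txt) {
  isTRUE(as.logical(jsonlite::validate(txt)))
}

page1 <- listing_url("Landlord")
page2 <- listing_url("Landlord", after = "t3_yn1aoo")

# The network call is left commented out so the sketch runs offline.
# res <- jsonlite::fromJSON(page2)
```

If the parsed listing exposes the token for the next page (I believe it is under res$data$after, but treat that as an assumption and check the structure with str(res)), it can be passed back in as the `after` argument to fetch the following page.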
  • The URL in your question points to a web page where the source is HTML, not JSON, hence the parsing error. Have I missed something? Why are you expecting that link to return JSON? – neilfws Nov 07 '22 at 02:49
  • The issue in the question you link to is different: that was JSON, but it was malformed. – neilfws Nov 07 '22 at 02:57
  • Thanks for your thoughts. I realized that I forgot to paste ".json" at the end of the 'r/Landlord/' part of the link! I'll add an update in the post to include the code segments that fixed the problem – TurnipHead Nov 07 '22 at 05:49
