I’m new to APIs and working with JSON and would love some help here.

I know everything I’m trying to accomplish can be done using the PRAW library, but I’m trying to figure it out without PRAW.

I have a for loop that pulls post titles from a specific subreddit, puts them into a pandas data frame, and, once the limit is reached, sets the ‘after’ parameter to the last post's ID so the loop repeats with the next batch.

Everything worked perfectly, but when I tried the same technique with a specific thread and gathering the comments, the ‘after’ parameter doesn’t work to grab the next batch.

I’m assuming ‘after’ works differently with threads than with a subreddit's posts. In the JSON I saw ‘more’ objects, each with a list of IDs. Do I need to use these somehow? When I looked at the JSON for the thread, ‘after’ is ‘None’ even with the updated parameters.

Any idea on what I need to change here? It’s probably something simple.

Working code for getting the subreddit posts with limit 5:

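import requests

# `headers` must already hold the OAuth bearer token and User-Agent (setup not shown).
params = {"t": "day", "limit": 5}
for i in range(2):
    response = requests.get('https://oauth.reddit.com/r/stocks/new',
                            headers=headers, params=params)
    response = response.json()
    for post in response['data']['children']:
        name = post['data']['name']  # fullname, e.g. t3_lifixn
        print('name', name)
    # Start the next request after the last post seen in this batch.
    params['after'] = name
    print(params)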

Giving the output:

name t3_lifixn
name t3_lifg68
name t3_lif6u2
name t3_lif5o2
name t3_lif3cm
{'t': 'day', 'limit': 5, 'after': 't3_lif3cm'}
name t3_lif26d
name t3_lievhr
name t3_liev9i
name t3_liepud
name t3_lie41e
{'t': 'day', 'limit': 5, 'after': 't3_lie41e'}

Code for the Reddit thread with limit 10:

params = {"limit":10}
for i in range(2):
    response = requests.get('https://oauth.reddit.com/r/wallstreetbets/comments/lgrc39/',
                            params = params,headers=headers)
    response = response.json()
    for post in response[1]['data']['children']:
        name = post['data']['name']
        print(name)
    params['after'] = name
    print(params)

Giving the output:

t1_gmt20i4
t1_gmzo4xw
t1_gmzjofk
t1_gmzjkcy
t1_gmtotfl
{'limit': 10, 'after': 't1_gmtotfl'}
t1_gmt20i4
t1_gmzo4xw
t1_gmzjofk
t1_gmzjkcy
t1_gmtotfl
{'limit': 10, 'after': 't1_gmtotfl'}

Even though the limit was set to 10, it only gave 5 IDs before continuing the loop. Also, although the 'after' parameter was updated, the second request just returned the same comments from the start.

Hausra5
  • Please avoid using images as code examples; use a code snippet or an external link to code that can be executed or copied. – Sheldon Oliveira Feb 12 '21 at 21:32
  • Thanks for the feedback! Sorry, this was my first question and I’m just learning the proper way to ask. – Hausra5 Feb 12 '21 at 21:44
  • As Sheldon Oliveira said, can you copy paste the relevant code in your question? Like go to your editor, copy paste the code you want, put ``` lang-py with a newline then your code, then another newline and ```. See [SO's editing help](https://stackoverflow.com/editing-help) to make your post better. Also see [why text and not images](https://meta.stackoverflow.com/q/285551), also https://idownvotedbecau.se/imageofcode – Lakshya Raj Feb 12 '21 at 23:44
  • Thanks for the help, I have updated the formatting. – Hausra5 Feb 14 '21 at 16:27

1 Answer

I ended up figuring out how to do it. According to the documentation for Reddit's API, to pull more comments from a thread you have to compile a list of the IDs from the 'more' objects in the JSON. The comments form a nested tree, and a 'more' object looks like the following:

{'kind': 'more', 'data': {'count': 161, 'name': 't1_gmuram8', 'id': 'gmuram8', 'parent_id': 't1_gmt20i4', 'depth': 1, 'children': ['gmuram8', 'gmt6mf6', 'gmubxmr', 'gmt63gl', 'gmutw5j', 'gmtpitn', 'gmtoec3', 'gmtnel0', 'gmt4p79', 'gmupqhx', 'gmv70rm', 'gmtu2sj', 'gmt2vc7', 'gmtmjai', 'gmtje0b', 'gmtkzzj', 'gmt93n5', 'gmtvsqa', 'gmumhat', 'gmuj73q', 'gmtor7c', 'gmuqcwv', 'gmt3lxe', 'gmt4l78', 'gmum9cm', 'gmt857f', 'gmtjrz3', 'gmu0qcl', 'gmt9t9i', 'gmt8jc7', 'gmurron', 'gmt3ysv', 'gmt6neb', 'gmt4v3x', 'gmtoi6t']}}
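
As a rough sketch of compiling that list, the function below walks a comment listing (what the question's code reads as response[1]) and gathers the IDs from every 'more' object, including ones nested under replies. The name collect_more_ids and the small sample tree are made up for illustration, and the sketch assumes each comment's 'replies' field is either another listing or an empty string.

```python
def collect_more_ids(listing):
    """Return the comment IDs found in every 'more' object of a comment listing."""
    ids = []
    # Assumption: a comment without replies carries '' instead of a listing dict.
    if not isinstance(listing, dict):
        return ids
    for child in listing.get('data', {}).get('children', []):
        kind = child.get('kind')
        data = child.get('data', {})
        if kind == 'more':
            # 'children' here is a flat list of comment IDs without the t1_ prefix.
            ids.extend(data.get('children', []))
        elif kind == 't1':
            # Recurse into the replies of a regular comment.
            ids.extend(collect_more_ids(data.get('replies')))
    return ids


# Hand-made sample tree for illustration only (not real API output).
sample = {
    'kind': 'Listing',
    'data': {'children': [
        {'kind': 't1', 'data': {'name': 't1_gmt20i4', 'replies': {
            'kind': 'Listing',
            'data': {'children': [
                {'kind': 'more', 'data': {'children': ['gmuram8', 'gmt6mf6']}},
            ]},
        }}},
        {'kind': 't1', 'data': {'name': 't1_gmzo4xw', 'replies': ''}},
        {'kind': 'more', 'data': {'children': ['gmubxmr']}},
    ]},
}

print(collect_more_ids(sample))
```

With the real response, the call would be collect_more_ids(response[1]).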

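To fetch those comments, send a GET request to the morechildren endpoint, passing the thread's link_id and the comment IDs as a comma-separated children list:

requests.get('https://oauth.reddit.com/api/morechildren/.json?api_type=json&link_id=t3_lgrc39&children=gmt20i4,gmuram8....etc', headers=headers)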
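
Rather than pasting the IDs into the URL by hand, the query parameters can be built from the collected list. This is only a sketch: build_morechildren_params is a hypothetical helper, and the batch size of 100 IDs per request is my assumption about the endpoint's limit, so check it against the API documentation.

```python
def build_morechildren_params(link_id, child_ids, chunk_size=100):
    """Split child_ids into batches and build one params dict per request."""
    batches = []
    for start in range(0, len(child_ids), chunk_size):
        chunk = child_ids[start:start + chunk_size]
        batches.append({
            'api_type': 'json',
            'link_id': link_id,           # fullname of the thread, e.g. t3_lgrc39
            'children': ','.join(chunk),  # comma-separated comment IDs
        })
    return batches


# Small chunk size here just to show the batching.
for params in build_morechildren_params(
        't3_lgrc39', ['gmt20i4', 'gmuram8', 'gmt6mf6'], chunk_size=2):
    print(params)
    # Each dict would then be sent with the same headers as before:
    # requests.get('https://oauth.reddit.com/api/morechildren',
    #              headers=headers, params=params)
```

Passing the dict through params= lets requests handle the URL encoding.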

Hausra5