I am working with an API that returns the following format:

{
    "count": 900,
    "next": "api/?data&page=2",
    "previous": null,
    "results": 
        [{json object 1}, {json object 2}, {...}]
}

The problem is that I want to retrieve all "results" from all pages and save them into one JSON file.

I'm thinking of a while loop that keeps making requests to the API and aggregating the resulting "results" into one variable, until the "next" value is null.

Something like

while json1["next"] is not None:
    r = requests.get(apiURL, verify=False, allow_redirects=True, headers=headers, timeout=10)
    raw_data = r.json()["results"]

    final_data.update(raw_data)

I tried it, but since r.json()["results"] is a list, I don't know how to handle the different formats and transform that into a JSON file.

When trying to do final_data.update(raw_data) it gives me an error saying:

'list' object has no attribute 'update'

Or when trying json.loads(raw_data) it gives me:

TypeError: the JSON object must be str, bytes, or bytearray, not list
Zodi
  • Interesting! – Ganime Dewer Feb 16 '23 at 13:55
  • What do you mean with "since r.json()["results"] is a list I don't know how to handle the different formats"? You should be able to save a list to a JSON file without issue. – D Malan Feb 16 '23 at 13:57
  • Is there any error you are getting when saving the JSON? The error would help in understanding the issue! – Ganime Dewer Feb 16 '23 at 14:01
  • @DMalan When trying to do for example json.loads(raw_data), I get the error: "TypeError: the JSON object must be str, bytes, or bytearray, not list". – Zodi Feb 16 '23 at 14:01
  • Can you share more code? – Ganime Dewer Feb 16 '23 at 14:03
  • If you're trying to save the data to a file, you need to use `json.dumps`, not `json.loads`. – D Malan Feb 16 '23 at 14:06
  • @GanimeDewer @DMalan There's not much else to the code. The problem is that raw_data is a list of json objects, and it does not let me aggregate that into a json when doing `final_data.update(raw_data)` – Zodi Feb 16 '23 at 14:12
  • "When trying to do final_data.update(raw_data) it gives me an error saying:" This problem is because of `final_data`, not because of `raw_data`. It also has nothing to do with JSON. "I'm thinking of a while loop that keeps making requests to the API and aggregating the resulting "results" into one variable" - since the "results" will be lists, your plan was to aggregate them **into a list**, right? So. Does Python's `list` have an `update` method? Apparently not, right? Do you know how to merge the contents of lists? If not, that is the real question here - please see the linked duplicates. – Karl Knechtel Feb 16 '23 at 14:16
  • "Or when trying json.loads(raw_data)" Well, yes, because that **makes no sense**. Loading is when you take JSON data and create objects (`dict`s and `list`s) in your Python program. You already have a list, and you want to make JSON data from it; that is saving, not loading. That is a separate issue; we generally want one question per post here, but I added the reference duplicate for this task as well. – Karl Knechtel Feb 16 '23 at 14:20
  • @KarlKnechtel I understand now, sorry for the confusion. Thanks! – Zodi Feb 16 '23 at 14:26

1 Answer

A JSON file is a text file. To save raw_data, which is a list, to a text file, you need to serialize it with json.dumps():

import json

with open('output.json', 'w', encoding="utf-8") as f:
    raw_data_as_string = json.dumps(raw_data)
    f.write(raw_data_as_string)

To aggregate the results from different pages, make final_data a list, created before you iterate over the pages, and call final_data.extend(raw_data) inside the loop, where raw_data contains the results from a single page.

After that, call json.dumps(final_data) and write the result to the file as shown earlier.
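
Putting the two steps together, here is a minimal sketch of the whole loop. It is not tied to your API: the names fetch_all_results, save_as_json, and get_page are made up for illustration, and it assumes each "next" value is a URL that can be requested as-is (if your API returns a relative value like the one in the question, you would have to join it to the base URL first).

```python
import json


def fetch_all_results(first_url, get_page):
    """Follow the "next" links and collect every page's "results" in one list.

    get_page is a hypothetical callable that takes a URL and returns the
    decoded JSON body as a dict.
    """
    final_data = []  # a list, created once before the loop
    url = first_url
    while url is not None:  # JSON null is decoded to Python None
        page = get_page(url)
        final_data.extend(page["results"])  # extend (not update) merges lists
        url = page["next"]
    return final_data


def save_as_json(data, path):
    """Serialize data and write it to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(json.dumps(data))
```

With the requests library, get_page could be something like lambda url: requests.get(url, headers=headers, timeout=10).json(), and the result of fetch_all_results(apiURL, get_page) can be passed straight to save_as_json(final_data, 'output.json'). Passing the fetching function in as an argument also lets you try the loop on fake pages without touching the network.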

wombatonfire