Multiple URLs to save json data

Question

I'm trying to call multiple URLs (more than 10) in one run and save the data from all of them, which is in JSON format, to a file on my machine.

Here is the code I have tried. With it, only the last URL's data ends up saved in my JSON file. How can I get the data from all the URLs stored in a single JSON file?

import json
import requests

URLs = ['http://httpbin.org/ip',
'http://httpbin.org/user-agent',
'http://httpbin.org/headers']

json_list = []
for url in URLs:
    data = requests.get(url)
    resolvedwo = data.json()
    with open('resolvedworesolution.json', 'w') as f:
        json.dump(resolvedwo, f)

@DirtyBit, copy paste issue. It is not that problem as of now — MDI, Mar 20 '19 at 14:08
Possible duplicate of [Python how to keep writing to a file without erasing what's already there](https://stackoverflow.com/questions/25553031/python-how-to-keep-writing-to-a-file-without-erasing-whats-already-there) ... [Python Open a txt file without clearing everything in it?](https://stackoverflow.com/q/6334382/2823755) ... [How do you append to a file in Python?](https://stackoverflow.com/questions/4706499/how-do-you-append-to-a-file-in-python) — wwii, Mar 20 '19 at 14:11
Ultimately you are saving the response. I don't think there is a point in deserializing the response to JSON and then serializing it back to text. You can simply save the response directly to the file, unless, of course, you want to validate that the response is JSON. – Biswanath, Mar 20 '19 at 14:21
@MDI When you use `json.dump` to append to a file, you will no longer have valid json, so there's not really a point in saving it to a `.json` file, as you will not be able to use `json.load(f)` — C.Nivs, Mar 20 '19 at 14:25
You can simply append `r.text` (`data.text` in your case), as you are expecting a text response. – Biswanath, Mar 20 '19 at 14:30

Sunitha · Accepted Answer · 2019-03-20T16:47:32.413

Your problem is that you are overwriting the file on every iteration of the loop. Instead, store the results in a list and write that list to the file only once, after the loop:

import requests
import json

URLs = ['http://httpbin.org/ip',
        'http://httpbin.org/user-agent',
        'http://httpbin.org/headers']

json_list = []

for url in URLs:
    data = requests.get(url)
    resolvedwo = data.json()
    json_list.append(resolvedwo)

with open('resolvedworesolution.json', 'w+') as f:
    json.dump(json_list, f, sort_keys=True, indent=4)

Output:

[
    {
        "origin": "137.221.143.66, 137.221.143.66"
    },
    {
        "user-agent": "python-requests/2.21.0"
    },
    {
        "headers": {
            "Accept": "*/*",
            "Accept-Encoding": "gzip, deflate",
            "Host": "httpbin.org",
            "User-Agent": "python-requests/2.21.0"
        }
    }
]

`indent=4` means each nesting level is indented by 4 spaces; it just makes the JSON file look pretty. – Sunitha, Mar 20 '19 at 16:57
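
To check that the write-once approach produces a file `json.load` can read back, here is a minimal sketch. It swaps the real `requests.get(url).json()` call for a hypothetical `fake_fetch` stub so it runs offline; `save_all`, `fake_fetch` and the temp-file path are illustrative names, not part of the answer above.

```python
import json
import os
import tempfile

def fake_fetch(url):
    # Hypothetical stand-in for requests.get(url).json(), so the sketch runs offline.
    return {"url": url, "ok": True}

def save_all(urls, path, fetch=fake_fetch):
    # Collect every response first, then write the whole list with a single dump.
    results = [fetch(url) for url in urls]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, sort_keys=True, indent=4)
    return results

urls = ["http://httpbin.org/ip", "http://httpbin.org/user-agent"]
path = os.path.join(tempfile.mkdtemp(), "resolvedworesolution.json")
save_all(urls, path)

# The file holds one JSON document (a list), so json.load reads it in one call.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(len(loaded))
```

Passing a function that wraps `requests.get(url).json()` as `fetch` should give the same behaviour against real URLs.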

DirtyBit · Answer 2 · 2019-03-20T14:32:03.763

0

Use append mode while writing to the file in order to "retain" the existing data:

import json
import requests
URLs = ['http://httpbin.org/ip',
'http://httpbin.org/user-agent',
'http://httpbin.org/headers']

json_list = []
for url in URLs:
    data = requests.get(url)
    resolvedwo = data.json()
    with open('resolvedworesolution.json', 'a') as f:   # Using the append mode
        json.dump(resolvedwo, f)
        f.write("\n")                                   # new line for readability

OUTPUT:

{"origin": "159.122.207.241, 159.122.207.241"}
{"user-agent": "python-requests/2.21.0"}
{"headers": {"Accept": "*/*", "Accept-Encoding": "gzip, deflate", "Host": "httpbin.org", "User-Agent": "python-requests/2.21.0"}}

EDIT:

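You could also write the serialized response to the file in one go:

with open('resolvedworesolution.json', 'a') as f:
    f.write(json.dumps(resolvedwo))   # json.dumps emits valid JSON; str() would write a Python repr with single quotes
    f.write("\n")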

OR

for url in URLs:
    data = requests.get(url)
    with open('resolvedworesolution.json', 'a') as f:
        f.write(data.text)
        f.write("\n")

edited Mar 20 '19 at 14:32

answered Mar 20 '19 at 14:12

DirtyBit


throws error: `json.decoder.JSONDecodeError: Extra data: line 1 column 605744 (char 605743)` – MDI Mar 20 '19 at 14:16
@MDI before going here: https://stackoverflow.com/questions/21058935/python-json-loads-shows-valueerror-extra-data Could you make sure the file is empty before running this? – DirtyBit Mar 20 '19 at 14:20
Yes, I cleared the file and then ran the code and it throws the same error – MDI Mar 20 '19 at 14:23
Actually the error line was not in the question section, below is the code where I try to load and open `with open('C:\\Users\\ibmha\\PycharmProjects\\Projects\\Resolved_without_Resolution\\resolvedworesolution.json', encoding='utf-8-sig') as f: data = json.load(f)` – MDI Mar 20 '19 at 14:24
@MDI see my comment on your question – C.Nivs Mar 20 '19 at 14:25
@DirtyBit Using f.write, I am getting this error: `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 1219786:` – MDI Mar 20 '19 at 14:29
@MDI try `with open('resolvedworesolution.json', 'a', encoding="ISO-8859-1") as f:` – DirtyBit Mar 20 '19 at 14:30
@MDI, okay. enough. see the latest edit using `.text` – DirtyBit Mar 20 '19 at 14:32
Sry buddy, I appreciate your response but still getting this error using data.text: `UnicodeEncodeError: 'charmap' codec can't encode character '\uf0d8' in position 433229:` – MDI Mar 20 '19 at 14:37
Use a different encoding; perhaps try utf-8. – DirtyBit Mar 20 '19 at 14:40
Using, encoding='utf-8-sig' – MDI Mar 20 '19 at 14:45
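
The `'charmap' codec can't encode character` error above is what Python raises when a file is opened with a platform default encoding (commonly cp1252 on Windows) that has no mapping for a character in the text. Here is a hedged sketch of passing an explicit `encoding="utf-8"` on both the write and the read; the sample record and the temp-file path are invented for illustration.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "resolvedworesolution.json")

# Sample data containing the character from the error message (U+F0D8).
record = {"note": "bullet \uf0d8, pound \u00a3"}

# An explicit encoding avoids depending on the platform default.
with open(path, "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False))
    f.write("\n")

with open(path, encoding="utf-8") as f:
    loaded = json.loads(f.readline())

# For contrast: cp1252 cannot represent U+F0D8, which is the failure seen above.
try:
    record["note"].encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False
```

The same encoding has to be used when the file is read back; mixing encodings between write and read is one way to end up with the `UnicodeDecodeError` reported earlier in the thread.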

score 0 · Answer 3 · answered Mar 20 '19 at 14:12

Opening a file in 'w' mode erases (truncates) its contents before anything is written, so each pass through the loop wipes out the previous result. Open the file in append mode instead:

with open('resolvedworesolution.json', 'a') as f:

That should solve your problem.
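
The difference between the two modes can be seen in a small sketch that needs no network access; the file name `demo.txt` and the sample strings are placeholders chosen for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# 'w' truncates the file on every open, so only the last write survives.
for chunk in ["first\n", "second\n"]:
    with open(path, "w") as f:
        f.write(chunk)
with open(path) as f:
    after_w = f.read()

os.remove(path)

# 'a' keeps the existing contents and adds to the end.
for chunk in ["first\n", "second\n"]:
    with open(path, "a") as f:
        f.write(chunk)
with open(path) as f:
    after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```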

answered Mar 20 '19 at 14:12

C.Nivs


throws error: `json.decoder.JSONDecodeError: Extra data: line 1 column 605744 (char 605743)` – MDI Mar 20 '19 at 14:18
It's very possible that the `json` from the website content is not valid – C.Nivs Mar 20 '19 at 14:23
@MDI `JSONDecodeError` is raised by `json.load`/`json.loads`, so where is that error actually coming from? – C.Nivs Mar 20 '19 at 14:29
Here, Actually the error line was not in the question section, below is the code where I try to load and open `with open('C:\\Users\\ibmha\\PycharmProjects\\Projects\\Resolved_without_Resolution\\resolvedworesolution.json', encoding='utf-8-sig') as f: data = json.load(f)` – MDI Mar 20 '19 at 14:30

dzang · Answer 4 · 2019-03-20T14:51:58.067

0

You can store the info in an object that can be serialized as a whole:

import json
import requests

URLs = ['http://httpbin.org/ip',
'http://httpbin.org/user-agent',
'http://httpbin.org/headers']

json_list = []
for url in URLs:
    data = requests.get(url)
    resolvedwo = data.json()
    json_list.append(resolvedwo)

with open('resolvedworesolution.json', 'w+') as f:
    json.dump(json_list, f)

edited Mar 20 '19 at 14:51

answered Mar 20 '19 at 14:12

dzang


throws error: `json.decoder.JSONDecodeError: Extra data: line 1 column 605744 (char 605743)` – MDI Mar 20 '19 at 14:18
try now. You can store the content in a dict or list and write everything at the end of the for loop – dzang Mar 20 '19 at 14:33
For me it works nicely. I changed `'w'` to `'w+'` because I didn't have the file, so it creates it and then writes the json data... Make sure that you don't have already a file with that name or change the filename... – dzang Mar 20 '19 at 14:53

score -1 · Answer 5 · answered Mar 20 '19 at 14:10

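Instead of overwriting the variable on each iteration:

resolvedwo = data.json()

You probably want to accumulate the results in the list you already created:

json_list.append(data.json())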

answered Mar 20 '19 at 14:10

alexthefifth

