I'm writing a script to do the following:
- Ingest a csv file
- Loop through values in a url column
- Return status codes for each url field
My data comes from a CSV file that I wrote myself. The url_check field contains a string with 1 or 2 URLs to check.
The CSV file is structured as follows:
id,site_id,url_check,js_pixel_json
12187,333304,"[""http://www.google.com"", ""http://www.facebook.com""]",[]
12187,333304,"[""http://www.google.com""]",[]
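For reference, here is a small sketch of what csv.reader hands back for the url_check column of these rows. The sample text is inlined through io.StringIO purely for illustration, and the names SAMPLE, data_rows and first_cell are assumptions, not part of my script:

```python
import csv
import io

# Sample rows copied from the file above, inlined for illustration only.
SAMPLE = (
    'id,site_id,url_check,js_pixel_json\n'
    '12187,333304,"[""http://www.google.com"", ""http://www.facebook.com""]",[]\n'
    '12187,333304,"[""http://www.google.com""]",[]\n'
)

rows = list(csv.reader(io.StringIO(SAMPLE)))
header, data_rows = rows[0], rows[1:]

# The csv module unescapes the doubled quotes, so each url_check cell
# arrives as ONE string holding JSON-style list text, not as a list.
first_cell = data_rows[0][2]
print(type(first_cell).__name__)
print(first_cell)
```

So row[2] is a plain str that only looks like a list; it still has to be parsed before it can be iterated URL by URL.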
I have a function that loops through every row correctly; however, when I attempt to pull the status code, I get the following error:
Traceback (most recent call last):
  File "help.py", line 29, in <module>
    loopUrl(inputReader)
  File "help.py", line 26, in loopUrl
    urlStatus = requests.get(url)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 498, in request
    prep = self.prepare_request(req)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 441, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/models.py", line 309, in prepare
    self.prepare_url(url, params)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/models.py", line 375, in prepare_url
    scheme, auth, host, port, path, query, fragment = parse_url(url)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/util/url.py", line 185, in parse_url
    host, url = url.split(']', 1)
ValueError: not enough values to unpack (expected 2, got 1)
Here is my code:
import requests
import csv

input = open('stackoverflow_help.csv')
inputReader = csv.reader(input)

def loopUrl(inputReader):
    pixelCheck = []
    for row in inputReader:
        checkUrl = row[2]
        if inputReader.line_num == 1:
            continue  # skip first row
        elif checkUrl == '[]':
            continue
        elif checkUrl == 'NULL':
            continue
        urlList = str(checkUrl)
        for url in urlList:
            urlStatus = requests.get(url)
            print(urlStatus.response_code)

loopUrl(inputReader)
The traceback ends inside the requests and urllib3 modules, but I believe the actual cause is something in my loop, most likely the value I end up passing to requests.get.
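A minimal sketch of what seems to be going wrong, assuming the url_check cell always holds JSON-style list text as in the sample rows. str(checkUrl) leaves the cell as a string, so `for url in urlList` walks it one character at a time and the first request is made for "[", which matches the `url.split(']', 1)` failure at the bottom of the traceback. Parsing the cell with json.loads would give a real list instead. The helper names parse_url_cell, fetch_status and check_urls are hypothetical, and the 10-second timeout is an arbitrary choice. Note also that a requests response exposes the code as status_code, not response_code.

```python
import json


def parse_url_cell(cell):
    """Return the list of URLs held in one url_check cell."""
    # Same skip conditions as the original loop, plus an empty cell.
    if cell in ("", "[]", "NULL"):
        return []
    return json.loads(cell)


def fetch_status(url):
    """Return the HTTP status code for url (performs a real request)."""
    # Third-party dependency, imported here so the parsing helper
    # above can be tried out without requests or network access.
    import requests
    return requests.get(url, timeout=10).status_code


def check_urls(cell, fetch_status=fetch_status):
    """Return (url, status_code) pairs for every URL in the cell."""
    return [(url, fetch_status(url)) for url in parse_url_cell(cell)]


cell = '["http://www.google.com", "http://www.facebook.com"]'

# Iterating the raw string yields single characters, starting with "[".
print(list(cell)[:3])

# Iterating the parsed value yields whole URLs.
print(parse_url_cell(cell))
```

Inside loopUrl this would amount to replacing `urlList = str(checkUrl)` with a call like `parse_url_cell(checkUrl)` and printing `urlStatus.status_code`.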