I am a beginner in Python and have two questions.
I am writing a script that searches through an Excel file and counts how many successful HTTPS requests I get. All my functions seem to work until it's time to access the Excel file. One of the errors I see is:
['urlRequest.py']
('What is:', ['import csv'])
Traceback (most recent call last):
  File "urlRequest.py", line 100, in <module>
    impPixel = row[8]
IndexError: list index out of range
I know that indexing past the end of a list raises this error. For example:
items = ['cookies', 'baker', 'kitchen']
# accessing items[0], items[1], items[2]
# will return an element, but items[3] will throw an out-of-range error.
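To make that concrete, here is a minimal sketch of the behaviour I mean. The name `items` is only for illustration, and the sketch uses `print()` with parentheses so it also runs on Python 3:

```python
items = ['cookies', 'baker', 'kitchen']

# Valid indices run from 0 to len(items) - 1
for i in range(len(items)):
    print(items[i])

# Index 3 is one past the end, so it raises IndexError
try:
    items[3]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError: " + error_message)
```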
Also, I printed row before my loop and I don't understand the output it gives:
['import csv']
I looked at my Excel file, and index 8 doesn't seem to be out of range there. Any help would be appreciated.
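One possible explanation, sketched under an assumption rather than stated as fact: the script below takes its input path from `sys.argv[0]`, and `sys.argv[0]` is always the path of the script itself. If `csv.reader` is being fed the script's own source instead of the spreadsheet, `next(reader)` would skip the first line and the first row would be `['import csv']`, which has a single field. The variable `script_source` is a hypothetical stand-in for the first lines of urlRequest.py, and the sketch is written for Python 3:

```python
import csv
import io

# Assumption: these are the first lines of urlRequest.py, as listed in this post
script_source = "import sys\nimport csv\nimport requests\n"

reader = csv.reader(io.StringIO(script_source), delimiter=',')
next(reader)              # skips "import sys", just like the header skip
first_row = next(reader)  # the first row the loop would see

print(first_row)
print(len(first_row))     # only one field, so first_row[8] cannot exist
```

If that is what is happening, the spreadsheet is never opened at all, which would explain why its columns look fine.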
Second, I am not sure whether I am using my nested if statements correctly.
Here is the current state of my code:
import sys
import csv
import requests
from requests.exceptions import ConnectionError
##########################################################################################
# Initial count value for URLs:
goodURLs = 0
badURLs = 0
nullURLs = 0
# with open("SolutionsData.xls", "r") as f:
#     reader = csv.reader(f)
#     next(reader)
# Remember to open() the file before reading it, Michael
# Should the file have an .xls or a .csv extension?
##########################################################################################
# startswith() checks whether a string begins with the given prefix
def getURLs(status, impression, st):
    # Report only URLs that were unreachable or returned a 4xx/5xx status
    if ('Unreachable' in status or status.startswith('4') or status.startswith('5')):
        print "For tacticID: " + impression
        print "URL: " + st
        print "current state : " + status
##########################################################################################
# This function looks at the status of a URL request and updates the counts as needed
# only check the columns needed
def countingFunc(status):
    # the parameter is named 'status' rather than 'str' so the built-in str() stays callable
    global goodURLs, badURLs
    if (status.startswith('2') or status.startswith('3')):
        goodURLs += 1
        stringCount = str(goodURLs)
        return stringCount
    elif (status.startswith('4') or status.startswith('5')):
        badURLs += 1
        stringCount = str(badURLs)
        return stringCount
##########################################################################################
# URL request & update the count values.
def apiCollector(st, impression):
    global nullURLs
    # strip brackets, quotes and backslashes from the URL (Python 2 str.translate)
    st = st.translate(None, '[]\'\"\\')
    try:
        r = requests.head(st)
        code = r.status_code
        status = str(code)
        stringCount = countingFunc(status)
    except requests.exceptions.RequestException:
        # covers connection errors, invalid URLs and timeouts raised by requests
        status = 'Unreachable'
        nullURLs += 1
        stringCount = str(nullURLs)
    # getURLs() only prints and returns None, so return the status itself
    getURLs(status, impression, st)
    #print status
    return status
##########################################################################################
# This is the main logic in the file. It will:
# Open the given Excel spreadsheet
# Open and read the CSV file, then check the impression pixel URLs
# Run the given functions, then deliver an output
# I could use csv.DictReader() instead, but let's strive for performance
excelData = sys.argv[0]
print(sys.argv)
with open(excelData) as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    next(reader)
    for row in reader:
        impPixel = row[8]
        tacticID = row[1]
        if "http" in impPixel:
            if "," in impPixel:
                splitter = impPixel.split(',')
                for item in splitter:
                    api = apiCollector(item, tacticID)
            else:
                api = apiCollector(impPixel, tacticID)
print "Successful URL requests: " + str(goodURLs)
print "Failed URL request: " + str(badURLs)
print "# of unreachable URL: " + str(nullURLs)
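On the nested if question, here is one possible simplification, offered as a sketch under assumptions rather than a definitive fix. `split(',')` returns a one-element list when the string has no comma, so the inner if/else can collapse into a single loop. The sketch also takes the file path from `sys.argv[1]` (the first real argument) and skips rows too short to have a column 8. It assumes the spreadsheet has been exported as a text CSV, since the `csv` module does not read binary .xls files. The names `iter_pixel_urls`, `main`, `PIXEL_COL` and `TACTIC_COL` are hypothetical, and it is written with `print()` so it runs on Python 3:

```python
import csv

PIXEL_COL = 8   # column holding the impression pixel URL(s), as in row[8]
TACTIC_COL = 1  # column holding the tactic ID, as in row[1]


def iter_pixel_urls(rows):
    """Yield (tactic_id, url) pairs from an iterable of CSV rows."""
    for row in rows:
        # Skip short rows instead of letting row[8] raise IndexError
        if len(row) <= PIXEL_COL:
            continue
        tactic_id = row[TACTIC_COL]
        # split(',') gives a one-element list when there is no comma,
        # so the single-URL case needs no separate branch
        for url in row[PIXEL_COL].split(','):
            url = url.strip()
            # Unlike the original, "http" is checked per URL, not per cell
            if "http" in url:
                yield tactic_id, url


def main(argv):
    """Read the CSV named in argv[1] and print each (tactic ID, URL) pair."""
    if len(argv) < 2:
        print("usage: python urlRequest.py <file.csv>")
        return 1
    with open(argv[1]) as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        next(reader, None)  # skip the header row, if there is one
        for tactic_id, url in iter_pixel_urls(reader):
            print(tactic_id + " " + url)
    return 0
```

Calling `main(sys.argv)` after `import sys` would run it from the command line. In the real script, the `print` inside the loop would be replaced by the call to `apiCollector(url, tactic_id)`.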