I am new to Python and wrote a simple script that queries the USPS API and stores the results in a .csv file that I can reference later. A single query works, but I would like to scale this to about 2.2 million queries. Doing that in a plain for loop would take weeks, so I looked into multithreading as a way to run the requests in parallel. I have three issues:

  1. When I use more than about 15 threads, I get a connection error. The error is similar to the one in this question, but since my smaller queries work, I believe the server must be throttling me.

  2. How do I keep the original dictionary keys (the destination ZIP codes) in the output instead of having them replaced by "0, 1, 2, ..."?

  3. If I have to restrict my queries to small batches, can I keep a master file that I continuously append to as a for loop runs? I know that Python has a structure like this for adding to dictionaries.

Below is a minimal example of my code (the data set has to be big because the error only appears at large volume):

from xml.etree import ElementTree as ET
from threading import Thread
import numpy as np
import pandas as pd
import requests
import csv

# API Information
usps_uname = '536UNIVE4362'
usps_pw    = '462YK79VT194'
url = 'http://production.shippingapis.com/ShippingAPITest.dll?'
req = "StandardB"

# Single API query
def delivery(origin, destination):
  query = url + 'API=' + str(req) + '&XML=%3C' + str(req) + \
  'Request%20USERID=%22' + str(usps_uname) + '%22%3E' + \
  '%3COriginZip%3E' + "%05d" % origin + '%3C/OriginZip%3E' + \
  '%3CDestinationZip%3E' + "%05d" % destination + '%3C/DestinationZip%3E' + \
  '%3C/' + str(req) + 'Request%3E'
  data = requests.get(query, auth = (usps_uname, usps_pw))
  root = ET.fromstring(data.content)
  if root[2].text == "No Data":
    DeliveryTime = 99
  else:
    DeliveryTime = int(root[2].text)
  return DeliveryTime

# Returns delivery times from all origins to specified destination
def delivery_range(origin_range, destination, thread_index, store=None):
  store[thread_index] = [0] * len(origin_range)
  for i, x in enumerate(origin_range):
    store[thread_index][i] = delivery(x, destination)
  return store

# Threading attempt
def threaded_process(nthreads, origin_range):
  store = {}
  threads = []
  for i in range(nthreads):
    ids = origin_range.values()[i]
    destination = origin_range.keys()[i]
    t = Thread(target=delivery_range, args=(ids, destination, i, store))
    threads.append(t)
  for t in threads:
    t.start()
  for t in threads:
    t.join()
  return store

origin_range = {
2072: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
3063: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
6095: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
7001: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
8085: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
8691: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
15205: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17013: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17015: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
17339: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
18031: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
18202: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
19709: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
19720: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
21224: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
23803: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
23836: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390], 
28027: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
29172: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
29303: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
30344: [ 2072,  3063,  6095,  7001,  8085,  8691, 15205, 17013, 17015,
       17015, 17339, 18031, 18202, 19709, 19720, 21224, 23803, 23836,
       28027, 29172, 29303, 30344, 33182, 33570, 33811, 33897, 37090,
       37127, 37310, 37416, 40165, 40218, 40511, 41048, 42718, 46075,
       46075, 46168, 46168, 46231, 47130, 53144, 66219, 67337, 75019,
       75261, 76155, 76177, 77338, 78154, 85043, 85043, 85338, 85906,
       89030, 89408, 92374, 92408, 92408, 92551, 94560, 95206, 95304,
       95363, 98004, 98032, 98327, 98390],
}

ans = threaded_process(len(origin_range), origin_range)

writer = csv.writer(open('DeliveryTimes.csv', 'wb'))
for key, value in ans.items():
   writer.writerow([key, value])

If you don't get an error the first time, run the code again and it should error out.

Mallick Hossain

1 Answer

You can have your Python script take the origin and destination as command-line arguments (for instance, using the argparse module), and then use GNU Parallel (http://www.gnu.org/software/parallel/) to call the script once for every origin/destination combination.

GNU Parallel parallelizes at the OS level, running each call as a separate process, so you don't have to handle the concurrency in Python.
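
As a rough sketch of what such a script could look like: the URL, the API name, and the response parsing are taken from the question, while the file name `delivery_cli.py`, the `USPS_USERID` environment variable, and the helper names are assumptions made for illustration. It uses the standard library's `urllib` instead of `requests` so that it has no third-party dependencies, and it has not been run against the live API.

```python
"""delivery_cli.py (hypothetical name): look up one origin/destination pair."""
import argparse
import os
import sys
from urllib.parse import quote
from urllib.request import urlopen
from xml.etree import ElementTree as ET

URL = "http://production.shippingapis.com/ShippingAPITest.dll"  # from the question
API = "StandardB"                                               # from the question


def parse_args(argv=None):
    """Read the two ZIP codes from the command line."""
    parser = argparse.ArgumentParser(description="Query one ZIP code pair.")
    parser.add_argument("origin", type=int, help="origin ZIP code")
    parser.add_argument("destination", type=int, help="destination ZIP code")
    return parser.parse_args(argv)


def build_query(origin, destination, user_id):
    """Build the request URL with the XML payload percent-encoded."""
    xml = (
        '<{api}Request USERID="{uid}">'
        "<OriginZip>{o:05d}</OriginZip>"
        "<DestinationZip>{d:05d}</DestinationZip>"
        "</{api}Request>"
    ).format(api=API, uid=user_id, o=origin, d=destination)
    return "{}?API={}&XML={}".format(URL, API, quote(xml))


def fetch_days(origin, destination, user_id):
    """Call the API once; 99 marks "No Data", as in the question's code."""
    with urlopen(build_query(origin, destination, user_id), timeout=30) as resp:
        root = ET.fromstring(resp.read())
    text = root[2].text  # same element the question's code reads
    return 99 if text == "No Data" else int(text)


def main(argv=None, fetch=fetch_days):
    args = parse_args(argv)
    user_id = os.environ.get("USPS_USERID", "")  # assumed variable name
    days = fetch(args.origin, args.destination, user_id)
    # One CSV row per call, keyed by the actual ZIP codes rather than an index.
    print("{:05d},{:05d},{}".format(args.origin, args.destination, days))


# Run only when ZIP codes were actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Assuming two hypothetical files, `origins.txt` and `destinations.txt`, with one ZIP code per line, an invocation along the lines of `parallel -j 8 python delivery_cli.py {1} {2} :::: origins.txt :::: destinations.txt >> DeliveryTimes.csv` would run every combination, with `-j` capping the number of simultaneous requests (which may help with the throttling) and `>>` appending each row to a single master file.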

MauricioRoman