I have a CSV file that contains three columns: Forename, Surname and Date of Death. I need to parse each line of the CSV, extract the individual parts of the date of death, and build a custom URL that I can send as a request to a website. The response contains an HTML table, from which I need to extract data. The extracted data should then be stored in either a CSV or a txt file.
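To make the first two steps concrete, here is a minimal sketch of splitting the date and building the URL. The DD/MM/YYYY date format, the `https://example.com/search` endpoint, the query parameter names and the `build_url` helper are all assumptions for illustration, not the real site's API.

```python
from datetime import datetime
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the real site's search URL.
BASE_URL = "https://example.com/search"


def build_url(forename, surname, date_of_death, date_format="%d/%m/%Y"):
    """Split the date into day/month/year and build a query URL.

    The date format and the parameter names are assumptions.
    """
    died = datetime.strptime(date_of_death.strip(), date_format)
    params = {
        "forename": forename.strip(),
        "surname": surname.strip(),
        "day": died.day,
        "month": died.month,
        "year": died.year,
    }
    # urlencode escapes spaces and special characters in names.
    return BASE_URL + "?" + urlencode(params)


print(build_url("John", "Smith", "05/03/1998"))
# https://example.com/search?forename=John&surname=Smith&day=5&month=3&year=1998
```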
How would I make this more efficient via parallelisation, as there is a decent number of lines in this file that need processing?
The original version of this program was written in Java and works. I am looking to move it over to Python as I want to learn the language and I've heard that it is more efficient.
This is the relevant code that I have in Python so far. However, it isn't working at the moment; the error thrown is:
File "<ipython-input-23-17025eccd9eb>", line 4
print 'line[{}] = {}'.format(i, line)
^
SyntaxError: invalid syntax
import csv
with open("List_new.csv", "r") as f:
    reader = csv.reader(f, delimiter=", ")
    for i, line in enumerate(reader):
        print 'line[{}] = {}'.format(i, line)
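For reference, the SyntaxError comes from the Python 2 style print statement: in Python 3, print is a function and needs parentheses. Once that is fixed, `csv.reader` would also reject `delimiter=", "`, because the delimiter must be a single character. A corrected sketch follows, assuming the file has a space after each comma (handled by `skipinitialspace=True`). It reads from an in-memory string with made-up rows so that it runs without List_new.csv.

```python
import csv
import io

# Stand-in for open("List_new.csv", newline=""); the sample rows are made up.
sample = io.StringIO("John, Smith, 05/03/1998\nJane, Doe, 17/11/2004\n")

# delimiter must be one character; skipinitialspace drops the space after it.
reader = csv.reader(sample, delimiter=",", skipinitialspace=True)
rows = list(reader)

for i, line in enumerate(rows):
    # print is a function in Python 3, so the parentheses are required.
    print("line[{}] = {}".format(i, line))
```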
Essentially, I wish to go through the CSV file line by line, extract the relevant data, form a custom URL for each line, and send an HTTP request whose response can then be processed. I would also like the requests to be sent out asynchronously to speed up the process.
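One possible shape for the request and extraction steps is sketched below. It uses a thread pool from the standard library rather than true async I/O, which is often enough when the work is dominated by waiting on the network. The names `TableCells`, `parse_table`, `fetch_page` and `process_all` are hypothetical, and the parser assumes the response contains a plain table made of `<tr>`, `<td>` and `<th>` tags.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.request import urlopen


class TableCells(HTMLParser):
    """Collect the text of every <td>/<th>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def parse_table(html):
    """Return the table rows found in html as lists of cell strings."""
    parser = TableCells()
    parser.feed(html)
    return parser.rows


def fetch_page(url):
    """Download one page; a failed request raises an exception."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def process_all(urls, fetcher=fetch_page, max_workers=8):
    """Fetch every URL on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = pool.map(fetcher, urls)
        return [parse_table(page) for page in pages]


# Offline demonstration with a fake fetcher instead of real requests.
def fake_fetch(url):
    return "<table><tr><td>" + url + "</td><td>ok</td></tr></table>"


print(process_all(["a", "b"], fetcher=fake_fetch))
# [[['a', 'ok']], [['b', 'ok']]]
```

The rows returned for each URL could then be written out with `csv.writer`. Keeping `max_workers` modest avoids flooding the website with simultaneous requests.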
Any help would be much appreciated!