Python: Excel to Web to PDF

Question

I'm new to programming and am looking for the best way to pull PDFs of a series of water bills from a city website. I have been able to open the webpage and look up an account using an account number from an Excel list; however, I am having trouble creating a loop to run through all the accounts without rewriting code. I have some ideas, but I'm guessing that better suggestions exist. See below for the intro code:

    import bs4, requests, openpyxl, os

    os.chdir('C:\\Users\\jsmith.ABCINC\\Desktop')

    addresses = openpyxl.load_workbook('WaterBills.xlsx')
    type(addresses)
    sheet = addresses.get_sheet_by_name('Sheet1')
    cell = sheet['A1']
    cell.value

    from selenium import webdriver
    browser = webdriver.Firefox()
    browser.get('https://secure.phila.gov/WRB/WaterBill/Account/GetAccount.aspx')
    elem = browser.find_element_by_css_selector('#MainContent_AcctNum')
    elem.click()
    elem.send_keys(cell.value)
    elem = browser.find_element_by_css_selector('#MainContent_btnLookup')
    elem.click()

Thanks for your assistance!

Here's a start for creating a loop. I'm assuming you have a list of values stored down the spreadsheet's 'A' column: `accounts = [row.value for row in sheet.columns[0]]`. You can then iterate over the account values with `for account in accounts:`. Is that what you're looking for? – freebie, Aug 16 '16 at 18:48
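A minimal sketch of the idea in this comment, kept free of openpyxl so it runs anywhere. It takes the raw values read from column A (openpyxl returns `None` for empty cells and may return numbers as `int` or `float`) and turns them into clean strings ready to type into the search box. The helper name `clean_account_numbers` and the sample values are hypothetical.

```python
def clean_account_numbers(values):
    """Turn raw cell values into a list of account-number strings."""
    cleaned = []
    for value in values:
        if value is None:
            continue  # empty cell
        if isinstance(value, float) and value.is_integer():
            value = int(value)  # e.g. 1112223.0 becomes 1112223
        text = str(value).strip()
        if text:
            cleaned.append(text)
    return cleaned


# Stand-in for the values read from column A (made-up numbers).
raw_values = [1234567, None, " 7654321 ", 1112223.0, ""]
accounts = clean_account_numbers(raw_values)

for account in accounts:
    print(account)
```

Passing strings rather than raw cell values to `send_keys` keeps the lookup from depending on how Excel happened to store each number.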

score 1 · Accepted Answer · answered Aug 16 '16 at 19:57

I couldn't find a nice way to download the PDF, but here's everything except that:

    import openpyxl

    from selenium import webdriver


    workbook        = openpyxl.load_workbook('WaterBills.xlsx')
    sheet           = workbook.get_sheet_by_name('Sheet1')
    column_a        = sheet.columns[0]
    account_numbers = [row.value for row in column_a if row.value]

    browser = webdriver.Firefox()
    browser.get('https://secure.phila.gov/WRB/WaterBill/Account/GetAccount.aspx')

    for account_number in account_numbers:
        search_box = browser.find_element_by_id('MainContent_AcctNum')
        search_box.click()
        search_box.send_keys(account_number)

        search_button = browser.find_element_by_id('MainContent_btnLookup')
        search_button.click()

        # TODO: download the page as a PDF

        browser.back()

    browser.quit()
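
One possible way to fill in the TODO, sketched under the assumption that Selenium 4 or later is installed (it postdates this answer): its drivers expose `print_page()`, which returns the current page as a base64-encoded PDF. The helper name `save_page_as_pdf` and the file-naming scheme are hypothetical, and the stand-in browser class exists only so the sketch runs without launching a real browser.

```python
import base64
import os
import tempfile


def save_page_as_pdf(browser, account_number, folder):
    """Write the browser's current page to <folder>/<account_number>.pdf."""
    pdf_base64 = browser.print_page()  # Selenium 4+: base64-encoded PDF
    pdf_path = os.path.join(folder, "{}.pdf".format(account_number))
    with open(pdf_path, "wb") as pdf_file:
        pdf_file.write(base64.b64decode(pdf_base64))
    return pdf_path


class FakeBrowser:
    """Stand-in for a Selenium driver (assumption: illustration only)."""

    def print_page(self):
        return base64.b64encode(b"%PDF-1.4 placeholder").decode("ascii")


folder = tempfile.mkdtemp()
saved_path = save_page_as_pdf(FakeBrowser(), "1234567", folder)
print(saved_path)
```

Inside the loop, the call would sit where the TODO is, for example `save_page_as_pdf(browser, account_number, os.getcwd())`, before `browser.back()`.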

answered Aug 16 '16 at 19:57

freebie


http://stackoverflow.com/questions/23359083/how-to-convert-webpage-into-pdf-by-using-python – David Zemens Aug 16 '16 at 20:01
@DavidZemens I tried the pdfkit example with no success on Python 3.5 – freebie Aug 16 '16 at 20:03
@freebie -- Thank you! This is an enormous help. Now for that pdf.. Is the hindrance with pdfkit the fact that there is no pdfkit for Python 3.5? – Ashley Aug 16 '16 at 20:53
@Ashley the docs say it's compatible with python 2 and 3 https://pypi.python.org/pypi/pdfkit – David Zemens Aug 16 '16 at 21:10
@DavidZemens, I saw that as well, but I'm in the same boat as @freebie; I haven't been able to get it to work. I've also been trying to figure out how to get pyqt5 or html2pdf to work, to no avail. – Ashley Aug 16 '16 at 21:16
@Ashley consider [editing](http://stackoverflow.com/posts/38979791/edit) your question so that it includes *only* the code you're currently using, as well as the problems/errors (with full traceback) when attempting to implement the `pdfkit`. Also might want to revise your title so that it's clear you have an error with that module in python 3.5 – David Zemens Aug 16 '16 at 21:20
@DavidZemens, should I edit the question to address the loop alone so I can give Freebie the credit they are due and repost the html to pdf question as a new query? – Ashley Aug 16 '16 at 21:34
Yes that would also be a good way to do it, since Freebie was able to answer part of a multipart question. Good luck! – David Zemens Aug 16 '16 at 21:37
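
For anyone retrying pdfkit: it is a wrapper around the separate `wkhtmltopdf` program, so one thing worth ruling out (not confirmed as the cause in this thread) is that the executable is missing from `PATH`. A minimal sketch of that check, with hypothetical helper names; `pdfkit.from_string` is only reached when the executable is found:

```python
import shutil


def wkhtmltopdf_path():
    """Return the full path to wkhtmltopdf, or None if it is not on PATH."""
    return shutil.which("wkhtmltopdf")


def html_to_pdf(html, output_path):
    """Convert an HTML string to a PDF file with pdfkit."""
    if wkhtmltopdf_path() is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH; install it first")
    import pdfkit  # third-party; imported late so the check above runs first
    pdfkit.from_string(html, output_path)


print(wkhtmltopdf_path() or "wkhtmltopdf is not installed")
```

With Selenium, the HTML string could come from `browser.page_source` after the lookup, though pages that depend on external stylesheets or scripts may not render the same way.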
This code has been working for me for quite some time, but now the webdriver is no longer driving the web! I get a very long error (Traceback (most recent call last): File "C:/Users/...", line 11, in ...) about 60 seconds after the following code is entered: (browser = webdriver.Firefox()) I have found this error in a search, but it hasn't helped. I've updated Selenium but it didn't help. Suggestions? – Ashley Sep 20 '16 at 20:18
Not sure from that stack trace excerpt. Have you tried making sure you're running the latest Firefox and using the latest version of Selenium? – freebie Sep 20 '16 at 20:25
Yes. After a little more searching, it seems that Firefox 49 is not compatible with Selenium 2.53.6. It was suggested that I use Marionette Driver. Does that seem right? – Ashley Sep 20 '16 at 21:00
Not sure, not heard of that before. I've actually started using Selenium for one of my projects since finding out about it from your original question. Hope that 49 not being supported is only momentary. I had heard there is a driver for Chrome as well so you could try that – freebie Sep 20 '16 at 21:44
