I would use several Python idioms to clean up the code:
- Wrap all code in functions
  - Generally speaking, putting your code in functions makes it easier to read and follow.
  - When you run a Python script (`python foo.py`), the interpreter executes the file from top to bottom, one statement at a time. When it reaches a function definition, it executes only the `def bar():` statement itself, which creates the function; the code inside the function does not run until the function is called.
  - This article seems like a good place to get more info on it: Understanding Python's Execution Model
- Use the `if __name__ == "__main__":` idiom to make it an importable module
  - Like wrapping your code in functions, this gives you more control over how and when your code executes, how portable it is, and how reusable it is.
  - "Importable module" means you can write your code in one file and then import that code from another module.
  - More info on `if __name__ == "__main__":` here: What does if __name__ == "__main__": do?
- Use try/finally to make sure your driver instances get cleaned up
- Use explicit waits to interact with the page so you don't need to use `sleep`
  - By default, Selenium tries to find and return elements immediately. If the element hasn't loaded yet, Selenium raises a `NoSuchElementException` rather than waiting for it to load.
  - Explicit waits are built into Selenium and let your code wait for an element to load into the page. By default, the wait checks every half second to see whether the element has loaded. If it has, the wait returns the element; if it hasn't, the wait tries again half a second later. If the element never loads before the timeout expires, the `WebDriverWait` object raises a `TimeoutException`.
  - More here: Explicit and Implicit Waits
  - And here: WAIT IN SELENIUM PYTHON
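To make the polling behaviour described in the last bullet concrete, here is a minimal pure-Python sketch of the idea. It is not Selenium's implementation: `wait_until` and `element_loaded` are hypothetical names made up for illustration, and the real `WebDriverWait` additionally passes the driver to the condition and ignores `NoSuchElementException` between polls.

```python
import time


def wait_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if no truthy value is seen before the deadline.
    (Hypothetical helper; Selenium raises its own TimeoutException.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


# Stand-in for a page element that only "loads" on the third check.
attempts = []


def element_loaded():
    attempts.append(1)
    return "element" if len(attempts) >= 3 else None


found = wait_until(element_loaded, timeout=2.0, poll_frequency=0.1)
print(found, len(attempts))  # element 3
```

The caller never sleeps for a fixed time: the wait returns as soon as the condition holds and only fails once the timeout is used up.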
Code (untested for obvious reasons):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from explicit import waiter, ID  # This package makes explicit waits easier to use
                                     # pip install explicit

    # Are any of these needed?
    # import time
    # import bs4
    # import gspread
    # from oauth2client.service_account import ServiceAccountCredentials

    # Placeholder: replace with your bank's login URL
    bank_url = "https://bank.example.com/login"


    def bank_login(driver, username, password):
        """Log into the bank account"""
        waiter.find_write(driver, 'dUsername', username, by=ID)
        waiter.find_write(driver, 'password', password, by=ID, send_enter=True)


    def get_amount(driver, source):
        """Click the page and scrape the amount"""
        # Click the page in question
        waiter.find_element(driver, source, by=By.LINK_TEXT).click()

        # Why are you using beautiful soup? Because it is faster?
        # time.sleep(3)
        # html = driver.page_source
        # soup = bs4.BeautifulSoup(html)
        # elems=soup.select('#CurrentBalanceAmount')
        # SavingsAcc = float(elems[0].getText().strip('$').replace(',',''))
        # driver.back()

        # I would do it this way:
        # When using explicit waits there is no need to explicitly sleep
        amount_str = waiter.find_element(driver, "CurrentBalanceAmount", by=ID).text
        # This conversion scheme will handle non-$ characters too
        amount = float("".join([char for char in amount_str if char in "1234567890."]))
        driver.back()
        return amount


    def main():
        driver = webdriver.Chrome()
        try:
            driver.get(bank_url)
            bank_login(driver, 'username', 'password')
            print(sum([get_amount(driver, source) for source in ['Savings', 'cheque']]))
        finally:
            # Use this try/finally idiom to prevent a bunch of dead browser instances
            driver.quit()


    if __name__ == "__main__":
        main()
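The amount conversion can be checked on its own, without a browser. This is a small sketch under the assumption that balances look like `$1,234.56`; `parse_amount` is a hypothetical helper name, not part of Selenium or `explicit`. Note that the membership test has to be against the string `"1234567890."`: testing against a list containing that string, `["1234567890."]`, would reject every single character.

```python
def parse_amount(amount_str):
    """Keep only digits and the decimal point, then convert to float."""
    digits = "".join(char for char in amount_str if char in "1234567890.")
    return float(digits)


print(parse_amount("$1,234.56"))  # 1234.56
print(parse_amount(" $ 80.00 "))  # 80.0

# A single character is never equal to the whole string inside the list.
print("1" in ["1234567890."])  # False
```

One caveat of this scheme: it also drops a leading minus sign, so an overdrawn balance would come back as a positive number.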
Full disclosure: I maintain the `explicit` package. You could replace the `waiter` calls above with relatively short `WebDriverWait` calls if you would prefer. If you are using Selenium with any regularity, it is worth investing the time to understand and use explicit waits.