
I am very new to Python and I am looking to scrape the following website: Link

I think that Selenium might be the right tool, and I started to write the following code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

path='http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx'

browser = webdriver.Firefox()
browser.get(path)

elem = browser.find_element_by_id('txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)

print(browser.current_url)

So far so good, it works. However, the value returned by browser.current_url is not what is displayed in the URL bar of my browser. The script prints:

//planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx

However, the URL bar in the browser shows this:

//planning.hackney.gov.uk/Northgate/PlanningExplorer/Generic/StdResults.aspx?PT=Planning%20Applications%20On-Line&SC=Postcode%20is%20E9%207JP&FT=Planning%20Application%20Search%20Results&XMLSIDE=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/Menus/PL.xml&XSLTemplate=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/xslt/PL/PLResults.xslt&PS=10&XMLLoc=/Northgate/PlanningExplorer/Generic/XMLtemp/j5jzxiwxklgslnam4qffypw5/052dd052-3993-4f10-83aa-dd0c6c326676.xml

Now I wonder how to get hold of this address.

Thanks a lot!

demouser123
flow.v
  • Could you add your Python version, python-selenium version, firefoxdriver version and Firefox version to the post? I could not reproduce your issue using python3, python-selenium 2.53, firefoxdriver 2.53 and Firefox 45.9.0. Running your script gives me the longer URL, just as you expected. – Mirek Długosz Apr 30 '17 at 17:30
  • Thanks for your answer. My Python version is 3.6.1, Selenium is 3.4.0, and Firefox is 53. I have no idea how to find the version of the Firefox driver, but it's the newest one. I installed Selenium just a few days ago. – flow.v May 01 '17 at 20:30

1 Answer


Did you make any other request between checking the URL returned by your script and the URL shown by the browser? The request sent after Keys.RETURN adds a session identifier to the URL, which might be why you are getting a different URL.

I have this script

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
chromepath = 'chrome_driver_path'  # change this to your chromedriver path
driver = webdriver.Chrome(chromepath)

driver.get('http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx')

print(driver.current_url)

elem = driver.find_element_by_id('txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)

print(driver.current_url)

driver.quit()

The keypress code is copied from your own code. I get an identical URL from both the browser and the script.

Script gives me this URL: Link
Browser gives me the same URL: copied manually

demouser123
  • Perfect, thanks a lot! It works! The same code gives me different results depending on the driver I am using: the Firefox driver gives me the short address, while the Chrome driver gives me the long one, which is the one I need. – flow.v May 01 '17 at 20:49
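
The thread does not establish why the two drivers differ. One possible explanation, offered here only as an assumption, is that the Firefox driver returns from send_keys before the results page has loaded, so current_url is read while the search form is still the current page. A minimal sketch under that assumption: poll current_url until it differs from the value it had before the form was submitted. The helper name wait_for_url_change and its timeout and poll parameters are hypothetical and not part of Selenium.

```python
import time


def wait_for_url_change(driver, old_url, timeout=10.0, poll=0.2):
    """Poll driver.current_url until it differs from old_url.

    Hypothetical helper, not part of Selenium. `driver` can be any
    object exposing a `current_url` attribute, such as a WebDriver.
    Returns the new URL, or raises TimeoutError if the URL is still
    unchanged once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = driver.current_url
        if current != old_url:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "URL still %r after %s seconds" % (old_url, timeout)
            )
        # Sleep briefly so the loop does not hammer the driver.
        time.sleep(poll)
```

With the code from the question, this would mean storing old = browser.current_url before elem.send_keys(Keys.RETURN) and printing wait_for_url_change(browser, old) afterwards. Selenium also ships WebDriverWait together with expected_conditions.url_changes, which covers the same case without a hand-written loop.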