7

I'm trying to scrape this website using Python and Selenium. It requires you to select a date from a drop-down box and then click Search to view the planning applications.

URL: https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList.

I have the code working to select the first index of the drop-down box and press Search. How would I open multiple windows for all the date options in the drop-down box, or go through them one by one, so I can scrape each of them?

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options


options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('/Users/weaabduljamac/Downloads/chromedriver', 
chrome_options=options)

url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
driver.get(url)

select = Select(driver.find_element_by_xpath('//*[@id="selWeek"]'))
select.select_by_index(1)

button = driver.find_element_by_id('csbtnSearch')
button.click()

app_numbers = driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text
print(app_numbers)

Drop-down box HTML:

<select class="formitem" id="selWeek" name="selWeek">
   <option selected="selected" value="2018,31">Week commencing Monday 30 July 2018</option>
   <option value="2018,30">Week commencing Monday 23 July 2018</option>
   <option value="2018,29">Week commencing Monday 16 July 2018</option>
   <option value="2018,28">Week commencing Monday 9 July 2018</option>
   <option value="2018,27">Week commencing Monday 2 July 2018</option>
   <option value="2018,26">Week commencing Monday 25 June 2018</option>
   <option value="2018,25">Week commencing Monday 18 June 2018</option>
   <option value="2018,24">Week commencing Monday 11 June 2018</option>
   <option value="2018,23">Week commencing Monday 4 June 2018</option>
   <option value="2018,22">Week commencing Monday 28 May 2018</option>
</select>
undetected Selenium
Abdul Jamac
  • I guess you need to use switch_to_window. Look at this post: https://stackoverflow.com/questions/17325629/how-to-open-a-new-window-on-a-browser-using-selenium-webdriver-for-python – James Aug 07 '18 at 09:24
  • What you are asking can very easily be done in WATIR (which sits on top of the Ruby Selenium binding). If you are ready to consider Ruby, I can help you. – Rajagopalan Aug 07 '18 at 10:32

3 Answers

3

As per your question, you won't be able to open multiple windows for the different drop-down options, because the <option> tags don't contain any href attribute. The results will always render in the same browser window.

However, to select each date from the drop-down in turn and then click Search to view the planning applications, you can use the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import Select
    from selenium.webdriver.chrome.options import Options
    
    options = Options()
    options.add_argument('--headless')
    options.add_argument("start-maximized")
    options.add_argument('disable-infobars')
    driver=webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
    driver.get(url)
    
    select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
    list_options = select.options
    for item in range(len(list_options)):
        select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
        select.select_by_index(item)
        driver.find_element_by_css_selector("input.formbutton#csbtnSearch").click()
        print(driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text)
        driver.get(url)
    driver.quit()
    
  • Console Output:

    18/06760/FUL
    18/07187/LBC
    18/06843/FUL
    18/06705/FUL
    18/06449/FUL
    18/05534/FUL
    18/06030/DEM
    18/05784/FUL
    18/05914/LBC
    18/05241/FUL
    

trivia

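To scrape all the links rather than only the first one, replace:

find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a')

with the following, which returns a list of elements and drops the [1] row index so that every table row matches. Read .text from each element in the returned list:

find_elements_by_xpath('//*[@id="form1"]/table/tbody/tr/td[1]/a')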
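Putting the loop and the multi-row lookup together, a minimal sketch of collecting every application number for every week might look like the following. The function name `scrape_all_weeks`, the constants, and the XPath without a row index are illustrative assumptions, not something confirmed against the live page. It uses the generic `find_element` / `find_elements` calls with plain locator strings and clicks the `<option>` directly in place of `Select.select_by_index`, so it only needs a driver object passed in.

```python
# Locator strategy strings; these are the same values as selenium's
# By.XPATH and By.CSS_SELECTOR constants.
XPATH = "xpath"
CSS = "css selector"

URL = "https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList"
# Assumption: dropping the row index from the original XPath matches the
# application link in every result row.
LINKS_XPATH = '//*[@id="form1"]/table/tbody/tr/td[1]/a'


def scrape_all_weeks(driver, url=URL):
    """Return {week label: [application numbers]} for every drop-down option."""
    results = {}
    driver.get(url)
    week_count = len(driver.find_elements(CSS, "#selWeek option"))
    for index in range(week_count):
        # Reload and re-locate on every pass: the search navigates away,
        # so elements found on an earlier pass would be stale.
        driver.get(url)
        option = driver.find_elements(CSS, "#selWeek option")[index]
        label = option.text
        option.click()
        driver.find_element(CSS, "#csbtnSearch").click()
        links = driver.find_elements(XPATH, LINKS_XPATH)
        results[label] = [link.text for link in links]
    return results
```

With a real browser you would pass in the `driver` created in the code block above, for example `scrape_all_weeks(driver)`, and call `driver.quit()` afterwards.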
undetected Selenium
  • Hello, thank you for the response. This sort of does what I want. Would I be able to scrape all the application numbers on each page before going to the next date? – Abdul Jamac Aug 07 '18 at 11:13
  • Because ideally I would like to scrape the application number, address, proposal and status on each page, then write them to a JSON file. – Abdul Jamac Aug 07 '18 at 11:15
  • @AbdulJamac Isn't `like to scrape application number, address , proposal , status on each page` different from **`for all the date options in the drop-down box or go through them one by one`**? Please raise a new question for your new requirement. Stack Overflow volunteers will be happy to help you out. – undetected Selenium Aug 07 '18 at 11:18
  • Alright, I've done that. Thank you for the help, I really appreciate it. – Abdul Jamac Aug 07 '18 at 11:41
0

I am pretty sure this is not possible. You would have to loop through the options, store the data from each one somewhere, and append the new data from each drop-down option as you go.
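As a rough sketch of the "store and append" part, assuming you already have the list of application numbers for each week: the helper names `add_week` and `save_records` and the record layout are made up for illustration. The two sample values are the first two lines of console output shown in the answer above.

```python
import json


def add_week(records, week_label, application_numbers):
    """Append one record per application number scraped for a week."""
    for number in application_numbers:
        records.append({"week": week_label, "application_number": number})
    return records


def save_records(records, path):
    """Write everything collected so far to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)


# Accumulate results across drop-down options, one call per week.
records = []
add_week(records, "Week commencing Monday 30 July 2018", ["18/06760/FUL"])
add_week(records, "Week commencing Monday 23 July 2018", ["18/07187/LBC"])
```

Calling `save_records(records, "applications.json")` once the loop has finished writes the whole list in one go; the file name is only an example.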

Hope this helps.

0

You can ctrl + click the Search button to open the results in a new window, scrape the data, and then return to the first page to select the next option.

# assumes `driver` and `Select` from the question's code
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

# original window to switch back to
window_before = driver.window_handles[0]

select = Select(driver.find_element_by_id('selWeek'))
options = select.options
for option in options:
    select.select_by_visible_text(option.text)

    # ctrl + click to open the results in a new window
    button = driver.find_element_by_id('csbtnSearch')
    ActionChains(driver).key_down(Keys.CONTROL).click(button).key_up(Keys.CONTROL).perform()

    # switch to the new window
    driver.switch_to_window(driver.window_handles[1])

    # scrape the data here

    # close the new window and return to the original one
    driver.close()
    driver.switch_to_window(window_before)
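The comments on this answer report an IndexError on `driver.window_handles[1]` when the click does not open a second window. One way to guard against that, sketched here with hypothetical helper names (`find_new_window`, `switch_to_new_window`), is to compare the handles before and after the click and only switch when a new one has appeared:

```python
def find_new_window(handles_before, handles_after):
    """Return the handle that appeared after the click, or None if none did."""
    known = set(handles_before)
    for handle in handles_after:
        if handle not in known:
            return handle
    return None


def switch_to_new_window(driver, handles_before):
    """Switch to a newly opened window; return False if nothing new opened."""
    handle = find_new_window(handles_before, driver.window_handles)
    if handle is None:
        return False
    driver.switch_to.window(handle)
    return True
```

In the loop you would record `handles_before = list(driver.window_handles)` before the ctrl + click, then call `switch_to_new_window(driver, handles_before)` and skip the scrape-and-close steps when it returns False.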
Guy
  • Are there any imports needed? I've got these errors: 1. Instance of 'Select' has no 'select_by_deselect_by_visible_text' member 2. Undefined variable 'ActionChains' 3. Undefined variable 'Keys' 4. Undefined variable 'window_handles' – Abdul Jamac Aug 07 '18 at 09:39
  • @AndreiSuvorkov We won't know until he tries, won't we? – Guy Aug 07 '18 at 09:44
  • I've still got the error: 'Undefined variable 'window_handles'' – Abdul Jamac Aug 07 '18 at 09:46
  • @AbdulJamac `select.select_by_visible_text` (one `select.`) and `driver.window_handles[1]` (add `driver.`) – Guy Aug 07 '18 at 09:47
  • Yeah, thank you, that removed all the code errors. Now when I run it I get driver.switch_to_window(driver.window_handles[1]) IndexError: list index out of range – Abdul Jamac Aug 07 '18 at 09:48
  • @AbdulJamac Did the click open a new window? – Guy Aug 07 '18 at 09:50
  • No, it goes to the first page and then prints out that error. No other windows are opened. – Abdul Jamac Aug 07 '18 at 09:56
  • I think it's because you can open in a new tab when you right-click Search. – Abdul Jamac Aug 07 '18 at 10:07
  • Is there a way to do command + left click on the Search button to open a new tab? That seems to work on the website. – Abdul Jamac Aug 07 '18 at 10:29