For work, I was previously able to scrape the following website using "driver = webdriver.PhantomJS()". I was scraping the price and the date.

https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf

This stopped working a few days ago because of a disclaimer page that I have to agree to first.

https://www.cash.ch/fonds-investor-disclaimer?redirect=fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf

Once I agree, I can see the real content in the browser, but the driver apparently cannot: the printout is [], so it must still be on the disclaimer URL.

Please see code below.

    from selenium import webdriver
    from bs4 import BeautifulSoup
    import csv
    import os

    driver = webdriver.PhantomJS()
    driver.set_window_size(1120, 550)

    #Swisscanto
    driver.get("https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf")
    s_swisscanto = BeautifulSoup(driver.page_source, 'lxml')
    nav_sc = s_swisscanto.find_all('span', {"data-field-entry": "value"})
    date_sc = s_swisscanto.find_all('span', {"data-field-entry": "datetime"})

    print(nav_sc)
    print(date_sc)
    print("Done Swisscanto")
halfer
Shanshan
  • Try to find out if the disclaimer sets any cookie and set it before scraping. And you should check if you comply with the disclaimer, because it is there for a reason. – Marged Jul 01 '17 at 21:33
  • Hi Marged, could you explain further how to check whether the disclaimer sets any cookie? Thank you. – Shanshan Jul 03 '17 at 14:05
  • Please have a look at my code, really short, just added to the post. – Shanshan Jul 04 '17 at 13:40
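One way to check whether agreeing to the disclaimer sets a cookie is to compare the driver's cookies before and after the click. The following is only a minimal sketch: the helper names `new_cookies`, `cookies_set_by_click` and `demo` are made up for illustration, it assumes the Selenium 3 style `find_element_by_link_text` call and the "zustimmen" link text from the answer below, and whether cash.ch actually stores the consent in a cookie is not confirmed here.

```python
def new_cookies(before, after):
    """Return the cookies in `after` whose names do not appear in `before`."""
    seen = {cookie["name"] for cookie in before}
    return [cookie for cookie in after if cookie["name"] not in seen]


def cookies_set_by_click(driver, url, link_text):
    """Load `url`, click the link labelled `link_text`, and return cookies added by the click."""
    driver.get(url)
    before = driver.get_cookies()  # list of dicts with "name", "value", ...
    driver.find_element_by_link_text(link_text).click()
    return new_cookies(before, driver.get_cookies())


def demo():
    """Hypothetical usage against the page from the question (not called automatically)."""
    from selenium import webdriver

    driver = webdriver.PhantomJS()
    url = "https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf"
    for cookie in cookies_set_by_click(driver, url, "zustimmen"):
        print(cookie["name"], cookie["value"])
    driver.quit()
```

If this prints a cookie, the same session (or a later one, via `driver.add_cookie`) may be able to skip the disclaimer. If it prints nothing, the consent is probably stored some other way.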

1 Answer

This should work (I think the button you want to click is "zustimmen"?)

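    from selenium import webdriver

    driver = webdriver.PhantomJS()
    driver.get("https://www.cash.ch/fonds/swisscanto-ast-avant-bvg-portfolio-45-p-19225268/swc/chf")

    # Click the "zustimmen" (agree) link on the disclaimer page
    accept_button = driver.find_element_by_link_text('zustimmen')
    accept_button.click()

    # Page source as seen by the driver after the click
    content = driver.page_source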

More details here: "python selenium click on button".

whieronymus
  • Hi there, thank you very much! The situation is weird though... After clicking on "zustimmen" once, the browser remembers this and never asks again. The browser shows the page with price and date, but the scraped page is still the one with the disclaimer. When scraping price and date I get [] printed out, meaning I still didn't get to the real page. Do you know why, and how to fix this? Many thanks! – Shanshan Jul 02 '17 at 13:09
  • Hi, I kind of got stuck there. Could you be so kind as to run this short code? Once I agree, the real website is shown, but my printout is empty. – Shanshan Jul 04 '17 at 13:34
  • Please have a look at my code, really short, just added to the post. – Shanshan Jul 04 '17 at 13:40
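One possible explanation for the empty result, which is an assumption and not something confirmed in this thread, is that `driver.page_source` is read before the redirect away from the disclaimer has finished. A small polling helper can make that visible. The sketch below uses only the standard library plus the driver's `current_url` and `page_source` properties. The helper name `wait_until_past_disclaimer` is hypothetical, and the `data-field-entry` marker is taken from the selectors in the question. Selenium's own `WebDriverWait` could be used for the same purpose.

```python
import time


def wait_until_past_disclaimer(driver, marker="data-field-entry", timeout=10.0, poll=0.5):
    """Poll until the driver has left the disclaimer URL and `marker` is in the page source.

    Returns True on success and False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Both conditions are assumptions about how the real fund page differs
        # from the disclaimer page.
        if "disclaimer" not in driver.current_url and marker in driver.page_source:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Called right after `accept_button.click()`, a `False` result (together with printing `driver.current_url`) would indicate that the driver really is still on the disclaimer page, while a `True` result means `driver.page_source` can be handed to BeautifulSoup as in the question.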