I am trying to extract at least some tags from this site, but the result is always None. I have no idea how to fix this.

The page has a Tickets button. After you press it, an additional panel opens at the side, and that panel is what I want to parse, but I cannot work out how. As far as I understand, the panel is not loaded immediately after the click, and I do not know what to do next. P.S. I have only just started learning this.

# coding: utf-8-sig
import urllib.request
from bs4 import BeautifulSoup

# Pretend to be a desktop browser so the server does not reject the request.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}

def get_html(url):
    # Download the raw HTML exactly as the server sends it (no JavaScript is run).
    request = urllib.request.Request(url, None, headers)
    response = urllib.request.urlopen(request)
    return response.read()

def parse(html):
    soup = BeautifulSoup(html, "html.parser")
    # Look for a <body> element that has the class "panel-open".
    table = soup.find('body', class_='panel-open')
    print(table)

def main():
    parse(get_html('http://toto-info.co/'))

if __name__ == '__main__':
    main()
  • Sites like this one can sometimes be scraped with the aid of Selenium (*see* http://selenium-python.readthedocs.io/). One thing you can do with Selenium is to use the `execute_script` method of `webdriver` to execute JavaScript code. For instance, you can execute `return document.documentElement.outerHTML` to get the rendered page. I understand that the HTML5 API also makes it possible to write to local storage; however, I have yet to work out the details. – Bill Bell May 16 '17 at 20:11

1 Answer

That is because, in the HTML the server returns for http://toto-info.co/, the body element has no class "panel-open", so soup.find returns None.

You can see what the body element contains by changing the line in your code:

table = soup.find('body', class_='panel-open')

to

table = soup.find('body')

This will now print the body element and all the elements it contains.

As you will see, the body element contains very little except script elements. If you want the output those scripts produce, you will have to use a tool that actually runs JavaScript. As a starting point, I suggest searching for something like "web scraping a JavaScript page with Python".
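One way to follow up on the rendering point above is Selenium, which drives a real browser so that the scripts actually run. The following is only a minimal sketch under assumptions: `fetch_rendered_html`, `body_has_class`, and the `#tickets` selector are names invented for illustration (the real selector for the Tickets button is not known from the question), and a real page may need an explicit wait after the click before the panel appears.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(driver, url, click_selector=None):
    """Load url in a Selenium-style driver and return the DOM after scripts ran."""
    driver.get(url)
    if click_selector is not None:
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        driver.find_element("css selector", click_selector).click()
    # Ask the browser for the current DOM, not the HTML the server first sent.
    return driver.execute_script("return document.documentElement.outerHTML")


def body_has_class(html, class_name):
    """Return True if the <body> element in html carries class_name."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("body", class_=class_name) is not None


def main():
    # Imported here so the two helpers can be tried without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        # Hypothetical selector: inspect the page to find the real Tickets button.
        html = fetch_rendered_html(
            driver, "http://toto-info.co/", click_selector="#tickets"
        )
        print(body_has_class(html, "panel-open"))
    finally:
        driver.quit()
```

Calling `main()` opens a browser window, so it is left uncalled here. The helpers take the driver as a parameter, which keeps the parsing step separate from the browser and makes it easy to try with saved HTML first.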

If you are interested, here is an example that does select something by class:

table = soup.find('div', class_='standalone')

On this page, though, it only selects:

<div class="standalone" data-app="" id="app"></div>

That is about all of the markup on this page that is present without running JavaScript.
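To compare the lookups discussed in this answer without touching the network, here is a small sketch that parses an assumed stand-in for the static markup. The `STATIC_HTML` string, its title, and the `app.js` script name are made up for illustration. Only the `div` line is taken from the page as quoted above.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for what the server returns before any JavaScript runs.
STATIC_HTML = """
<html>
  <head><title>demo</title></head>
  <body>
    <div class="standalone" data-app="" id="app"></div>
    <script src="app.js"></script>
  </body>
</html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# The body has no class at all, so filtering on class_ finds nothing.
missing = soup.find("body", class_="panel-open")

# Without the class filter the body is found, scripts included.
body = soup.find("body")

# The empty placeholder div is the only element selectable by class here.
app_div = soup.find("div", class_="standalone")

print(missing)  # None
print([child.name for child in body.find_all(True)])
print(app_div)
```

The placeholder `div` is empty because the scripts have not run, which is why the panel never shows up in the downloaded HTML.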

Dan-Dev