
I have written a Python (version 3.4.3) script that uses BeautifulSoup to fetch the contents of an HTML page. The page contains a paragraph tag that is initially empty, but I have added an event listener on the window that adds some content to that paragraph.

Python script:

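    from bs4 import BeautifulSoup
    import requests
    import os
    from urllib.parse import urljoin


    def url():
        # Fetch the page and parse the returned HTML.
        res = requests.get('http://localhost:8000/a.html')
        soup = BeautifulSoup(res.text, 'html.parser')
        # Print the text of the first <p> element.
        print(soup.p.get_text())


    url()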


HTML file:

    <!DOCTYPE html>
    <html>
    <body>
    <p id="demo"></p>
    <script>
    var x = document.getElementById("demo");

    function getLocation() {
        if (navigator.geolocation) {
            navigator.geolocation.getCurrentPosition(showPosition);
        } else {
            x.innerHTML = "Geolocation is not supported by this browser.";
        }
    }

    function showPosition(position) {
        x.innerHTML = "It worked";
    }
    window.addEventListener('load',getLocation);
    </script>

    </body>
    </html>

The problem is that the last line of the Python script prints a blank line instead of the content that was dynamically added to the paragraph.

I think the problem is with the addEventListener, as I am not actually opening the page in a browser.
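A minimal sketch of that suspicion, assuming a trimmed copy of the page above: a parser that only reads the markup never runs the script, so the paragraph stays empty. To stay self-contained it uses the standard library's `html.parser` rather than BeautifulSoup, which likewise parses markup without executing scripts. The names `PAGE`, `ParagraphText`, and `paragraph_text` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Illustrative, trimmed copy of the page from the question (an assumption,
# not the exact file): the <p> is empty until the script runs in a browser.
PAGE = """<!DOCTYPE html>
<html>
<body>
<p id="demo"></p>
<script>
var x = document.getElementById("demo");
window.addEventListener('load', function () { x.innerHTML = "It worked"; });
</script>
</body>
</html>"""


class ParagraphText(HTMLParser):
    """Hypothetical helper that collects the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        # Script source also arrives here as data, but it is outside any <p>.
        if self.in_p:
            self.chunks.append(data)


def paragraph_text(html):
    """Return the concatenated text of all <p> elements in the static markup."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


print(repr(paragraph_text(PAGE)))  # the script is never executed, so this is ''
```

The script body is handed to the parser as plain text and is never run, which matches the blank line printed by the original script.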

Could anyone please suggest an alternative way (using BeautifulSoup) to fetch the content of a tag when that content is added by JavaScript code?

Saurabh
  • Using Selenium is one way I've done this before: http://stackoverflow.com/questions/13960326/how-can-i-parse-a-website-using-selenium-and-beautifulsoup-in-python –  Jul 13 '16 at 13:38
  • Thanks for the link, but isn't there a way to get the answer by modifying the JavaScript code, such as adding a different event listener instead of "load"? – Saurabh Jul 14 '16 at 04:30
  • I just tried using Selenium and it works! Just one more thing: a Firefox window also opens. How can I stop that? – Saurabh Jul 14 '16 at 06:19
  • If you need Selenium to run "headless" you could install and use PhantomJS as the webdriver instead of Firefox. Example: http://toddhayton.com/2015/02/03/scraping-with-python-selenium-and-phantomjs/ –  Jul 15 '16 at 11:38
  • I used it, but it does not support geolocation. – Saurabh Jul 15 '16 at 17:30
  • Ugh! The other way I've seen a "headless" selenium is a lot more work to implement (and I've never done this myself) http://www.installationpage.com/selenium/how-to-run-selenium-headless-firefox-in-ubuntu/ –  Jul 15 '16 at 18:02
  • Well, then I think I should find another way to do it. Thanks for the links. – Saurabh Jul 16 '16 at 08:29

0 Answers