
I can open https://www.xiami.com/artist/O9fc383 in a normal browser, but when I load it with ChromeDriver I get a different page source (shown below), so I can't scrape the page. This form does not appear in the source when I view the page in the browser. What can I do?

...
<form action="//www.xiami.com:443/artist/O9fc383/_____tmd_____/verify/" id="nc-verify-form" method="GET">
...
<script>
    var referrer=document.referrer;
    if (referrer && referrer.indexOf("__tmd__")===-1 ){
        localStorage.x5referer = document.referrer;
    }else{
        localStorage.x5referer = window.location.href;
    }
</script>
mikezang
  • Maybe the HTML is rendered on the client side, as in single-page applications. – Arun Ghosh Aug 31 '18 at 06:07
  • Possible duplicate of [Selenium: Browser display is different then HTML code](https://stackoverflow.com/questions/19398385/selenium-browser-display-is-different-then-html-code) – veritaS Aug 31 '18 at 06:26
  • I tried it; the scraped result is still different from what the browser shows. – mikezang Aug 31 '18 at 08:18

1 Answer


Which library do you use for scraping? If you use Beautiful Soup or urllib, you cannot scrape content generated by JavaScript. You will have to use something like Selenium (see "Scraping a JS-Rendered Page").
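As a rough illustration of that suggestion, here is a minimal sketch that loads a page with Selenium and checks whether the returned source is the verification page from the question rather than the real content. The form id `nc-verify-form` is taken from the source posted in the question; the names `VerifyFormFinder`, `is_verify_page`, and `fetch_with_selenium` are hypothetical helpers, and the sketch assumes Chrome and a matching driver are installed. It only detects the verification page; it does not get around it.

```python
from html.parser import HTMLParser


class VerifyFormFinder(HTMLParser):
    """Records whether a <form id="nc-verify-form"> start tag was seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the id comes from the question.
        if tag == "form" and dict(attrs).get("id") == "nc-verify-form":
            self.found = True


def is_verify_page(html: str) -> bool:
    """Return True if the HTML looks like the verification page, not the real one."""
    finder = VerifyFormFinder()
    finder.feed(html)
    return finder.found


def fetch_with_selenium(url: str) -> str:
    """Load url in Chrome via Selenium and return the rendered page source."""
    # Imported here so is_verify_page() can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()


# Example usage (needs Chrome and network access, so it is left commented out):
# html = fetch_with_selenium("https://www.xiami.com/artist/O9fc383")
# print("verification page" if is_verify_page(html) else "artist page")
```

If `is_verify_page` returns True, the site has served the check page to the driven browser, so the difference is not just a matter of JavaScript rendering.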

veritaS
  • I do use Selenium; that is how I got this source. My guess is that the site serves different content when the page is accessed through ChromeDriver. How can I make it behave like a normal browser? – mikezang Aug 31 '18 at 06:18
  • This post or its duplicate should help: https://stackoverflow.com/questions/19398385/selenium-browser-display-is-different-then-html-code – veritaS Aug 31 '18 at 06:26
  • Can you open `https://www.xiami.com/artist/O9fc383` in a browser, then scrape it with Selenium, and check for me whether the two sources are the same? – mikezang Aug 31 '18 at 08:20