I'm trying to scrape the price of a player from futbin.com, but I keep getting "-" back instead of the player's value, which in this case is 0.

<div class="bin_price lbin">
                        <span class="price_big_right">
                            <span id="ps-lowest-1" data-price="0">0 <img alt="c" class="coins_icon_l_bin" src="https://cdn.futbin.com/design/img/coins_bin.png"></span>
                        </span>
                        </div>

Here is what I've written in R:

library(rvest)

bon <- read_html("https://www.futbin.com/18/player/1")
html_node(bon, "span#ps-lowest-1") %>%
  html_text()

I've even tried extracting the complete span, and the value still isn't returned.

Thanks in advance, guys.

Phil
  • If you view the source for that page, you'll see that the HTML reads `-`. This differs from what the web inspector shows. I'd guess some JavaScript is dynamically altering the content to show 0. – neilfws Oct 04 '17 at 00:03
  • @neilfws any idea how to go about this? – Arthur Alex Oct 04 '17 at 01:42
  • I believe there are solutions using RSelenium and/or phantomjs, but I have never used them. Search for those terms + "scraping", see what comes up. – neilfws Oct 04 '17 at 02:13
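
To illustrate the point made in the comments, here is a minimal sketch of what rvest sees when it parses markup that has not been touched by JavaScript. The `static_html` string is a hypothetical stand-in for the raw page source (it is not copied from futbin.com); it assumes the server sends `-` as a placeholder and that the `data-price` attribute is only added later by a script.

```r
library(rvest)

# Hypothetical stand-in for the raw page source: the placeholder "-" is
# assumed to be what the server sends before any JavaScript runs.
static_html <- '<div class="bin_price lbin">
  <span class="price_big_right">
    <span id="ps-lowest-1">-</span>
  </span>
</div>'

node <- html_node(read_html(static_html), "span#ps-lowest-1")

# rvest only parses the markup it is given; it never executes scripts.
static_text <- trimws(html_text(node))

# html_attr() returns NA when the attribute is absent from the markup.
static_attr <- html_attr(node, "data-price")

print(static_text)
print(static_attr)
```

If the real page behaves like this stand-in, no CSS selector will recover the price from `read_html()` alone, which is why a browser-driven approach is suggested.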

1 Answer

With the following code, I was able to obtain a value:

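library(RSelenium)
library(rvest)

# Start a Selenium Firefox container (shell() is Windows-only; use system() elsewhere)
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')

# Connect to the Selenium server and load the page in a real browser
remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate("https://www.futbin.com/18/player/1")
Sys.sleep(5)  # give the page's JavaScript time to fill in the price

# Scroll down the page (presumably to trigger any lazily loaded content)
remDr$executeScript("scroll(0, 5000)")
remDr$executeScript("scroll(0, 15000)")

# Parse the rendered source with rvest
page_Content <- remDr$getPageSource()[[1]]
read_html(page_Content) %>% html_node("span#ps-lowest-1") %>% html_text()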

4,200,000
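
The value comes back as text with thousands separators, so it usually needs converting before it can be used in calculations. A minimal sketch of that step follows; `parse_price` is a hypothetical helper name, not part of rvest or RSelenium, and it assumes prices are non-negative whole numbers.

```r
# Hypothetical helper: turn a scraped price string into a number.
# Assumes prices are non-negative integers such as "4,200,000".
parse_price <- function(x) {
  cleaned <- gsub("[^0-9]", "", x)   # drop commas, spaces and other non-digits
  out <- rep(NA_real_, length(x))    # placeholders such as "-" stay NA
  ok <- nzchar(cleaned)
  out[ok] <- as.numeric(cleaned[ok])
  out
}

prices <- parse_price(c("4,200,000 ", "-", "0"))
print(prices)
```

Returning `NA` for the `-` placeholder keeps a page that has not finished rendering from being mistaken for a price of 0.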
Emmanuel Hamel