I am scraping an Airbnb user page with rvest.
My goal is to get the number of listings a user has (shown on the lower left-hand side of the page) as well as the link to each listing.
However, the HTML I get back does not seem to contain that content, as if Airbnb were blocking access to it, and I am a bit lost.
1) Using SelectorGadget and rvest, I have identified the node I'm interested in. Here is my entire code:
library(rvest)
# replace ... with any user id
URL <- "https://www.airbnb.com/users/show/..."
# named "page" rather than "source" to avoid masking base::source()
page <- read_html(URL)
page %>% html_nodes(".row-space-3") %>% .[[1]] %>% html_text()
And here is my (disappointing) output:
[1] "\n "
Looking at the page source in the browser, I would expect to get "Listings (2)". Here is the relevant markup:
<div class="listings row-space-2 row-space-top-4">
<h2 class="row-space-3">
Listings
<small>(2)</small>
</h2>
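As a sanity check, here is a minimal sketch that runs the same selector on that snippet pasted in as a string. I added a closing </div> so the fragment is well-formed, and the names snippet, page_from_string and heading are only for illustration. If this extracts the heading text, the selector itself is fine, and the difference must lie in what read_html(URL) actually downloads.

```r
library(rvest)

# Markup copied from the browser's page source (closing </div> added by hand)
snippet <- '<div class="listings row-space-2 row-space-top-4">
<h2 class="row-space-3">
Listings
<small>(2)</small>
</h2>
</div>'

# read_html() also accepts a literal HTML string, not just a URL
page_from_string <- read_html(snippet)

# Same selector as in the live-page attempt
heading <- page_from_string %>%
  html_nodes(".row-space-3") %>%
  .[[1]] %>%
  html_text()

# Collapse newlines and repeated spaces so the text is easier to read
heading <- trimws(gsub("\\s+", " ", heading))
print(heading)
```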
What is happening?
PS:
2) I also noticed that when I fetch the page source directly with XML and RCurl, a whole section is missing compared to the source shown in Chrome or Firefox:
library(XML)
library(RCurl)
URL = "https://www.airbnb.com/users/show/..."
parsed <- htmlParse(getURL(URL),asText=TRUE,encoding = "UTF-8")