I want to parse the HTML below and get these elements from it:
a) the p tag with class "normal_encontrado".
b) the div with class "price".
Sometimes the p tag is not present for a product. In that case, an NA should be added to the vector collecting the text from these nodes.
The idea is to have 2 vectors of the same length and then join them into a data.frame. Any ideas?
The HTML part:
<html>
<head></head>
<body>
<div class="product_price" id="product_price_186251">
<p class="normal_encontrado">
S/. 2,799.00
</p>
<div id="WC_CatalogEntryDBThumbnailDisplayJSPF_10461_div_10" class="price">
S/. 2,299.00
</div>
</div>
<div class="product_price" id="product_price_232046">
<div id="WC_CatalogEntryDBThumbnailDisplayJSPF_10461_div_10" class="price">
S/. 4,999.00
</div>
</div>
</body>
</html>
R Code:
library(rvest)
page_source <- read_html("r.html")
r.precio.antes <- page_source %>%
  html_nodes(".normal_encontrado") %>%
  html_text()
r.precio.actual <- page_source %>%
  html_nodes(".price") %>%
  html_text()
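With the code above, the first vector has 1 element and the second has 2, so they cannot be joined row by row. One direction that might work, offered as a sketch rather than a confirmed solution: select each "product_price" container first, then call html_node() (singular) inside each container. As far as I understand rvest, html_node() applied to a node set returns one result per input node, with a missing node where nothing matches, and html_text() turns that missing node into NA. The inline html string and the names products and precios are assumptions made only for illustration.

```r
library(rvest)

# Inline copy of the HTML from the question, so the sketch runs on its own
# (assumption: in practice this would be read_html("r.html"))
html <- '<html>
<head></head>
<body>
<div class="product_price" id="product_price_186251">
<p class="normal_encontrado">
S/. 2,799.00
</p>
<div id="WC_CatalogEntryDBThumbnailDisplayJSPF_10461_div_10" class="price">
S/. 2,299.00
</div>
</div>
<div class="product_price" id="product_price_232046">
<div id="WC_CatalogEntryDBThumbnailDisplayJSPF_10461_div_10" class="price">
S/. 4,999.00
</div>
</div>
</body>
</html>'

page_source <- read_html(html)

# One node per product; every later lookup is relative to these containers
products <- page_source %>% html_nodes(".product_price")

# html_node() (singular) yields exactly one result per product,
# and html_text() gives NA where the p tag is absent
r.precio.antes <- products %>%
  html_node(".normal_encontrado") %>%
  html_text() %>%
  trimws()

r.precio.actual <- products %>%
  html_node(".price") %>%
  html_text() %>%
  trimws()

# Both vectors now have one entry per product, so they line up row by row
precios <- data.frame(
  precio.antes = r.precio.antes,
  precio.actual = r.precio.actual,
  stringsAsFactors = FALSE
)
print(precios)
```

The trimws() calls are only there to drop the line breaks surrounding each price in the markup; they leave NA values untouched.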