When web scraping, I get the literal text {{price}} instead of a number. The browser shows the price as S/1800.00 (or some other number), but the page source contains {{price}} in that spot.
This happens only for precio.tarjeta; I get all the other variables correctly.
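A minimal offline sketch of the symptom, under the assumption (not confirmed) that the site fills in this price with JavaScript after the page loads, so the static HTML that read_html() receives still holds the template placeholder. The markup below is hypothetical and only mimics the class names used in my selectors:

```r
library(rvest)

# Hypothetical static markup, standing in for what read_html() receives
# before any JavaScript has run. Only the class names come from my code.
static_html <- '<div class="product">
  <span class="BestPrice">S/ 1,999.00</span>
  <span class="tOhPrice">{{price}}</span>
</div>'

page <- read_html(static_html)

# A value that is already present in the static HTML is read normally
best <- page %>% html_node(".BestPrice") %>% html_text()

# A value that is still a template placeholder comes back verbatim
card <- page %>% html_node(".tOhPrice") %>% html_text()

# Flag values that look like unrendered {{...}} placeholders
is_placeholder <- grepl("^\\{\\{.*\\}\\}$", trimws(card))
```

If that assumption holds, rvest alone would never see the rendered number, because it parses the HTML without executing scripts.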
Code:
library(rvest)
library(purrr)
library(tidyverse)

urls <- list(
  "https://www.oechsle.pe/tecnologia/televisores/?&optionOrderBy=OrderByScoreDESC&optionOrderBy=OrderByScoreDESC&O=OrderByScoreDESC&optionOrderBy=OrderByScoreDESC&page=1",
  "https://www.oechsle.pe/tecnologia/televisores/?&optionOrderBy=OrderByScoreDESC&optionOrderBy=OrderByScoreDESC&O=OrderByScoreDESC&optionOrderBy=OrderByScoreDESC&page=2"
)

h <- urls %>% map(read_html) # scrape once, parse as necessary

df <- map_dfr(
  h %>% map(~ .x %>% html_nodes("div.product")),
  ~ data.frame(
    periodo = lubridate::year(Sys.Date()),
    fecha = Sys.Date(),
    ecommerce = "oeschle",
    marca = .x %>% html_node(".brand") %>% html_text(),
    producto = .x %>% html_node(".prod-name") %>% html_text(),
    precio.antes = .x %>% html_node('.ListPrice') %>% html_text(),
    precio.actual = .x %>% html_node('.BestPrice') %>% html_text(),
    precio.tarjeta = .x %>% html_node('.tOhPrice') %>% html_text()
  )
)
Update 1:
I have also noticed that the products are duplicated: the results scraped from page 1 and page 2 contain the same products, even though the browser shows different products on each page.
Why?
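A small sketch of how the duplication can be checked, using made-up product names in place of the scraped results (the vectors and variable names here are hypothetical, not actual output from the site):

```r
# Hypothetical per-page product names, standing in for the scraped results
page_products <- list(
  page1 = c("TV A", "TV B", "TV C"),
  page2 = c("TV A", "TV B", "TV C")
)

# TRUE for each page whose products match the first page exactly
same_as_first <- vapply(
  page_products,
  function(p) identical(p, page_products[[1]]),
  logical(1)
)
all_pages_identical <- all(same_as_first)

# Product names that appear more than once across all pages
all_names <- unlist(page_products, use.names = FALSE)
repeated <- unique(all_names[duplicated(all_names)])
```

If every page comes back identical, that would suggest the static response ignores the page parameter in the URL, though I have not confirmed this.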