I am trying to scrape data from the following website

https://infogram.com/detallecasos-1h7z2l9yqgdy2ow

using the rvest package in R.

But I get an empty node set:

{xml_nodeset (0)}

I have tried to solve it using several Stack Overflow answers, but without success. I would appreciate any help.

library(rvest)

read_html("https://infogram.com/detallecasos-1h7z2l9yqgdy2ow") %>% 
  html_nodes('table') %>%
  html_table(fill = TRUE)

Expected output (shown as an image in the original post)

Rafael Díaz
  • "I attack whoever can help me." It looks like you've got a pretty funny typo or translation mistake here – camille Mar 22 '20 at 19:00
  • Does this answer your question? [rvest function html\_nodes returns {xml\_nodeset (0)}](https://stackoverflow.com/questions/51219793/rvest-function-html-nodes-returns-xml-nodeset-0) – stefan Mar 22 '20 at 19:06
  • A JSON is fetched via the API. You can use that address, just look at the network analysis in the developer tools. –  Mar 22 '20 at 19:42

1 Answer

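The page does not contain an HTML table; the data is embedded as JSON inside a script tag (window.infographicData). The following code extracts that JSON and returns the table: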

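library(rvest)
library(stringr)
library(rjson)
library(data.table)

pg <- read_html("https://infogram.com/detallecasos-1h7z2l9yqgdy2ow")

# The table is built by JavaScript, so read the JSON embedded in the script tag
Mat <- pg %>%
  html_nodes("body") %>%
  html_nodes("script:contains('window.infographicData')") %>%
  html_text() %>%
  # Grab the nested array, from the first "[[" to the last "]]"
  str_extract(string = ., pattern = "\\[\\[.*\\]\\]") %>%
  # Drop the outermost pair of brackets
  substr(x = ., start = 2, stop = nchar(.) - 1) %>%
  # Parse into a list of rows, then bind the rows into a character matrix
  fromJSON() %>%
  do.call(rbind, .)

# Use the first row as the column names
colnames(Mat) <- Mat[1, ]

# Drop the header row and the first column
DT <- as.data.table(Mat[-1, -1])
DT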
     Fecha de diagnóstico Ciudad de ubicación Atención    Edad Sexo       Tipo* País de procedencia
  1:           06/03/2020              Bogotá     casa 10 a 19    F   Importado              Italia
  2:           09/03/2020                Buga hospital 30 a 39    M   Importado              España
  3:           09/03/2020            Medellín     casa 50 a 59    F   Importado              España
  4:           11/03/2020            Medellín     casa 50 a 59    M Relacionado            Colombia
  5:           11/03/2020            Medellín     casa 20 a 29    M Relacionado            Colombia
 ---                                                                                               
227:           22/03/2020                Cali     casa 20 a 29    M   Importado      Estados Unidos
228:           22/03/2020                Cali     casa 20 a 29    M   Importado              España
229:           22/03/2020               Yopal     casa 30 a 39    F Relacionado            Colombia
230:           22/03/2020             Armenia     casa 30 a 39    F Relacionado            Colombia
231:           22/03/2020                Cali     casa 40 a 49    F Relacionado            Colombia
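
To see what the extraction steps do without hitting the network, here is a minimal sketch that runs the same regex, bracket stripping, and JSON parsing on a small made-up string. The contents of script_text are a hypothetical stand-in (the real script payload is much larger and its exact layout is an assumption), and base R's regmatches and as.data.frame are swapped in for str_extract and as.data.table to keep dependencies down.

```r
library(rjson)

# Hypothetical stand-in for the text of the <script> tag (not the real payload)
script_text <- paste0(
  'window.infographicData={"data":[[["ID","Ciudad","Sexo"],',
  '["1","Buga","M"],["2","Cali","F"]]]};'
)

# Greedy match from the first "[[" to the last "]]"
nested <- regmatches(script_text, regexpr("\\[\\[.*\\]\\]", script_text))

# Strip the outermost brackets so a single array of rows remains
rows_json <- substr(nested, 2, nchar(nested) - 1)

# Parse into a list of character vectors, then bind them into a matrix
mat <- do.call(rbind, fromJSON(rows_json))

# First row holds the headers; drop it and the first column afterwards
colnames(mat) <- mat[1, ]
df <- as.data.frame(mat[-1, -1, drop = FALSE], stringsAsFactors = FALSE)
df
```

If the structure of the embedded JSON changes, the regex and the start/stop offsets in substr would likely need adjusting, so it may be worth inspecting the matched string before parsing.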
Rafael Díaz