I am using rvest for web scraping. I have collected the list of outputs from the page, but my function fails to wrap it in a data frame. My code:

test_qry <- lapply(paste0('https://......&CurrentPage=', 1:3),
             function(url){
               c(
                hospital <- url %>% read_html() %>% 
                   html_nodes(".findcompare-results table th.fctitle a") %>% 
                   html_text(),
                Tele_no <- url %>% read_html() %>%
                   html_nodes(".findcompare-results table td p.fctel") %>%
                   html_text())
               
             })

I don't know how to create a data frame inside the function to hold all the variables. I use this function to read all the pages (3 pages in total). My output looks like this:

1 Coulsdon Dental Practice 
2 mydentist, Chipstead Valley Road, Coulsdon
3 Coulsdon Dental Clinic 
4 Ivory 
5 Crossways Dental Practice 
6 Confidental Clinic 
7 Azenabor, Ify 
8 Kerschbaumer, Andreas 
9 Paice, Andrew 
10 Kenley Dental Practice
11 Tel: 020 8668 2607
12 Tel: 02086686870
13 Tel: 020 8660 3308
14 Tel: 020 8668 2579
15 Tel: 01737 551622
16 Tel: 020 8660 8923
17 Tel: 01737 554177
18 Tel: 020 8660 0415
19 Tel: 020 8660 6565
20 Tel: 020 8668 2696

But I need two separate variables; instead, everything is written to a single list. I want something like data.frame(name, tele_no).

I used unlist(test_qry) to create the data frame.

Help me please!!!

prabhu
  • So, what is the logic for extracting the words? Will it always be the last three words, like `Just waking up`? – Harun24hr Nov 29 '16 at 16:33
  • @harun24hr No, it changes. There are 12 time intervals in total, like this: Just wake up, For breakfast, Mid-morning break, other time in morning, Lunchtime, etc. – prabhu Nov 29 '16 at 16:36
  • Can you post a screenshot of some data and the expected output? Better yet, share a sample sheet via Google Drive or Dropbox. – Harun24hr Nov 29 '16 at 16:45
  • If you want help with this, then you will need to post a more detailed description of the rules you want to follow. – Tim Williams Nov 29 '16 at 17:40

1 Answer

I suggest using the tidyverse: build one tibble() per page inside the function, then combine the pages with bind_rows() to create the data frame.

Here's a reference:

Scraping webpage with react JS in R
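
A minimal sketch of the tibble() and bind_rows() idea, under some assumptions: since the real URL is elided in the question, the three pages are replaced here by small inline HTML strings (hypothetical markup that only mimics the selectors used in the question), and scrape_page is an illustrative helper name, not something from the original code. The sample names and numbers are borrowed from the output shown in the question; pairing them by position is only for illustration.

```r
library(rvest)   # read_html(), html_nodes(), html_text(), %>%
library(tibble)  # tibble()
library(dplyr)   # bind_rows()

# Hypothetical stand-ins for the three result pages, so the sketch runs offline.
# With the real site, this would be the paste0(...) vector of URLs instead.
pages <- c(
  '<div class="findcompare-results"><table><tr>
     <th class="fctitle"><a>Coulsdon Dental Practice</a></th>
     <td><p class="fctel">Tel: 020 8668 2607</p></td>
   </tr></table></div>',
  '<div class="findcompare-results"><table><tr>
     <th class="fctitle"><a>Coulsdon Dental Clinic</a></th>
     <td><p class="fctel">Tel: 020 8660 3308</p></td>
   </tr></table></div>',
  '<div class="findcompare-results"><table><tr>
     <th class="fctitle"><a>Kenley Dental Practice</a></th>
     <td><p class="fctel">Tel: 020 8668 2696</p></td>
   </tr></table></div>'
)

# Illustrative helper: parse a page once and return one tibble with two columns.
scrape_page <- function(src) {
  page <- read_html(src)
  tibble(
    name = page %>%
      html_nodes(".findcompare-results table th.fctitle a") %>%
      html_text(trim = TRUE),
    tele_no = page %>%
      html_nodes(".findcompare-results table td p.fctel") %>%
      html_text(trim = TRUE)
  )
}

# One tibble per page, stacked row-wise into a single data frame.
result <- bind_rows(lapply(pages, scrape_page))
print(result)
```

Two points about this sketch. Each page is parsed once rather than calling read_html() for every column. Also, tibble() expects both columns to have the same length, so if some listing on a real page has no phone number, this would fail and the rows would need to be extracted per listing instead.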

ryanhnkim