
I use rvest to retrieve the titles from a Google query result. My code looks like this:

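library(rvest)

# Build the search URL and download the results page
url <- URLencode(paste0("https://www.google.com.au/search?q=", "600d"))
page <- read_html(url)

# Extract the text of every link on the page
page %>%
  html_nodes("a") %>%
  html_text()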

However, the result includes not only the titles but also the text of other links, such as:

 [24] "Past month"                                                                        
 [25] "Past year"                                                                         
 [26] "Verbatim"                                                                              
 [27] "EOS 600D - Canon"                                                                  
 [28] "Similar"                                                                           
 [29] "Canon 600D | BIG W"                                                                
 [30] "Cached"                                                                                
 [31] "Similar"   
 ......
 [45] ""                                                                                          
 [46] ""                    

What I need are [27] "EOS 600D - Canon" and [29] "Canon 600D | BIG W", which are the result titles shown on the Google results page.

Everything else is just noise for me. Could anyone please tell me how to get rid of it?

Also, if I want the description part as well, what should I do?

Feng Chen

1 Answer


To get just the titles, select the <h3> nodes rather than the <a> (link) nodes:

page %>% 
  html_nodes("h3") %>%
  html_text()

 [1] "EOS 600D - Canon"                                                   
 [2] "Canon EOS 600D - Wikipedia"                                         
 [3] "Canon 600D | BIG W"                                                 
 [4] "Canon EOS 600D Digital SLR Camera with 18-55mm IS Lens Kit ..."     
 [5] "Canon Rebel T3i / EOS 600D Review: Digital Photography Review"      
 [6] "Canon EOS 600D review - CNET"                                       
 [7] "canon eos 600d | Cameras | Gumtree Australia Free Local Classifieds"
 [8] "Images for 600d"                                                    
 [9] "Canon 600D - Snapsort"                                              
[10] "Canon EOS 600D - Georges Cameras"  
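
For the description part, one possible approach is to select each result's container first and then pull the title and the description out of each container, so the two vectors stay aligned. The following is only a sketch under stated assumptions: the inline HTML and the class names `result` and `desc` are hypothetical stand-ins, not Google's actual markup, which changes over time. Inspect the live page to find the real selectors before using this.

```r
library(rvest)

# Hypothetical markup standing in for a results page; the class names
# "result" and "desc" are assumptions, not Google's real markup.
html <- '
<html><body>
  <a href="#">Past month</a>
  <div class="result">
    <h3><a href="https://example.com/a">EOS 600D - Canon</a></h3>
    <span class="desc">First description.</span>
  </div>
  <div class="result">
    <h3><a href="https://example.com/b">Canon 600D | BIG W</a></h3>
    <span class="desc">Second description.</span>
  </div>
</body></html>'

page <- read_html(html)

# Select each result block once, then take one title and one description
# per block. html_node() (singular) returns exactly one entry per block,
# or NA when a block has no match, so the vectors keep the same length.
results      <- page %>% html_nodes("div.result")
titles       <- results %>% html_node("h3") %>% html_text()
descriptions <- results %>% html_node(".desc") %>% html_text()

data.frame(title = titles, description = descriptions,
           stringsAsFactors = FALSE)
```

Working per container rather than calling `html_nodes("h3")` and `html_nodes(".desc")` separately avoids mismatched rows when some result, such as "Images for 600d", has a title but no description.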
HubertL