-4

I'm having trouble reading the university rankings from the link below into R.

http://www.usnews.com/education/best-global-universities/rankings

I tried

readHTMLTable("http://www.usnews.com/education/best-global-universities/rankings") 

... but it doesn't work.

All I need is to read the university rankings in the middle of the page into R.

IRTFM
Joe Gu
  • Perhaps you can show the code that you have already used? – Zizouz212 Mar 27 '15 at 20:51
  • You'll need more than simply the URL. Even if you manage to read the page, you'll then need to find your table among the other text. Check out the `XML` package and some other questions here on SO, like this one: http://stackoverflow.com/questions/1395528/scraping-html-tables-into-r-data-frames-using-the-xml-package – Molx Mar 27 '15 at 20:57
  • And you'll probably want to review the site's terms and conditions. Their copyright notice requires written permission to do what you're doing. – hrbrmstr Mar 28 '15 at 01:22

1 Answer

2

As a starter:

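library(XML)
# Download and parse the rankings page
doc <- htmlParse("http://www.usnews.com/education/best-global-universities/rankings")
# Each ranking entry sits in a <div class="sep">; collect the text of its child nodes
res <- xpathApply(doc, "//div[@class='sep']", getChildrenStrings)
# Child 6 is used for name and location, child 2 for the score; strip stray whitespace and non-numeric characters
data.frame(uni = gsub("\\s\\s+", " ", gsub("[\n\t\r]", "", sapply(res, "[", 6))),
           score = as.numeric(gsub("[^0-9.]", "", sapply(res, "[", 2))))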
#                                                                               uni score
# 1                      Harvard University United States Cambridge, Massachusetts  100.0
# 2   Massachusetts Institute of Technology United States Cambridge, Massachusetts   88.9
# 3          University of California--Berkeley United States Berkeley, California   88.0
# 4                         Stanford University United States Stanford, California   85.1
# 5                                     University of Oxford United Kingdom Oxford   83.6
# 6                               University of Cambridge United Kingdom Cambridge   83.3
# 7          California Institute of Technology United States Pasadena, California   80.3
# 8    University of California--Los Angeles United States Los Angeles, California   80.1
# 9                          University of Chicago United States Chicago, Illinois   77.4
# 10                          Columbia University United States New York, New York   77.3
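
The cleanup step can be tried on its own, without fetching the page. The following is a minimal sketch in base R: the `raw_uni` and `raw_score` strings are made-up stand-ins for what `getChildrenStrings` might return, and the helpers `clean_text` and `clean_score` are hypothetical names that simply wrap the same `gsub` patterns used above.

```r
# Made-up raw strings (assumption: the scraped text contains stray
# newlines, tabs and padding similar to this)
raw_uni <- c(
  "\n\t  Harvard University \n\t   United States\n  Cambridge, Massachusetts \n",
  "\n\t  University of Oxford \n\t   United Kingdom\n  Oxford \n"
)
raw_score <- c("\n  100.0\n  Global score", "\n  83.6\n  Global score")

# Drop line breaks and tabs, collapse runs of whitespace, trim both ends
clean_text <- function(x) {
  x <- gsub("[\n\t\r]", "", x)
  x <- gsub("\\s\\s+", " ", x)
  trimws(x)
}

# Keep only digits and the decimal point, then convert to numeric
clean_score <- function(x) as.numeric(gsub("[^0-9.]", "", x))

rankings <- data.frame(uni = clean_text(raw_uni),
                       score = clean_score(raw_score),
                       stringsAsFactors = FALSE)
print(rankings)
```

Note that `clean_score` would misbehave if the raw text contained a second period or other digits, so it is worth checking the result for `NA` values on real data.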
lukeA