I was trying to import (scrape) a set of tables from a news blog post using xml2's read_html() followed by html_table(), and also XML::readHTMLTable(). I got no table or anything useful with

readHTMLTable("https://www.theabusites.com/197-nuc-approved-universities-in-nigeria-2021/",...,
               header=TRUE, stringsAsFactors=FALSE)

Note: the input is not a CSV file or an HTML tag. The call returns "unable to find an inherited method for function 'readHTMLTable' for signature '"NULL"'" and also "XML content does not seem to be XML". What can I do? rvest also fails, with "Error in open.connection(x, "rb") : Couldn't connect to server". What is causing this error message? Thanks in advance.

  • Are you trying to scrape a web page as XML? That might be problematic. But without details of both your code and your input data, it's going to be difficult to help you... – Limey Oct 10 '21 at 18:07
  • The code in the duplicate question seems to do the trick: `library(xml2) ; library(rvest) ; url <- "https://www.theabusites.com/197-nuc-approved-universities-in-nigeria-2021/" ; page <- read_html(url) ; tables <- html_table(page, fill = TRUE)` – user20650 Oct 10 '21 at 20:03
  • @Limey, right. I was trying to scrape a web page and it contains multiple tables. – Ibiloye Abiodun Christian Oct 12 '21 at 09:25
  • Thank you @user20650, the trick didn't actually work; it warned "no XML content found". I had to save the HTML page to my local computer (C: drive) and read it from there. Scraping directly from the online source is really problematic: many adverts, menu tab addresses, and empty tables were what I got. – Ibiloye Abiodun Christian Oct 12 '21 at 09:35
  • @IbiloyeAbiodunChristian: just tried it again and can confirm it loads 14 tables into a list (14 because each of the tables on the page is split) – user20650 Oct 12 '21 at 10:17
  • Thanks @user20650, it works pretty well. I checked the network and re-ran it on another PC, a laptop, and got all the tables. I also used `nuc.data <- url("https://www.theabusites.com/197-nuc-approved-universities-in-nigeria-2021/", "rb"); page <- read_html(nuc.data); page`. Thanks so much – Ibiloye Abiodun Christian Oct 12 '21 at 15:31
  • good stuff; glad you got it working – user20650 Oct 12 '21 at 15:36
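
The workaround described in the comments (saving the page to a local file, reading it from there, and discarding the empty tables) can be sketched roughly as follows. This is only an illustration under assumptions: the helper names `read_page_via_file` and `non_empty_tables` are hypothetical and not from the thread, and the inline HTML at the end is made-up data so the table-filtering step can be tried without a network connection.

```r
library(rvest)  # provides html_table() and re-exports read_html() from xml2

# Hypothetical helper: save the page to a temporary file, then parse the copy.
# This mirrors the "save the HTML page locally and read it from there" step.
read_page_via_file <- function(page_url) {
  tmp <- tempfile(fileext = ".html")
  download.file(page_url, destfile = tmp, mode = "wb", quiet = TRUE)
  read_html(tmp)
}

# Hypothetical helper: extract all tables and keep only those with data rows.
non_empty_tables <- function(page) {
  tables <- html_table(page)
  Filter(function(tbl) nrow(tbl) > 0, tables)
}

# Offline demonstration on an inline snippet (made-up data): one table with a
# data row and one header-only table that should be filtered out.
demo_page <- read_html('<html><body>
<table>
  <tr><th>Name</th><th>Year</th></tr>
  <tr><td>Example University</td><td>1948</td></tr>
</table>
<table><tr><th>Advert</th></tr></table>
</body></html>')
demo_tables <- non_empty_tables(demo_page)
```

With a working connection, `non_empty_tables(read_page_via_file(url))` would apply the same steps to the live page. A "Couldn't connect to server" error at the download step would point to the network or machine rather than to the parsing code, which is consistent with the thread, where the same code worked on a second PC.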
