Error in HTML parsing in R

Question

I am getting the following error when extracting web data from Cricinfo:

> #Set internet
> Setinternet2=TRUE
> 
> #Loading Libraries
> library(XML)
> library(tm)
> library(RCurl)
> 
> #URL
> URL="http://stats.espncricinfo.com/ci/engine/records/batting/most_runs_career.html?class=1;id=2010;type=year"
> 
> #HTML parsing
> List=htmlParse(URL)
Error in htmlParse(URL) : 
  error in creating parser for http://stats.espncricinfo.com/ci/engine/records/batting/most_runs_career.html?class=1;id=2010;type=year
>

Any idea how to solve this?

Works for me. When I say 'works', I mean I get a whole bunch of htmlEntityParseRef messages, tag mismatches, and a bunch of other warnings. Do the examples in help(htmlParse) work? — Spacedman, Jun 29 '11 at 12:10

score 2 · Answer 1 · edited May 23 '17 at 12:27

Try downloading the page with getURL first, then parsing the returned text:

page <- getURL(URL)
htmlParse(page, asText = TRUE)

You may need options in the call to getURL as described in my answer to your other question.
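
As a rough sketch of the two-step approach above (download, then parse), something like the following may help. The helper name fetch_and_parse and the inline sample_html document are made up for illustration and are not part of the original answer; the parsing step is checked on the inline string so it can be tried without network access.

```r
library(XML)
library(RCurl)

# Hypothetical helper (an assumption, not from the original answer):
# download the page as text first, then hand that text to the parser.
fetch_and_parse <- function(url, ...) {
  # Extra arguments are passed to getURL as curl options,
  # e.g. followlocation = TRUE or useragent = "..."
  page <- getURL(url, ...)
  htmlParse(page, asText = TRUE)
}

# Offline check of the parsing step using a small made-up document
sample_html <- paste0(
  "<html><body><table>",
  "<tr><th>Player</th><th>Runs</th></tr>",
  "<tr><td>A</td><td>100</td></tr>",
  "</table></body></html>"
)
doc <- htmlParse(sample_html, asText = TRUE)

# Pull the text of every table cell with an XPath query
cells <- xpathSApply(doc, "//td", xmlValue)
```

With a live connection, the equivalent call would be doc <- fetch_and_parse(URL), after which readHTMLTable(doc) may be worth trying for pulling out the statistics tables. Whether the site needs extra curl options is not something this sketch can confirm.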

answered Jun 29 '11 at 14:17

Richie Cotton
