I am getting the following error when extracting web data from Cricinfo:

> #Set internet
> Setinternet2=TRUE
> 
> #Loading Libraries
> library(XML)
> library(tm)
> library(RCurl)
> 
> #URL
> URL="http://stats.espncricinfo.com/ci/engine/records/batting/most_runs_career.html?class=1;id=2010;type=year"
> 
> #HTML parsing
> List=htmlParse(URL)
Error in htmlParse(URL) : 
  error in creating parser for http://stats.espncricinfo.com/ci/engine/records/batting/most_runs_career.html?class=1;id=2010;type=year
> 

Any idea how to solve this?

Joshua Ulrich
AVSuresh
  • Works for me. When I say 'works', I mean I get a whole bunch of htmlEntityParseRef messages, tag mismatches, and a bunch of other warnings. Do the examples in help(htmlParse) work? – Spacedman Jun 29 '11 at 12:10

1 Answer

Try

page <- getURL(URL)
htmlParse(page)

You may need options in the call to getURL as described in my answer to your other question.
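
As a rough sketch of the two-step approach above: download the page with getURL, then hand the text to htmlParse with asText = TRUE so it is treated as document content rather than as a file name or URL. The helper names parse_html_text and fetch_and_parse, and the followlocation and useragent options, are assumptions for illustration only; the options your connection actually needs (a proxy, for example) may differ.

```r
library(XML)
library(RCurl)

# Parse HTML that is already held in memory as a character string.
# asText = TRUE tells htmlParse the argument is content, not a file or URL.
# (parse_html_text is a hypothetical helper name.)
parse_html_text <- function(html_text) {
  htmlParse(html_text, asText = TRUE)
}

# Download the page with RCurl, then parse the returned text.
# The curl options below are assumptions; adjust them for your network.
# (fetch_and_parse is a hypothetical helper name.)
fetch_and_parse <- function(url) {
  page <- getURL(url, followlocation = TRUE, useragent = "Mozilla/5.0")
  parse_html_text(page)
}

# Offline demonstration of the parsing step on a tiny inline document.
sample_html <- "<html><body><table><tr><td>A</td><td>100</td></tr></table></body></html>"
doc <- parse_html_text(sample_html)
cells <- xpathSApply(doc, "//td", xmlValue)

# With network access, the same steps apply to the URL from the question:
# List <- fetch_and_parse(URL)
# tables <- readHTMLTable(List)
```

Separating the download from the parsing also makes it easier to see which step fails: if getURL itself errors, the problem is the connection rather than the parser.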

Community
Richie Cotton