library(rvest)
url <- "http://bet.hkjc.com/racing/pages/odds_wp.aspx?date=14-12-2016&venue=HV&raceno=1&lang=en"

R1odds <- url %>% read_html() %>%
  html_nodes("table") %>%
  .[[2]] %>%
  html_table(fill=TRUE)
R1odds

I got this error message:

Error: input conversion failed due to input error, bytes 0x3C 0x2F 0x6E 0x6F [6003]

How can I solve this?

Yuen Wa Ho
    You may want to take a look at the following [link](https://github.com/hadley/rvest/issues/117). This is in a comment since I don't know what the correct output should be to give an answer. The link suggests you use the `httr` package and something like the following code: `x <- content(GET(url), "raw"); guess_encoding(x)`; this code returns a list of encodings, one of which is `ISO-8859-1`. So change `read_html()` to `read_html(encoding = "ISO-8859-1")`. – steveb Dec 14 '16 at 08:48
  • Thanks. It turns out that rvest returns `character(0)`. The page is ASP.NET (.aspx), so I can't scrape the data from it. Any suggestions on how to scrape an .aspx page? – Yuen Wa Ho Dec 14 '16 at 09:10
  • No, not at this point. – steveb Dec 14 '16 at 09:12

1 Answer

For others who might run into something like this in a non-gambling context, here's a solution that gets around the nulls in the response. You'll have to deal with your gambling data issue on your own:

library(rvest)
library(curl)

url <- "http://bet.hkjc.com/racing/pages/odds_wp.aspx?date=14-12-2016&venue=HV&raceno=1&lang=en"

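# Fetch the raw response bytes instead of letting read_html() download the page
pg <- curl_fetch_memory(url)

# Read the raw bytes as a character string, then parse it as HTML
doc <- pg$content %>%
  readBin(what = character()) %>%
  read_html()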

html_nodes(doc, "table")
## {xml_nodeset (47)}
##  [1] <table width="776" border="0" cellspacing="0" cellpadding="0">\n  < ...
##  [2] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
##  [3] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
##  [4] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
##  [5] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
##  [6] <table width="776" border="0" cellspacing="0" cellpadding="0">\n  < ...
##  [7] <table width="776" border="0" cellspacing="0" cellpadding="0">\n  < ...
##  [8] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
##  [9] <table width="100%" border="0" cellspacing="0" cellpadding="0" clas ...
## [10] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [11] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [12] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [13] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [14] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [15] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [16] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [17] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [18] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [19] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## [20] <table width="100%" border="0" cellspacing="0" cellpadding="0">\n   ...
## ...

It's likely the table you need is in there.
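
If the null bytes sit in the middle of a response rather than at the end, one option is to drop them from the raw vector before converting it to text. With its default `n = 1`, `readBin(what = character())` reads a single null-terminated string, so it stops at the first null byte. The following is a minimal base R sketch under that assumption; `raw_body` is a made-up byte vector standing in for `pg$content`, not data from the site:

```r
# Hypothetical stand-in for pg$content: HTML bytes with one embedded NUL byte
raw_body <- c(
  charToRaw("<table><tr><td>1"),
  as.raw(0),
  charToRaw("</td></tr></table>")
)

# Keep every byte that is not NUL, then convert the rest to a single string.
# (rawToChar() raises an error if the raw vector still contains a NUL.)
clean <- rawToChar(raw_body[raw_body != as.raw(0)])
clean
```

The resulting string can then be passed to `read_html()` in the same way as in the code above.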

This code works for this site as it is, but on other sites you may also need to pipe the data through iconv() to deal with other encoding issues.
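
As a rough illustration of the iconv() step, the sketch below converts text from ISO-8859-1 (the encoding suggested in the comments on the question) to UTF-8. The bytes in `latin1_bytes` are made up for the example; with a real page you would first need to determine its actual encoding:

```r
# Illustrative data only: the bytes of "café" in ISO-8859-1 (0xE9 is "é")
latin1_bytes <- as.raw(c(0x63, 0x61, 0x66, 0xE9))

# Turn the bytes into a string, then re-encode it as UTF-8
txt <- rawToChar(latin1_bytes)
utf8_txt <- iconv(txt, from = "ISO-8859-1", to = "UTF-8")
utf8_txt
```

In the pipeline above, the equivalent step would sit between `readBin()` and `read_html()`, with `from` set to whatever encoding the page really uses.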

hrbrmstr