I'm trying to download a page with httpclient and parse it using oga (https://github.com/YorickPeterse/oga).

My program looks like this:

require 'httpclient'
require 'oga'
url = 'http://stackoverflow.com/questions/1496096/is-there-a-limit-to-the-length-of-html-attributes'
c = HTTPClient.new
content = c.get_content(url)
document = Oga.parse_html(content)

I get this error:

LL::ParserError: Unexpected end of input, expected element closing tag instead on line 431
  parser_error at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:255
each_token at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:231
     parse at org/libll/Driver.java:303
     parse at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:262
parse_html at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/oga.rb:25
    (root) at test.rb:12

I verified that httpclient downloads the page correctly and that the file doesn't end at the reported line. I also tried other links; some work, but most of them give me this error. In general, smaller pages seem to work just fine.
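To narrow this down, the lines of the downloaded content around the line number in the error message can be inspected. The following is only a minimal sketch: `context_around_error` is a hypothetical helper (not part of oga or httpclient), and it assumes the error message contains the text "on line N", as it does in the trace above.

```ruby
# Hypothetical helper: extracts the line number from a parser error message
# of the form "... on line N" and returns the surrounding lines of the
# content, each prefixed with its line number.
def context_around_error(content, message, radius = 2)
  match = message.match(/on line (\d+)/)
  return [] unless match # message carries no line number

  line_no = match[1].to_i
  lines = content.lines
  first = [line_no - radius, 1].max
  last = [line_no + radius, lines.length].min

  # An empty range (reported line beyond the end of the content) yields [].
  (first..last).map { |n| format('%5d: %s', n, lines[n - 1].chomp) }
end

# Possible use with the program above (needs the gems and network access):
#   begin
#     Oga.parse_html(content)
#   rescue LL::ParserError => e
#     puts context_around_error(content, e.message)
#   end
```

If the helper returns an empty array for the reported line, the parser is counting lines differently from `String#lines`, or the content is shorter than the error suggests.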

Is there a problem with the library or am I making an error?

pinpox

0 Answers