Code:

require 'anemone'
Anemone.crawl("http://www.example.com/") do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end

When I run this code I expect a list of all the URLs on that website, but all I get is just the name of the website. What could the error be, and how do I get a list of all the URLs?

  • It works fine. Obviously, if you do that on example.com, it will just display http://www.example.com, since that is the only page. – tomferon Sep 04 '12 at 09:09

1 Answer


I guess Anemone just can't follow redirects or something like that, because "http://example.com" redirects me to another site. Have you tried crawling other sites? http://stackoverflow.com, for example.

  • This was just a proxy error; after setting the proxy in the terminal it works fine. Is there any way to include the proxy settings inside the script itself? – Anu11 Sep 07 '12 at 11:28
  • 1
    Sure, Anemone.crawl(url, {:proxy_host => 'your proxy host', :proxy_port => 'your proxy port'}) – railscard Sep 07 '12 at 11:38
  • 1
    require 'anemone' Anemone.crawl("http://www.stackoverflow.com/") do |anemone| {:proxy_host => 'proxy.xyz.com', :proxy_port => '9999'} anemone.on_every_page do |page| puts page.url end end tried this tto but it gives only the website name. – Anu11 Sep 07 '12 at 11:49
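
Putting the two comments above together: the options hash has to be passed as the second argument to Anemone.crawl, as railscard's comment shows, not left as a bare hash on its own line inside the block (there it is just evaluated and thrown away, so the crawler never sees it). A minimal corrected sketch, assuming proxy.xyz.com and '9999' are placeholders for your actual proxy host and port:

require 'anemone'

# Pass the proxy options to Anemone.crawl itself; a bare hash inside the
# block does nothing. proxy.xyz.com / '9999' are placeholder values.
Anemone.crawl("http://www.stackoverflow.com/",
              :proxy_host => 'proxy.xyz.com',
              :proxy_port => '9999') do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end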