I'm using Anemone to spider a domain and it works fine.
The code to initiate the crawl looks like this:
require 'anemone'

Anemone.crawl("http://www.example.com/") do |anemone|
  anemone.on_every_page do |page|
    puts page.url
  end
end
This very nicely prints out all the page URLs for the domain, like so:
http://www.example.com/
http://www.example.com/about
http://www.example.com/articles
http://www.example.com/articles/article_01
http://www.example.com/contact
What I would like to do is create an array of key/value pairs, using the last part of the URL as the key and the URL minus the domain as the value.
E.g.
[
  ['', '/'],
  ['about', '/about'],
  ['articles', '/articles'],
  ['article_01', '/articles/article_01']
]
Apologies if this is rudimentary stuff, but I'm a Ruby novice.