I am using Anemone. How do I crawl subdomains too? For example, if my website is www.abc.com, my crawler should also crawl support.abc.com or blah.abc.com. I am using Ruby 1.8.7 and Rails 3.
3

David J.

Bhushan Lodha
Why is this a Rails or Nokogiri question? – the Tin Man Feb 15 '12 at 18:14
I removed the rails and nokogiri tags: they are not central to this question. – David J. Jun 21 '12 at 16:35
2 Answers
4
Here is a commit on GitHub that solves your problem:
https://github.com/runa/anemone/commit/91559bde052956cfc40ae62678ec2a61574cf928
Apply the changes from that commit to your local copy of the anemone gem files.

sunnyrjuneja
-2
According to the Anemone docs you can pass multiple sites into the crawl command as an array of URLs (the second argument is reserved for options):
Anemone.crawl(["http://www.abc.com/", "http://support.abc.com/", "http://blah.abc.com/"])
Of course, your next problem will probably be ABC banning you for crawling their site, but that's a different question.

the Tin Man
If you don't know the subdomains you will have to try to locate them by searching through the links retrieved from the first page, looking for other sites that are sub-domains, or that appear to be sibling-domains, of the starting one. Then spawn secondary crawls. – the Tin Man Feb 17 '12 at 18:57
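The link-filtering step described in that comment could look roughly like the sketch below. It uses only Ruby's standard `uri` library. The helper name `subdomain_hosts` and its parameters are hypothetical and not part of Anemone. It assumes you have already collected the raw `href` values from the first page and that you know the root domain (here `abc.com`).

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone): given raw href strings taken
# from a page at base_url, return the unique hosts that are root_domain
# itself or one of its subdomains.
def subdomain_hosts(hrefs, base_url, root_domain)
  hosts = hrefs.map do |href|
    begin
      # Resolve relative links against the page URL, then take the host.
      host = URI.join(base_url, href).host
      host && host.downcase
    rescue URI::Error
      nil # skip malformed links
    end
  end

  hosts.compact.uniq.select do |host|
    # The leading dot prevents "notabc.com" from matching "abc.com".
    host == root_domain || host.end_with?(".#{root_domain}")
  end
end

hrefs = [
  "/about",
  "http://support.abc.com/faq",
  "http://blah.abc.com/",
  "http://other.com/",
  "http://notabc.com/",
  "mailto:someone@abc.com"
]

puts subdomain_hosts(hrefs, "http://www.abc.com/", "abc.com").inspect
```

Each host returned could then be turned back into a URL and used to start a secondary crawl, for example by passing the list to another `Anemone.crawl` call. How you collect the `href` values and whether you treat sibling domains as in scope are left open here.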