I'm trying to implement a tool in Ruby on Rails that crawls a website and searches for hyperlinks. There is a problem: if the website has a huge number of links, the user has to wait a long time.

This is probably a naive question: how can I show some results (for example, 10) while the crawling process is still running?

Then the user clicks "Next" and it shows the next 10 links, and so on.

Hugo Sousa

1 Answer

Imagine that a page has a list of links.

  1. Implement an action in your controller that, given a position in the links list, returns the next 10 links as JSON.
  2. With JavaScript, call that action with position zero, parse the JSON, and display the links on the screen.
  3. Repeat step 2, passing the number of links received so far as the position parameter of the ajax call, until it returns zero links.
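The steps above can be sketched in plain Ruby. This is only an illustration under assumptions: `links_page`, its `position` and `per_page` parameters, and the `next_position` key are hypothetical names, and the loop at the bottom stands in for the JavaScript client that would call the controller action.

```ruby
require "json"

# Hypothetical helper: returns up to `per_page` links starting at `position`,
# plus the position the client should ask for next.
def links_page(links, position, per_page = 10)
  batch = links[position, per_page] || [] # nil when position is past the end
  { links: batch, next_position: position + batch.size }
end

# In a Rails controller the hash could be returned with something like
#   render json: links_page(all_links, params[:position].to_i)
# Below, a plain loop simulates the client side of steps 2 and 3.
all_links = (1..23).map { |i| "http://example.com/page#{i}" } # made-up data
position = 0
loop do
  page = JSON.parse(links_page(all_links, position).to_json)
  break if page["links"].empty?   # step 3: stop when zero links come back
  puts page["links"]              # step 2: display the batch
  position = page["next_position"]
end
```

Returning the next position together with the links keeps the client simple: it only has to echo that value back on the following call.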

This will be far more efficient if you fetch all the links on a page in one call, show them to the user, and then repeat the process for each linked page. Like the following:

  1. For a given page, add an action that returns all the links it has as JSON.
  2. Make an ajax call to that action, display the returned links, and then use each of them as the parameter for a further call, to crawl those pages.
  3. Do this until there are no more links. Keep a blacklist of already-visited links to avoid cycles.
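A minimal sketch of this page-by-page loop, assuming a hypothetical `crawl` method and a `fetch_links` callable that returns the links found on one page. A small in-memory hash stands in for real HTTP requests, and the `max_pages` limit is an added safety assumption, not part of the steps above.

```ruby
require "set"

# Hypothetical crawler. `fetch_links` is any callable that takes a URL and
# returns the links on that page (in a Rails app, the JSON action could
# play this role). The block receives each page's links as soon as they
# are known, so they can be displayed before the crawl finishes.
def crawl(start_url, fetch_links, max_pages: 100)
  visited = Set.new   # already-crawled pages, used to avoid cycles
  queue = [start_url]
  until queue.empty? || visited.size >= max_pages
    url = queue.shift
    next unless visited.add?(url)   # add? returns nil if url was present
    links = fetch_links.call(url)
    yield url, links if block_given?
    queue.concat(links.reject { |link| visited.include?(link) })
  end
  visited.to_a
end

# Stand-in for a real site: "a" and "b" link to each other (a cycle).
site = { "a" => %w[b c], "b" => %w[a], "c" => [] }
fetcher = ->(url) { site.fetch(url, []) }
crawl("a", fetcher) { |url, links| puts "#{url}: #{links.join(', ')}" }
```

The visited set is what keeps the cycle between "a" and "b" from looping forever; each page is fetched at most once.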

If the ajax part is unclear, check the definition of ajax on Wikipedia and this question

Community
fotanus