
I want to scrape a website that uses pagination.

For example: http://somesite.com/page/

I want to scrape every post on every page.

Page 1 (page/1), for example, contains about 5 posts.

How can I scrape the data from each page, all the way to the last page?

I've searched around and found 2 similar questions, but I'm still confused.

here >>

first way

second way

Any idea how to combine them?

Thanks in advance.


1 Answer


Do you have to use the Mechanize gem? I'd strongly recommend Nokogiri instead. It's simple and easy to use.

You can write a loop that fetches the pages one by one and stops when a page can't be found.

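require 'open-uri'
require 'nokogiri'

pages_count = 1
loop do
    begin
        # URI.open raises OpenURI::HTTPError (e.g. on a 404) once we go past the last page
        @html = Nokogiri::HTML(URI.open("http://somepage.com/#{pages_count}"))
    rescue OpenURI::HTTPError
        break
    end
    # ... scrape the posts from @html here
    pages_count += 1
end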
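
One way to make the "stop when you can't find the page" condition explicit is sketched below. It is only a sketch under a few assumptions: the helper names `fetch_page` and `scrape_all`, the `fetch:` and `max_pages:` parameters, and the `.post h2` selector in the usage comment are all made up for illustration, and the site is assumed to answer with an HTTP error (such as 404) or a page without posts once you run past the last page.

```ruby
require 'open-uri'

# Hypothetical helper: download one page and return its HTML,
# or nil when the server answers with an HTTP error (e.g. 404).
def fetch_page(url)
  URI.open(url).read
rescue OpenURI::HTTPError
  nil
end

# Hypothetical helper: visit base_url + "1", base_url + "2", ... and collect
# whatever the block extracts from each page. Stops at the first missing page
# or the first page with no posts. max_pages is a safety cap (an assumption).
def scrape_all(base_url, fetch: method(:fetch_page), max_pages: 1000)
  posts = []
  (1..max_pages).each do |page|
    html = fetch.call("#{base_url}#{page}")
    break if html.nil?          # page not found: we are past the end

    found = yield(html)         # the block returns an array of posts
    break if found.empty?       # some sites return an empty page instead of a 404

    posts.concat(found)
  end
  posts
end

# Possible usage with Nokogiri (the CSS selector is an assumption about the site):
#
#   require 'nokogiri'
#   titles = scrape_all('http://somesite.com/page/') do |html|
#     Nokogiri::HTML(html).css('.post h2').map(&:text)
#   end
```

Keeping the fetching separate from the parsing means the loop can be tried out with a fake `fetch:` lambda before pointing it at the real site.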