I am extracting data from a forum. My script based on is working fine. Now I need to extract date and time (21 Dec 2009, 20:39) from single post. I cannot get it work. I used FireXPath to determine the xpath.
Sample code:
require 'rubygems'
require 'mechanize'
post_agent = WWW::Mechanize.new
post_page = post_agent.get('http://www.vbulletin.org/forum/showthread.php?t=230708')
puts post_page.parser.xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip
puts post_page.parser.at_xpath('/html/body/div/div/div/div/div/table/tbody/tr/td/div[2]/text()').to_s.strip
puts post_page.parser.xpath('//[@id="post1960370"]/tbody/tr[1]/td/div[2]/text()')
all my attempts end with empty string or an error.
I cannot find any documentation on using Nokogiri within Mechanize. The Mechanize documentation says at the bottom of the page:
After you have used Mechanize to navigate to the page that you need to scrape, then scrape it using Nokogiri methods.
But what methods? Where can I read about them with samples and explained syntax? I did not find anything on Nokogiri's site either.