Questions tagged [mechanize-ruby]

The Ruby library for automating interaction with websites.

The Mechanize library is used for automating interaction with websites. Mechanize automatically stores and sends cookies, follows redirects, and can follow links and submit forms. Form fields can be populated before a form is submitted. Mechanize also keeps a history of the sites you have visited.

193 questions
13
votes
1 answer

How to let Ruby Mechanize get a page which lives in a string

Generally Mechanize will get a webpage from a URL and the result of the get method is a Mechanize::Page object, from which you can use a lot of useful methods. If the page lives in a string, how do I get the same Mechanize::Page object? require…
Just a learner
  • 26,690
  • 50
  • 155
  • 234
13
votes
1 answer

Catching Mechanize 404 => Net::HTTPNotFound

I wrote a simple function which handles fetching of the URL: def tender_page_get url, agent sleep(rand(6)+2) begin return agent.get(url).parser rescue Errno::ETIMEDOUT, Timeout::Error, Net::HTTPNotFound EYE.debug "--winter sleep #{url}" …
spacemonkey
  • 19,664
  • 14
  • 42
  • 62
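
A note on the rescue clause in the excerpt above: `Net::HTTPNotFound` is a response class from Ruby's standard library, not an exception, so listing it in `rescue` can never match anything. Mechanize signals non-success status codes by raising `Mechanize::ResponseCodeError`, whose `response_code` is a string such as `"404"`. The stdlib-only sketch below just checks the class hierarchy; the helper name `rescuable?` is made up for illustration and is not part of any library.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical helper: can this class ever match in a rescue clause?
# Raised objects are always Exception instances, so a class outside
# the Exception hierarchy can never match.
def rescuable?(klass)
  klass.is_a?(Class) && klass <= Exception ? true : false
end

puts rescuable?(Net::HTTPNotFound)  # response class, not an exception
puts rescuable?(Timeout::Error)     # a real exception class
puts rescuable?(Errno::ETIMEDOUT)   # a real exception class
```

With Mechanize loaded, the clause would presumably become `rescue Mechanize::ResponseCodeError => e`, followed by a check of `e.response_code`.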
7
votes
2 answers

How to prevent error "code converter not found (UTF-8)"?

I'm getting this error in my production environment (CentOS 5.6), but it runs fine in development (Ubuntu 11.04). In both environments, the app is using Ruby 1.9.3 and Rails 3.0.9 and is served with passenger and nginx. My mechanize gem version is…
dgmdan
  • 405
  • 6
  • 14
7
votes
2 answers

I can't remove whitespaces from a string parsed by Nokogiri

I can't remove whitespaces from a string. My HTML is:

Cena pro Vás: 139 

My code is: #encoding: utf-8 require 'rubygems' require 'mechanize' agent = Mechanize.new site =…
A.D.
  • 4,487
  • 3
  • 38
  • 50
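
One common cause of this symptom, assuming the character after the price is a non-breaking space (U+00A0) rather than a plain space: `String#strip` and `/\s/` only cover ASCII whitespace, so that character survives. A minimal sketch, with a hypothetical sample string standing in for the parsed node text:

```ruby
# Hypothetical sample: the price text followed by a non-breaking space (U+00A0).
text = "Cena pro V\u00E1s: 139\u00A0"

# strip only removes ASCII whitespace, so the non-breaking space stays.
stripped = text.strip

# The POSIX bracket class [[:space:]] is Unicode-aware and also matches U+00A0.
cleaned = text.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')

puts stripped.end_with?("\u00A0")
puts cleaned
```

If the stray character turns out to be something else, inspecting `text.codepoints` should show what it actually is.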
7
votes
2 answers

getaddrinfo error with Mechanize

I wrote a script that will go through all of the customers in our database, verify that their website URL works, and try to find a Twitter link on their homepage. We have a little over 10,000 URLs to verify. After a fraction of the URLs are…
EricM
  • 264
  • 2
  • 13
5
votes
5 answers

SelectList with Mechanize in Ruby

I'm trying to set the value of a select list using Mechanize with Ruby. I can navigate to the page with the select list, grab the form using the .form method, and find the select list. report_form = page.form('form1') pp report_form.field_with(:name…
DNadel
  • 495
  • 1
  • 5
  • 13
5
votes
2 answers

Regulating / rate limiting ruby mechanize

I need to regulate how often a Mechanize instance connects with an API (once every 2 seconds, so limit connections to that or more). So this: instance.pre_connect_hooks << Proc.new { sleep 2 } I had thought this would work, and it sort of does BUT…
blueblank
  • 4,724
  • 9
  • 48
  • 73
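
The excerpt is cut off before it says what goes wrong, so this is only a sketch of one alternative to an unconditional `sleep 2`: remember when the last request was allowed through and sleep only for whatever remains of the interval. `Throttle` and `min_interval` are hypothetical names, not part of Mechanize.

```ruby
# Hypothetical helper: enforces a minimum gap between successive calls to #wait.
class Throttle
  def initialize(min_interval)
    @min_interval = min_interval
    @last = nil
    @lock = Mutex.new
  end

  # Sleeps only as long as needed so that calls are at least
  # min_interval seconds apart, then records the current time.
  # Returns the number of seconds it decided to sleep.
  def wait
    @lock.synchronize do
      now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      pause = @last ? [@min_interval - (now - @last), 0].max : 0
      sleep(pause) if pause > 0
      @last = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      pause
    end
  end
end

throttle = Throttle.new(2)
# With Mechanize (assumed usage, not run here):
#   agent.pre_connect_hooks << proc { throttle.wait }
```

Because the pause is computed from elapsed time, a request that follows slow page processing is not delayed by a further full two seconds.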
5
votes
1 answer

Ruby Mechanize: Follow a Link

In Mechanize on Ruby, I have to assign a new variable to every new page I come to. For example: page2 = page1.link_with(:text => "Continue").click page3 = page2.link_with(:text => "About").click ...etc Is there a way to run Mechanize without…
themirror
  • 9,963
  • 7
  • 46
  • 79
5
votes
2 answers

Ruby Mechanize table scraping doesn't capture entire row

I am trying to scrape a table on a website with Mechanize. I want to scrape the second row. When I run: agent.page.search('table.ea').search('tr')[-2].search('td').map{ |n| n.text } I would expect it to scrape the whole row. But instead it only scrapes:…
Rails beginner
  • 14,321
  • 35
  • 137
  • 257
5
votes
1 answer

How to scrape a website that requires login first with ruby Mechanize gem

I was trying to learn how to use the Ruby Mechanize gem, and I was able to fill in the form and log in to the website. But I was not able to extract the data after logging in. Basically, that website displays data only after logging in; otherwise it shows…
Atchyut Nagabhairava
  • 1,295
  • 3
  • 16
  • 23
5
votes
3 answers

Use a Login form with Mechanize

I know there are very similar posts to this on Stack Overflow but I still can't seem to figure out what is wrong with my attempt. # login to the site mech.get(base_URL) do |page| l = page.form_with(:action => "/site/login/") do |f| …
Zach
  • 885
  • 2
  • 8
  • 27
5
votes
3 answers

Clicking link with JavaScript in Mechanize

I have this: Account Summary I want to click that link but I get an error when using link_to. I've tried: bot.click(page.link_with(:href =>…
user1198316
  • 267
  • 2
  • 3
  • 12
4
votes
1 answer

How can I get Mechanize objects from Mechanize::Page's search method?

I'm trying to scrape a site where I can only rely on classes and element hierarchy to find the right nodes. But using Mechanize::Page#search returns Nokogiri::XML::Elements which I can't use to fill and submit forms etc. I'd really like to use pure…
raphinesse
  • 19,068
  • 6
  • 39
  • 48
4
votes
2 answers

`sysread': Interrupted system call (Errno::EINTR) When using Ruby and mysql

I'm scraping a site with Mechanize and pushing to a MySQL db. I am getting these sysread errors a lot and I'm not sure what the solution is. I'm using the Ruby-mysql gem.
user491880
  • 4,709
  • 4
  • 28
  • 49
4
votes
1 answer

Ruby Mechanize 404 => Net::HTTPNotFound

I have a URL that I can't access with Mechanize and I don't know why: # Use ruby 2.1.6 require 'mechanize' require 'axlsx' # 2.0.1 require 'roo' # 1.13.2 mechanize = Mechanize.new mechanize.request_headers = { "Accept-Encoding" => ""…
Ismael Bourg
  • 197
  • 11