
I'm not sure if this should be asked on ServerFault or here as it's a server problem that's specific to Mechanize and Nokogiri in a Rails 3.2.3 application.

I have a rake task that scrapes a latitude and longitude from one of our service providers' websites.

I have set the task up in 'crontab -e' along with the other tasks. For some reason, on two supposedly identical servers, one of the servers fails to complete the rake task with the following error:

X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=root>
Message-Id: <20120410134631.2CFA624B76@localhost>
Date: Tue, 10 Apr 2012 14:46:30 +0100 (BST)

rake aborted!
/var/www/railsapp/lib/tasks/peoplesafelocation.rake:29: undefined (?...) sequence: /new GLatLng\(\s*(?<lat>.+?)\s*,\s*(?<long>.+?)\s*\)/

Both servers are running Rails 3.2.3, Ruby 1.9.2.

I can't understand why it would fail with 'undefined (?...) sequence' on one server but not the other.

Both servers are using RVM and are running Ubuntu 10.04.

The full rake task is as follows:

desc "Import Peoplesafe Location"
task :fetch_peoplesafelocation => :environment do

  # Logs into provider.co.uk/live and retrieves latitude and longitude.
  require 'rubygems'
  require 'mechanize'
  require 'logger'
  require 'nokogiri'

  # Create a new mechanize object
  agent = Mechanize.new

  # Load the Peoplesafe website
  page = agent.get("http://provider.co.uk/live/")

  # Select the first form
  form = agent.page.forms.first
  form.username = 'User'
  form.password = 'Password'

  # Submit the form
  page = form.submit form.buttons.first

  page = agent.get("http://provider.co.uk/live/?gps&cid=AAXA-PJZM6M")

  html_doc = page.root

  script = page.at('/html/head/script[not(@src)]')
  parts = script.text.match(/new GLatLng\(\s*(?<lat>.+?)\s*,\s*(?<long>.+?)\s*\)/)

  #puts parts[:lat], parts[:long]

  Location.create(:latitude => parts[:lat], :longitude => parts[:long])
  puts 'Location Updated'

end

Any pointers would be appreciated!

dannymcc

2 Answers


The issue stems from the regex being run on an older Ruby.

This can easily happen when using RVM.

By default, RVM only loads its config when it's in an interactive shell. Cron jobs use the sh shell by default.

RVM ships with a shell wrapper to help with this. At the top of your crontab, add SHELL=/path/to/rvm/bin/rvm-shell. On this server (I logged in to help), the path was /usr/local/bin/rvm/bin/rvm-shell. Setting this will cause the correct RVM paths to be included.

The next step was to fix the cron commands. Since we are using rvm-shell, we want to remove the absolute paths so that the correct gems (rake, etc.) from your RVM Ruby are used.

After removing the absolute path to rake and adding the SHELL variable at the top of the crontab, all the cron jobs will start to fire correctly.
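
As a rough sketch, the resulting crontab might look something like the fragment below. The rvm-shell path is the one found on this server and the app path and task name come from the question; the 15-minute schedule and RAILS_ENV=production are assumptions for illustration only, so substitute whatever your existing entries use.

```
# Have cron run every command through RVM's wrapper instead of plain sh
SHELL=/usr/local/bin/rvm/bin/rvm-shell

# m h dom mon dow  command
# (schedule and RAILS_ENV are assumed; note there is no absolute path to rake)
*/15 * * * * cd /var/www/railsapp && RAILS_ENV=production rake fetch_peoplesafelocation
```

With SHELL set this way, a bare `rake` resolves to the one belonging to the RVM Ruby rather than any system Ruby.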

Ryan Gibbons

This error message is raised by the regex engine in Ruby 1.8.7; see for example this question. So this should work if you are in fact using Ruby 1.9.2 on both machines.

Ruby 1.8.7:

$ rvm 1.8.7-p334
$ irb
1.8.7 :002 > "foo".match(/new GLatLng\(\s*(?<lat>.+?)\s*,\s*(?<long>.+?)\s*\)/)
SyntaxError: compile error
(irb):2: undefined (?...) sequence: /new GLatLng\(\s*(?<lat>.+?)\s*,\s*(?<long>.+?)\s*\)/
from (irb):2

Ruby 1.9.2:

$ rvm 1.9.2-p290
$ irb
1.9.2p290 :001 > "foo".match(/new GLatLng\(\s*(?<lat>.+?)\s*,\s*(?<long>.+?)\s*\)/)
=> nil 

So double-check that you are in fact using the right RVM Ruby on your failing server. For one thing, check that you've set 1.9.2 or higher as the default using

rvm 1.9.2 --default

and that the rvm executables are in the path before any potentially installed system ruby. Also, be aware that cronjobs do not by default have their user's environment available - you need to pass that explicitly or execute the cron job from within a login shell (see for example http://danielsz.posterous.com/how-to-run-rvm-scripts-as-cron-jobs).
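
To confirm which interpreter cron is really using, it can help to print RUBY_VERSION from the job itself. The sketch below also shows a possible stopgap that is not part of the fix above: the same pattern written with numbered groups, which both 1.8.7 and 1.9 accept. The helper name `extract_lat_long`, the constant `GLATLNG` and the sample coordinates are all made up for illustration.

```ruby
# Numbered groups instead of (?<lat>...) / (?<long>...): this syntax is
# understood by the regex engines of both Ruby 1.8.7 and Ruby 1.9+.
GLATLNG = /new GLatLng\(\s*(.+?)\s*,\s*(.+?)\s*\)/

# Hypothetical helper: returns [lat, long] as strings, or nil if no match.
def extract_lat_long(script_text)
  match = script_text.match(GLATLNG)
  return nil unless match
  [match[1], match[2]]
end

# Shows up in the cron mail, so you can see which Ruby actually ran.
puts "Running under Ruby #{RUBY_VERSION}"

# Made-up sample input in the shape the scraper expects.
sample = "var point = new GLatLng( 51.5072 , -0.1276 );"
p extract_lat_long(sample)
```

If the version printed in the cron mail is 1.8.x, the job is picking up a system Ruby rather than the RVM one, and fixing the environment is the real solution.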

Thilo