When I try this:

item.css("a").each do |a|
  if !a.starts_with? 'http://'
     a.replace a.content
  end
end

I get:

NoMethodError: undefined method 'starts_with?' for #<Nokogiri::XML::Element:0x1b48a60> 

EDIT:

I'm sure there is a cleaner way, but this seems to work.

item.css("a").each do |a|
  unless a["href"].blank?
    if !a["href"].starts_with? 'http://' 
      a.replace a.content
    end
  end
end
the Tin Man
pcasa

2 Answers

The problem is that you're calling starts_with? on an object that doesn't implement it. It's a String method (Rails' alias for Ruby's start_with?), and a is a Nokogiri::XML::Element.

item.css("a").each do |a|

yields each matching XML node as a. Those nodes belong to Nokogiri; they aren't strings. What you want to check is the text of the link target, which, because it's an attribute of the node, can be accessed like this:

a['href']

So, you want to use something like this:

item.css("a").each do |a|
  # call starts_with? on the href string, not on the node itself
  if !(a['href'].starts_with?('http://'))
     a.replace(a.content)
  end
end

The downside to this is you have to walk through every <a> tag in the document, which can be slow on a big page with lots of links.
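
One case the loop above doesn't cover is an <a> tag with no href at all, where a['href'] returns nil and the method call fails again. A minimal sketch of a nil-safe check, assuming a hypothetical helper named absolute_http? (not part of Nokogiri) and using core Ruby's String#start_with? so it doesn't depend on Rails:

```ruby
# Hypothetical helper, for illustration only.
# nil.to_s is "", so a missing href counts as "does not start with http://".
def absolute_http?(href)
  href.to_s.start_with?('http://')
end

absolute_http?('http://bar')        # absolute link
absolute_http?('doesnt_start_with') # relative link
absolute_http?(nil)                 # <a> without an href
```

Inside the loop this would read a.replace(a.content) unless absolute_http?(a['href']).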

An alternative is to let XPath's starts-with function do the filtering. Start with a small sample document:

require 'nokogiri'

item = Nokogiri::HTML('<a href="doesnt_start_with">foo</a><a href="http://bar">bar</a>')
puts item.to_html

which outputs:

>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
>> <html><body>
>> <a href="doesnt_start_with">foo</a><a href="http://bar">bar</a>
>> </body></html>

Here's how to do it using XPath:

item.search('//a[not(starts-with(@href, "http://"))]').each do |a|
  a.replace(a.content)
end
puts item.to_html

Which outputs:

>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
>> <html><body>foo<a href="http://bar">bar</a>
>> </body></html>

The advantage of using XPath to find the nodes is that the matching all runs in compiled C, rather than in a Ruby loop.

the Tin Man

Shouldn't that method be start_with? (no "s" after "start")

buruzaemon
  • Tried that too, just in case, but same error. Using rails 1.9.2. Edited question; meant !a.starts_with? – pcasa May 07 '11 at 15:38