I know how to find an element using Nokogiri. I know how to click a link using Mechanize. But I can't figure out how to find a specific link and click it. This seems like it should be really easy, but for some reason I can't find a solution.
Let's say I'm just trying to click on the first result on a Google search. I can't just click the first link with Mechanize, because the Google page has a bunch of other links, like Settings. The search result links themselves don't seem to have class names, but they're enveloped in <h3 class="r"></h3>.
I could just use Nokogiri to follow the href value of the link like so:
document = open("https://www.google.com/search?q=stackoverflow")
parsed_content = Nokogiri::HTML(document.read)
href = parsed_content.css('.r').children.first['href']
new_document = open(href)
# href is equal to "/url?sa=t&rct=j&q=&esrc=s&source=web&url=https%3A%2F%2Fstackoverflow.com%2F"
but it's not a direct URL, and going to that URL gives an error. The data-href value is a direct URL, but I can't figure out how to get that value - doing the same thing except with ...first['data-href'] returns nil.
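One likely reason the open call fails is that the href is relative, so there is no scheme or host to request. A minimal sketch, assuming the redirect keeps the real target in its url query parameter as in the href shown above, of two ways to get a usable address with only the standard library. The helper name direct_url_from and the BASE_URL constant are hypothetical, introduced for illustration.

```ruby
require 'uri'

# Assumption: the relative href was taken from this results page.
BASE_URL = 'https://www.google.com/search?q=stackoverflow'

# Hypothetical helper: returns the decoded value of the `url` query
# parameter of a redirect href, or nil when there is no such parameter.
def direct_url_from(href)
  query = URI.parse(href).query
  return nil if query.nil?

  URI.decode_www_form(query).to_h['url']
end

href = '/url?sa=t&rct=j&q=&esrc=s&source=web&url=https%3A%2F%2Fstackoverflow.com%2F'

# Option 1: resolve the relative href against the page it came from.
absolute_redirect = URI.join(BASE_URL, href).to_s

# Option 2: skip the redirect and read the target out of the query string.
direct_url = direct_url_from(href)

puts absolute_redirect
puts direct_url
```

Either string can then be passed to whatever performs the request. Option 2 depends on the redirect format staying as shown, so it is the more fragile of the two.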
Does anyone know how I can just find the first .r element on the page and click the link inside it?
Here's the start to my action:
require 'open-uri'
require 'nokogiri'
require 'mechanize'
document = open("https://www.google.com/search?q=stackoverflow")
parsed_content = Nokogiri::HTML(document.read)
Here's the .r element on the Google search results page:
<h3 class="r">
<a href="/url?sa=t&rct=j&q=&esrc=s&source=web&url=https%3A%2F%2Fstackoverflow.com%2F" data-href="https://stackoverflow.com/">Stack Overflow</a>
</h3>