I wrote a simple Ruby script that talks to the suggestion API of Google Search.
Changing the "query" variable controls what is sent to the API. This works fine with English queries, but German umlauts seem to cause encoding issues. In the example below I used the word "Tür" (door) to demonstrate the problem.
#!/usr/bin/env ruby
# encoding: UTF-8
require 'nokogiri'
require 'open-uri'
query = 'Tür'
uri = URI.encode("http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=#{query}")
puts uri
puts '----------'
xml_doc = Nokogiri::XML(open(uri))
puts xml_doc
puts '----------'
xml_doc.xpath('.//suggestion').each do |suggestion|
  puts suggestion.attr('data')
end
Output:
http://suggestqueries.google.com/complete/search?output=toolbar&hl=de&q=T%C3%BCr
----------
element suggestion: output error : invalid character value
<?xml version="1.0"?>
<toplevel>
<CompleteSuggestion>
<suggestion data="türkei"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkis"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkei news"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkiye"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?ren"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rstopper"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rschloss"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkisch deutsch"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?renheld"/>
</CompleteSuggestion>
<CompleteSuggestion>
<suggestion data="t?rkisch"/>
</CompleteSuggestion>
</toplevel>
----------
t?rkei
t?rkis
t?rkei news
t?rkiye
t?ren
t?rstopper
t?rschloss
t?rkisch deutsch
t?renheld
t?rkisch
As you can see, the URI is valid and the API returns XML data. However, the printed document already contains the encoding errors. Since the same URL displays correctly in Chrome, I suspect that Nokogiri is configured incorrectly. It also prints this message:
element suggestion: output error : invalid character value
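To narrow this down, it may help to look at the raw response bytes before Nokogiri touches them. The following is only a sketch: `decode_body` is a hypothetical helper (not part of Nokogiri or open-uri), and the ISO-8859-1 fallback is a guess about what the server might send when the bytes are not valid UTF-8. The sample bytes stand in for a real response body.

```ruby
# Hypothetical helper: return the body as a UTF-8 string.
# If the raw bytes are already valid UTF-8 they are kept as they are;
# otherwise they are reinterpreted in the fallback encoding
# (ISO-8859-1 here, which is an assumption) and transcoded to UTF-8.
def decode_body(raw, fallback: Encoding::ISO_8859_1)
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

# Stand-ins for a response body: "ü" is the single byte 0xFC in ISO-8859-1
# and the two bytes 0xC3 0xBC in UTF-8.
latin1_bytes = "t\xFCrkei".b
utf8_bytes   = "t\xC3\xBCrkei".b

puts decode_body(latin1_bytes)
puts decode_body(utf8_bytes)
```

If the first case matches what the API actually returns, the decoded string could presumably be passed to Nokogiri::XML in place of the raw IO object, though I have not confirmed that this is the cause.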
Does anyone have an idea how to solve this? Would be great!