From a Perl script, I am trying to query an online dictionary API that returns XML. The words use the Catalan alphabet, which includes à, é, è, í, ò, ó, ú, · and ç.
I am using Perl's URI::Escape. A minimal working example of what I am trying to do (without the actual URL of the dictionary, in case it would be considered spam) is:
use LWP::Simple;
use URI::Escape;
use utf8;
my $word = <STDIN>;
$word = uri_escape_utf8($word);
my $xmlweb = get("http://www.urlofthedictionary.com/search?q=$word&format=text/xml");
It "works" in the sense that no error shows up, but it does not work properly: no results are returned if the word contains any of these special characters. For example, if I type país, then uri_escape_utf8() returns pa%C2%A1s%0A. If I paste that exact string into the URL in my browser, it searches for pais (instead of país) and gives no results; even the URL itself gets "translated" to pais. If I use plain uri_escape() instead, the website gives an error:
Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
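One thing that might be worth checking: %C2%A1 is the UTF-8 percent-encoding of U+00A1 (¡), whereas í (U+00ED) would be %C3%AD, and the trailing %0A is the newline read from <STDIN>. That pattern would be consistent with the terminal sending bytes in a legacy code page (in cp850, í is the byte 0xA1) that are never decoded before escaping. The sketch below shows the idea under those assumptions; the helper name escape_input and the choice of cp850 are illustrative and not confirmed for this setup.

```perl
use strict;
use warnings;
use Encode qw(decode);
use URI::Escape qw(uri_escape_utf8);

# Hypothetical helper: decode raw input bytes from the given encoding,
# strip the trailing line ending, then percent-encode the word as UTF-8.
sub escape_input {
    my ($raw_bytes, $encoding) = @_;
    my $word = decode($encoding, $raw_bytes);  # bytes -> Perl characters
    $word =~ s/\R\z//;                         # drop the newline from <STDIN>
    return uri_escape_utf8($word);
}

# "pais" with an accented i, as a cp850 console would send it (assumed encoding)
print escape_input("pa\xA1s\n", 'cp850'), "\n";       # pa%C3%ADs

# The same word as a UTF-8 terminal would send it
print escape_input("pa\xC3\xADs\n", 'UTF-8'), "\n";   # pa%C3%ADs
```

With real input, the raw bytes would come from <STDIN> and the encoding name would have to match whatever the console actually uses.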
This is driving me insane; I always have problems with encodings. Does anybody know what I am doing wrong? I can provide the dictionary's URL if it is needed.