
I'm using Mojolicious (not Lite) together with the Redis module from CPAN.

I'm storing some Japanese text in this way:

use Redis;
my $redis = Redis->new;
$redis->set("mykey",$val); 
# $val contains a string which was read from a file. 
# The value looks like: テスト

Later in the code I read that value from redis:

my $val = $redis->get("mykey");
print Dumper($val); #the value prints correctly in terminal
$self->stash(
    myvalue => $val
);
$self->render(
    template => "/pages/test"
);

And the template:

<!DOCTYPE html>
<html>
  <head>
      <title>Test</title>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  </head>
  <body>
  <div><%= $myvalue %></div>
  ...

But it displays the value like this: ãã¹ã.

Changing the charset manually in the browser makes no difference (it is still not displayed as expected).

Why is it displayed correctly in the terminal but not in the template?

Notes:

  • I tried base64 encoding/decoding the value and the result didn't change (so I'm sure it's not Redis).
  • I have Japanese fonts and settings installed correctly (I have been working with Japanese encodings for many years, but this is the first time I have used Mojolicious templates for this task).
  • All files are saved in UTF-8 (no other encoding is being used).
  • If I write something in Japanese inside the template (hard coded) it displays correctly.
lepe

2 Answers


I hate to answer my own questions... but I found the solution:

use Encode qw(decode_utf8);
...
$self->stash(
    myvalue => decode_utf8($val)
);

Simple as that. Not sure why it's displayed correctly on the terminal... Probably "Dumper" is converting it?

lepe

Why it is not displayed correctly in the template?

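When you get a value from Redis, you get a sequence of bytes. You should decode those octets from UTF-8 into Perl characters, as you did with decode_utf8($val). Otherwise Mojolicious, which encodes rendered output as UTF-8 by default, encodes the already-encoded bytes a second time, which produces the garbled text.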
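A minimal sketch of the double encoding described above. It assumes the Redis client hands back raw UTF-8 octets and that the renderer encodes its output as UTF-8; the variable names are illustrative only, and the renderer is simulated with encode_utf8 rather than calling Mojolicious itself.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Raw UTF-8 octets for "テスト", as a byte-oriented client would return them.
my $bytes = "\xE3\x83\x86\xE3\x82\xB9\xE3\x83\x88";

# Decoding turns the 9 octets into 3 Perl characters.
my $chars = decode_utf8($bytes);

# Simulate a renderer that encodes whatever it is given as UTF-8.
my $double_encoded = encode_utf8($bytes);   # undecoded input: every octet is encoded again
my $correct        = encode_utf8($chars);   # decoded input: round-trips to the original octets

printf "octets: %d, characters: %d\n", length($bytes), length($chars);
printf "double-encoded: %d octets, correct: %d octets\n",
    length($double_encoded), length($correct);
```

The double-encoded string is twice as long because each original octet is treated as a separate Latin-1 character, which is why the browser shows characters such as ã and ¹ instead of the Japanese text.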

Not sure why it's displayed correctly on the terminal... Probably "Dumper" is converting it?

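Your terminal is running in UTF-8 mode. When dumping, you just pass your undecoded octets straight to the terminal, and they are displayed correctly because the terminal interprets them as UTF-8. (The "Wide character in print" warning is what Perl issues when decoded characters are printed to a handle that has no encoding layer.)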

The main rule is: when you get bytes from an external source, you must decode them into Perl's internal character representation.

Here is a full list of recommendations
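One way to apply this rule to file input, as a sketch only: decode while reading with an I/O layer and encode again when printing. The temporary file and its contents are hypothetical stand-ins for the file that $val was read from in the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical setup: a temporary file holding the UTF-8 octets for "テスト".
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "\xE3\x83\x86\xE3\x82\xB9\xE3\x83\x88";
close $out or die "close failed: $!";

# Decode at the input boundary: the layer converts octets to characters while reading.
open my $in, '<:encoding(UTF-8)', $path or die "open failed: $!";
my $text = do { local $/; <$in> };
close $in;

# Encode at the output boundary: without this layer, printing $text
# would trigger the "Wide character in print" warning.
binmode STDOUT, ':encoding(UTF-8)';
printf "%d characters: %s\n", length($text), $text;
```

With the string decoded at the boundary, length, regular expressions and the template all see three characters rather than nine bytes.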

Eugen Konkov
  • Thank you for clarifying the doubts I had. I think it's useful for anyone with a similar issue. – lepe Jun 15 '18 at 08:23