It seems like Template Toolkit isn't handling encoding properly.

I am passing $template->process a filename (where to get the template), a hash reference (containing all the parameters), and a scalar reference (where to put the output). I then return that output, which is in turn displayed to the user.

When I give it a string with umlauts in it, the HTML output shows a black diamond with a white question mark in place of every such letter (the number of letters is correct). Every other character comes out fine.

I am using warn to print out the string before I make the call to $template->process, and at that point it is fine. From what I can tell, it is during the $template->process call that things get turned into garbage.

Any ideas? I have tried using ENCODING => "utf8" as well as binmode => ":utf8", but neither has any effect on the output.

Here is my code with some of the fat trimmed out, just to show my call to $template->process. Note that leaving out the {binmode => ':utf8'} argument makes no difference.

<put variables in the hash referenced by $vars>
<print out the variables in the hash referenced by $vars>
my $data;
$template->process( $self->filename, $vars, \$data, {binmode => ':utf8'}) || die "Template process failed: ", $template->error();
return $data;

SOLVED: Hey all, thanks for your answers. The problem turned out to be that after $template->process had done its thing, we wrote the string to a temporary file before outputting it, so we also needed to set binmode for that output. The code now looks like:

<put variables in the hash referenced by $vars>
<print out the variables in the hash referenced by $vars>
my $data;
binmode( STDOUT, ":utf8" );
$template->process( $self->filename, $vars, \$data, {binmode => ':utf8'}) || die "Template process failed: ", $template->error();
return $data;
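
As a rough sketch of the temporary-file step described above (the actual file-handling code is not shown in the question, so the file handling and the sample string here are assumptions for illustration): characters such as ü are written as single Latin-1 bytes when a handle has no encoding layer, which is not valid UTF-8 and would explain one replacement character per letter. Setting a layer on the handle avoids that.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the string that $template->process() left in $data
# ("Grüße aus Köln", written with escapes so the source stays ASCII).
my $data = "Gr\x{fc}\x{df}e aus K\x{f6}ln\n";

# Hypothetical temp-file step: put an encoding layer on the handle
# before printing, so characters are written as UTF-8 bytes.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
print {$fh} $data;
close($fh) or die "close failed: $!";

# Read the file back through the same layer to recover the characters.
open(my $in, '<:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
my $round_trip = do { local $/; <$in> };
close($in);

print "round trip ok\n" if $round_trip eq $data;
```

The same idea applies to any handle the string passes through on its way to the browser, including STDOUT.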

I thank you all for your time :)

Code Magician
cjh
  • Probably unrelated, but try passing UTF-8 instead of utf8. Sometimes it matters, e.g. http://jeremy.zawodny.com/blog/archives/010546.html – Hugmeir Dec 20 '10 at 01:50
  • Is the UTF-8 text getting mangled or is the browser just not picking up the correct encoding? – mu is too short Dec 20 '10 at 02:08
  • If it were the latter, how could I tell the browser what encoding to use? Using HTML::Template::Pro (the system we are moving from), the characters are displayed correctly. This led me to believe it was TT, but I could be wrong. On the train now, so I will have to try out any suggestions tomorrow morning. – cjh Dec 20 '10 at 02:18
  • Hugmeir: I tried both. I also tested passing an obviously invalid encoding, 'futf8': ENCODING complained about it, but binmode did not. – cjh Dec 20 '10 at 02:21
  • Please **show the real code** and a template and the fill-in data that together exhibit the problem. Also provide a dump of the fill-in data made with Devel::Peek or Devel::StringInfo. As the question stands now, it is guesswork. – daxim Dec 20 '10 at 08:17
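
On the question raised in the comments of how to tell the browser which encoding to use: one common approach is to name the charset in the Content-Type header and make sure the body bytes really are in that encoding. The following is only a sketch for a CGI-style script; the helper name html_response is hypothetical and not part of Template Toolkit or of the asker's code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build a CGI-style response whose header names the
# same encoding that the body bytes actually use.
sub html_response {
    my ($html) = @_;                      # decoded Perl character string
    my $body = encode('UTF-8', $html);    # bytes to send over the wire
    return "Content-Type: text/html; charset=UTF-8\r\n"
         . "Content-Length: " . length($body) . "\r\n\r\n"
         . $body;
}

binmode(STDOUT);    # raw bytes: the body has already been encoded
print html_response("<p>Gr\x{fc}\x{df}e</p>\n");
```

If the header and the bytes disagree, the browser decodes the page with the wrong encoding even though the text itself is intact.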

2 Answers

The code below works. $data, or more specifically the strings it contains, must be Perl strings, i.e. properly decoded character strings rather than raw bytes. See the introduction to encoding in the official documentation.

use Template '2.21_02';

my $tt = Template->new({
    ENCODING     => 'utf8',
    # other options …
});

$tt->process(
    $template, $data, $output, {binmode => ':utf8'}
) or die $tt->error . ' in ' . $template;
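
To illustrate what "properly decoded" means here, a small sketch using the core Encode module (the sample bytes are made up for the example and stand in for user-supplied input): decode once where the data enters the program, work with character strings, and encode again only on output.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as they might arrive from a form or a file: "für" in UTF-8.
my $bytes = "f\xC3\xBCr";

# Decode once at the input boundary; from here on it is a character string
# that can be placed in the hash handed to the template.
my $text = decode('UTF-8', $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($text);
# prints "bytes: 4, characters: 3"

# Encoding at the output boundary gives back the original bytes.
my $out = encode('UTF-8', $text);
```

Passing the undecoded $bytes instead would leave each UTF-8 byte to be treated as a separate character, which is a typical source of mangled output.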
daxim
  • I think this is exactly what the suicideducky said he tried in the original question, no? – Chris Dolan Dec 20 '10 at 20:04
  • No. He said it did not work, I asserted it does work. I also wrote about a missing piece in the puzzle. – daxim Dec 21 '10 at 10:56
  • Accepted. Even though it was what I wrote, this also answers the question that I asked: Template Toolkit is handling the encoding fine. Thanks all. – cjh Dec 21 '10 at 20:45

My solution was to punt: all non-Latin characters go in en.po, to be included at runtime via [% loc('string') %]. That works fine for me since my templates had to be localized anyway.

Chris Dolan
  • Thanks for your reply. However, in my case this is user-supplied data and there are many, many templates, so doing this is not really an option. – cjh Dec 20 '10 at 02:25
  • @suicideducky - Then try Template::Plugin::Decode and use `| decode` after each of your user strings – Chris Dolan Dec 20 '10 at 20:36