Decode your inputs, encode your outputs. You have two bugs related to failure to properly decode and encode.
Specifically, you're missing
use utf8;
use open ":std", ":encoding(UTF-8)";
Details follow.
Perl source code is expected to be ASCII (with 8-bit clean string literals) unless you use use utf8
to tell Perl it's UTF-8.
I believe you have a UTF-8 terminal. We can conclude from the fact that cat 02.pl
works that your source code is encoded using UTF-8. This means Perl sees the equivalent of this:
print "pr\x{C3}\x{A9}f\x{C3}\x{A9}r\x{C3}\x{A9}\n"; # C3 A9 = é encoded using UTF-8
You should be using use utf8;
so Perl sees the equivalent of
print "pr\x{E9}f\x{E9}r\x{E9}\n"; # E9 = Unicode Code Point for é
You correctly decode the file you read.
The file presumably contains
70 72 C3 A9 66 C3 A9 72 C3 A9 0A # préféré␊ encoded using UTF-8
Because of the encoding layer you add, you are effectively doing
$_ = decode( "UTF-8", "\x{70}\x{72}\x{C3}\x{A9}\x{66}\x{C3}\x{A9}\x{72}\x{C3}\x{A9}\x{0A}" );
or
$_ = "pr\x{E9}f\x{E9}r\x{E9}\n";
This is correct.
Finally, you fail to encode your outputs.
The following does what you want:
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
BEGIN {
binmode( STDIN, ":encoding(UTF-8)" ); # Well, not needed here.
binmode( STDOUT, ":encoding(UTF-8)" );
binmode( STDERR, ":encoding(UTF-8)" );
}
print "préféré\n";
open( my $fh, '<:encoding(UTF-8)', 'text' ) or die $!;
while ( <$fh> ) { print $_ }
close $fh;
But the open
pragma makes it a lot cleaner.
The following does what you want:
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use open ":std", ":encoding(UTF-8)";
print "préféré\n";
open( my $fh, '<', 'text' ) or die $!;
while ( <$fh> ) { print $_ }
close $fh;