
I have the following block at the beginning of my script:

#!/usr/bin/perl5 -w
use strict;
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");

In some subroutines, the data arrives in a different encoding (from a distant subroutine), and Cyrillic or other non-ASCII characters are not displayed correctly. It is the "binmode" setting that causes the problem.

Can I "turn off" the binmode utf8 locally, for the subroutine only?

I can't remove the global binmode setting and I can't change the distant encoding.

DanielLazarov

3 Answers


One way to achieve this is to "dup" the STD handle, set the duplicated filehandle to use the :raw layer, and assign it to a local version of the STD handle. For example, the following code

binmode(STDOUT, ':utf8');
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

{
    open(my $duped, '>&', STDOUT) or die "Cannot dup STDOUT: $!";
    # The ':raw' argument could also be omitted.
    binmode($duped, ':raw');
    local *STDOUT = $duped;
    print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
    close($duped);
}

print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

prints

unix, perlio, utf8
unix, perlio
unix, perlio, utf8

on my system.
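Since the question asks for a change that applies to one subroutine only, the same technique could be wrapped in a helper. The following is a minimal sketch, not a tested cross-platform solution: `with_raw_stdout` is a hypothetical name, the sample bytes are made-up UTF-8 data standing in for what the distant subroutine returns, and the behaviour may differ on some platforms.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: run $code while STDOUT is replaced by a raw
# (byte-oriented) duplicate. The original STDOUT, with its :utf8 layer,
# comes back automatically when the sub returns or dies.
sub with_raw_stdout {
    my ($code) = @_;
    STDOUT->flush();    # keep earlier buffered output in order
    open(my $duped, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";
    binmode($duped, ':raw');
    local *STDOUT = $duped;
    my @result = $code->();
    close($duped) or die "Cannot close duplicate: $!";
    return @result;
}

binmode(STDOUT, ':utf8');

# Assumed sample data: bytes that are already UTF-8 encoded.
my @inside = with_raw_stdout(sub {
    print "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82\n";
    return PerlIO::get_layers(STDOUT);
});
my @outside = PerlIO::get_layers(STDOUT);
```

Here `@inside` holds the layers seen by the callback and `@outside` the layers after the helper returns, so the two can be compared to check that the :utf8 layer was only suspended inside the call.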

nwellnhof
  • This didn't work for me using ActiveState Perl 5.20 on Windows. Nothing I tried would remove the utf8 layer from STDOUT or a dupe of it once utf8 had been added by 'use open' or a previous binmode call. – Freon Sandoz Feb 18 '22 at 03:27

I like @nwellnhof's approach. Dealing only with Unicode and ASCII myself (a luxury few enjoy), my instinct would be to leave the bytes as they are and selectively use Encode's decode()/encode() when needed. If you can determine which of your data sources are problematic, you can insert a decode step when reading from them.

% file koi8r.txt 
koi8r.txt: ISO-8859 text
% cat koi8r.txt 
������ �� ����� � ������� ���. ���
���� ����� ������ ����� �����.
% perl -CO -MEncode="encode,decode" -E 'say decode("koi8-r", <>) ;' koi8r.txt
Американские суда находятся в международных водах. Япония
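Inside a script that keeps the global :utf8 layer, the same idea might look like the sketch below. It assumes the distant data is KOI8-R, as in the one-liner above; `koi8r_to_chars` is a hypothetical helper name and the sample bytes are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

binmode(STDOUT, ':utf8');

# Hypothetical helper: turn KOI8-R bytes from the distant source into a
# Perl character string, so the :utf8 layer on STDOUT encodes it correctly.
sub koi8r_to_chars {
    my ($bytes) = @_;
    return decode('koi8-r', $bytes);
}

# Assumed sample input: the KOI8-R bytes for a six-letter Cyrillic word.
my $bytes = "\xF0\xD2\xC9\xD7\xC5\xD4";
my $chars = koi8r_to_chars($bytes);
print "$chars\n";
```

The decode step goes at the point where the bytes enter the program, so the rest of the script keeps working with character strings and the global binmode setting stays untouched.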
G. Cito

You could use something like Scope::Guard (lexically-scoped resource management) to ensure the layer gets set back to :utf8 when you leave the scope, regardless of how you leave it (return, die, whatever):

#!/usr/bin/perl -w
use strict;

use Scope::Guard qw(guard);

binmode(STDOUT, ':utf8');
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

{
    # When guard goes out of scope, this sub is guaranteed to be called:
    my $guard = guard {
        binmode(STDOUT, ':utf8');
    };
    binmode(STDOUT, ':raw');
    print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
}

print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

Or, if you don't want to add a new dependency like Scope::Guard (although Scope::Guard is great for this kind of localizing), you can write a minimal guard class yourself:

#!/usr/bin/perl -w
use strict;

binmode(STDOUT, ':utf8');
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

{
    my $guard = PoorMansGuard->new(sub {
        binmode(STDOUT, ':utf8');
    });
    binmode(STDOUT, ':raw');
    print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
}

print(join(', ', PerlIO::get_layers(STDOUT)), "\n");

package PoorMansGuard;

sub new {
    my ($class, $sub) = @_;
    bless { sub => $sub }, $class;
}

sub DESTROY {
    my ($self) = @_;
    $self->{sub}->();
}
Peter V. Mørch