I'm having trouble reading Cyrillic characters from a file in Perl.

The text file is written in Notepad and contains "абвгдежзийклмнопрстуфхцчшщъьюя". Here's my code:

#!/usr/bin/perl

use warnings;
use strict;

open FILE, "text.txt" or die $!;

while (<FILE>) {
    print $_;   
}

If I save the text file using the ANSI encoding, I get:

рстуфхцчшщъыьэюяЁёЄєЇїЎў°∙·№■

If I save it using the UTF-8 encoding, and I use the function decode('UTF-8', $_) from the package Encode, I get:

Wide character in print at test.pl line 11, <TEXT> line 1.

and a bunch of unreadable characters.

I'm using the command prompt on Windows 7 x64.

Daniel Rusev

1 Answer

You're decoding your inputs, but "forgot" to encode your outputs.

Your file is probably encoded using cp1251 (that's what Notepad's "ANSI" means on a Cyrillic Windows locale).

Your terminal expects cp866.
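
To see why the "ANSI" file prints as "рст..." rather than "абв...", here is a minimal sketch using the Encode module. It assumes the same two code pages guessed above (cp1251 for the file, cp866 for the console); the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "абв" as stored in a cp1251 file: bytes E0 E1 E2.
my $file_bytes = "\xE0\xE1\xE2";

# Printed unchanged, a cp866 console interprets those bytes as "рст".
my $seen_without_fix = decode('cp866', $file_bytes);

# Decode the input, then encode the output for the console.
my $text          = decode('cp1251', $file_bytes);   # U+0430 U+0431 U+0432
my $console_bytes = encode('cp866', $text);          # bytes A0 A1 A2

# Show the bytes the console should receive, in hex.
printf "%vX\n", $console_bytes;
```

The :encoding layers do the same decode and encode steps for you on every read and print.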

Use

use open ':std', ':encoding(cp866)';
use open IO => ':encoding(cp1251)';
open(my $FILE, '<', 'text.txt')
   or die $!;

or

use open ':std', ':encoding(cp866)';
open(my $FILE, '<:encoding(cp1251)', 'text.txt')
   or die $!;

Use :encoding(UTF-8) instead of :encoding(cp1251) if you saved as UTF-8.
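
Combined with the read loop from the question, the second variant might look like the sketch below. read_lines is a hypothetical helper name, and cp866 and cp1251 are still assumptions about your console and file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the console code page is cp866. ':std' applies the
# encoding to STDIN, STDOUT and STDERR.
use open ':std', ':encoding(cp866)';

# Hypothetical helper: return the lines of $path, decoded from $encoding.
sub read_lines {
    my ($path, $encoding) = @_;
    open(my $fh, "<:encoding($encoding)", $path)
        or die "Can't open $path: $!";
    my @lines = <$fh>;
    close($fh);
    return @lines;
}

# Assumption: text.txt was saved as "ANSI" (cp1251); pass 'UTF-8' otherwise.
# Nothing is printed if the file does not exist.
print read_lines('text.txt', 'cp1251') if -e 'text.txt';
```

The layer given explicitly in open takes precedence over the default set by the pragma, so input is decoded from cp1251 while everything printed is encoded as cp866.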

ikegami