I have a simple POD text file:

$ cat test.pod 
=encoding UTF-8

Münster

It is encoded in UTF-8, as per this literal hex dump of the file:

00000000  3d 65 6e 63 6f 64 69 6e  67 20 55 54 46 2d 38 0a  |=encoding UTF-8.|
00000010  0a 4d c3 bc 6e 73 74 65  72 0a                    |.M..nster.|
0000001a

The "ü" is being encoded as the two bytes C3 and BC.
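
For anyone who wants to reproduce the file byte for byte, here is a minimal sketch, assuming a POSIX shell with `printf` and `od` available (the original dump looks like `hexdump -C` output, but `od` is used here because it is more widely installed):

```shell
# Recreate test.pod; \303\274 is the octal spelling of the bytes C3 BC ("ü" in UTF-8).
printf '=encoding UTF-8\n\nM\303\274nster\n' > test.pod

# Print the bytes in hex, without the offset column.
od -An -tx1 test.pod
```

The `c3 bc` pair should appear right after `4d` ("M"), matching the dump above.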

But when I run perldoc on the file, it turns my nicely formatted UTF-8 characters into ASCII.

What's more, it even applies the German convention of writing "ü" as "ue" correctly.

$ perldoc test.pod | cat
TEST(1)               User Contributed Perl Documentation              TEST(1)

Muenster

perl v5.16.3                      2014-06-10                           TEST(1)

Why is it doing this?

Is there an additional declaration I can put into my file to stop it from happening?


After further investigation with App::perlbrew, I've found that the behaviour depends on which version of Pod::Perldoc is installed:

Perl           Pod::Perldoc   Output
perl-5.10.1    3.14_04        Muenster
perl-5.12.5    3.15_02        Muenster
perl-5.14.4    3.15_04        Muenster
perl-5.16.2    3.17           Münster
perl-5.16.3    3.19           Muenster
perl-5.16.3    3.17           Münster
perl-5.17.3    3.17           Münster
perl-5.18.0    3.19           Muenster
perl-5.18.1    3.23           Münster

However, I would still like, if possible, a way to make Pod::Perldoc 3.14, 3.15, and 3.19 behave "correctly".
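
To see which Pod::Perldoc a given perl picks up, something like the following can help. It is only a sketch: `pod_perldoc_version` is a hypothetical helper name, and it assumes the `perl` you care about is the first one on your `PATH`.

```shell
# Hypothetical helper: print the Pod::Perldoc version visible to the perl on PATH.
pod_perldoc_version() {
    if perl -MPod::Perldoc -e 1 2>/dev/null; then
        # The module loaded, so report its $VERSION.
        perl -MPod::Perldoc -e 'print "$Pod::Perldoc::VERSION\n"'
    else
        # Either perl itself or the module is missing.
        echo "Pod::Perldoc not found"
    fi
}

pod_perldoc_version
```

With perlbrew, running the inner one-liner through `perlbrew exec` should give one result per installed perl, which is roughly how a comparison like the one above can be collected.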

Kaoru

1 Answer

I found this RT ticket: http://rt.cpan.org/Public/Bug/Display.html?id=39000

This "bug" seems to have been introduced with Perl 5.10 and may have been fixed in later versions.

Also see: How can I use Unicode characters in Perl POD-derived man pages? and incorrect behaviour of perldoc with UTF-8 texts.

You should add the latest available version of Pod::Perldoc as a dependency.
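
As a rough sketch of what declaring that dependency could look like, assuming the distribution uses a `cpanfile` (with ExtUtils::MakeMaker the equivalent would be a `PREREQ_PM` entry in `Makefile.PL`). The minimum version 3.21 is the one mentioned in the comments, not a requirement established here:

```shell
# Write a cpanfile that requires Pod::Perldoc 3.21 or newer.
# (3.21 is the minimum chosen in the comments; adjust as needed.)
cat > cpanfile <<'EOF'
requires 'Pod::Perldoc', '3.21';
EOF

# Dependencies can then be installed with, for example:
#   cpanm --installdeps .
```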

Chankey Pathak
  • Interesting that according to https://metacpan.org/source/MALLEN/Pod-Perldoc-3.23/Changes RT #39000 was fixed in 3.15_12 but I see a regression in 3.19. – Kaoru Jun 10 '14 at 12:05
  • The solution I settled on was to add `Pod::Perldoc 3.21` as a dependency, which should ensure that anybody using the module has a recent enough version of `perldoc` to actually read the POD! – Kaoru Jun 10 '14 at 12:10
  • Yes, something strange is cooking there. Using the latest version is what I can suggest. – Chankey Pathak Jun 10 '14 at 12:11
  • If you edit your answer to include "add the latest version of Pod::Perldoc as a dependency" I'll mark it as **Accepted** :-) – Kaoru Jun 10 '14 at 12:16
  • **Update:** Pod::Perldoc 3.24, released August 19th 2014, now supports full UTF-8 text! Read more [from Mark Allen on blogs.perl.org](http://blogs.perl.org/users/mark_allen/2014/09/if-your-core-perl-documentation-uses-encoding-please-test-the-new-perldoc-release.html). – Kaoru Sep 29 '14 at 12:03