When I dump XML parsed with XML::Simple, I end up with strings that contain escaped characters such as \x{e6}. Here is an example:

#!/usr/bin/perl
use Data::Dumper;
use Encode;

$s="sel\x{e6}re";
decode_utf8($s);
print Dumper $s;

outputs

$VAR1 = 'sel�re';

Question

How can I get the weird character into UTF-8?

Update

Here is the full xml output. http://pastebin.com/Sitm01kh

Update 2

As pointed out in the comments, the XML is fine, but the problem appears when I parse it and dump the result:

my $ref = XMLin($xml, ForceArray => 1, KeyAttr => { Element => 'Id' });
print Dumper $ref;

http://pastebin.com/7KDB50fd

Jasmine Lognnes

2 Answers

#!/usr/bin/perl

use DDP;
use XML::Simple;

my $xml = '<Element Id="496669" ParentId="495555" Name="Klasselærere" ContextName="01005 Advanced Engineering Mathematics 1 E15/Klasselærere" IsArchived="false" SubgroupCount="0" />';

my $result = XMLin($xml);

binmode(STDOUT, ":utf8");
print p($result)

produces the following output

{
   ContextName     "01005 Advanced Engineering Mathematics 1 E15/Klasselærere",
   Id              496669,
   IsArchived      "false",
   Name            "Klasselærere",
   ParentId        495555,
   SubgroupCount   0
   }

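Data::Dumper shows non-ASCII characters in decoded (character) strings as \x{...} escapes, so the data itself is fine; only the dump looks odd. Use Data::Printer (DDP) to see the Unicode characters as they are.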
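
If you would rather stay with Data::Dumper, here is a minimal sketch that post-processes the dump for display only. Note that dump_readable is a hypothetical helper of mine, not part of Data::Dumper, and decode_utf8 is used here only to stand in for the character strings an XML parser hands back. The substitution is a blunt instrument: it would also rewrite a literal \x{...} sequence that happened to be in your data, so treat it as a debugging aid.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(decode_utf8);

# Hypothetical helper: dump a structure, then turn the \x{...} escapes
# that Data::Dumper emits back into the characters they stand for.
sub dump_readable {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;          # stable key order
    my $out = Dumper($data);
    $out =~ s/\\x\{([0-9a-fA-F]+)\}/chr(hex($1))/ge;
    return $out;
}

# Encode characters as UTF-8 on the way out.
binmode(STDOUT, ':encoding(UTF-8)');

# Assumption: a decoded character string, like the values XMLin returns.
my $ref = { Name => decode_utf8("Klassel\xc3\xa6rere") };

print $ref->{Name}, "\n";       # the value itself was never broken
print dump_readable($ref);      # the dump, with escapes made readable
```

Printing a single value through an encoding layer, as in the first print, is usually enough to confirm that the parsed data is correct.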

sotona

I guess that your terminal is not able to display the character \xe6.

If you are on Linux, run 'locale' to see the current locale settings.

You can try setting a UTF-8 locale like this (en_US.UTF-8 is only an example; 'locale -a' lists the locales installed on your system):

export LC_ALL=en_US.UTF-8
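
Whatever the locale is, the script also has to send UTF-8 bytes. The snippet in the question holds U+00E6 as a single character and prints it as the single byte 0xE6, which a UTF-8 terminal cannot interpret; calling decode_utf8 there goes in the wrong direction and its return value is discarded anyway. A minimal sketch of the encoding direction, assuming UTF-8 output is what you want:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $s     = "sel\x{e6}re";        # character string containing U+00E6
my $bytes = encode_utf8($s);      # byte string, U+00E6 becomes 0xC3 0xA6

# Show the bytes in hex so the result does not depend on the terminal.
# prints: 73 65 6c c3 a6 72 65
print join(' ', map { sprintf('%02x', ord($_)) } split(//, $bytes)), "\n";

# Alternatively, let an output layer encode everything that is printed.
binmode(STDOUT, ':encoding(UTF-8)');
print $s, "\n";
```

The hex line is there so you can check the encoding step separately from what the terminal chooses to display.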

jhoran
  • locales influence the settings of your shell, not the terminal. The terminal is usually configured via its menu. – choroba Mar 16 '16 at 13:54
  • You're right; I meant the terminal's shell. AFAIK, Data::Dumper works fine with Unicode, so it seems to me the problem is more on the shell side. – jhoran Mar 16 '16 at 13:59
  • to work with Data::Dumper and unicode you should configure Data::Dumper::AutoEncode first, which makes code quite noisy – sotona Mar 16 '16 at 14:27