0

I am working on a thing which will need to get at the path data of individual paths in an SVG file, so that I can generate similar paths.

After looking in horror at the plethora of C/C++ libraries for handling SVG files, I decided to use Perl (since it is better suited to the job of sorting through sausages of SVG path data) with an XML parser. For some reason I chose XML::Easy as the parser. (I had a good reason, but that was yesterday and I can't remember what it was. Probably the fact that it's supposed to be Easy.)

However, I have not found any tutorials or documentation other than perldoc and metacpan.

I have managed to read my file into an XML::Easy::Element reference using xml10_read_document, but I have no idea how to get at the actual path data.

How do I get the path sausage from the XML::Easy::Element reference?

Edit: the sausage I am referring to is the d attribute of the path element. E.g., from

<svg>
[...]
<g id="something">
<path d="M350.41,62.567v0.135l1.118,0.04v-0.135L350.41,62.567z
    M351.898,60.655c-0.242,0-0.433,0.059-0.572,0.175c-0.089,0.104-0.179,0.207-0.269,0.311l-0.014-0.445l-0.875-0.013v0.31
        l0.458,0.067l-0.041,1.421l0.498,0.006l0.014-0.754c0.197-0.449,0.438-0.673,0.72-0.673c0.193,0,0.29,0.11,0.29,0.329
        c0,0.108-0.025,0.223-0.074,0.344l0.316,0.081c0.085-0.148,0.128-0.312,0.128-0.491c0-0.185-0.047-0.34-0.142-0.465
        C352.228,60.723,352.082,60.655,351.898,60.655z
      M350.572,62.816l-0.027,0.922l0.525-0.08l0.006-0.835L350.572,62.816z"/>
</g>
[...]
</svg>

I would like to extract

"M350.41,62.567v0.135l1.118,0.04v-0.135L350.41,62.567z
 M351.898,60.655c-0.242,0-0.433,0.059-0.572,0.175c-0.089,0.104-0.179,0.207-0.269,0.311l-0.014-0.445l-0.875-0.013v0.31
 l0.458,0.067l-0.041,1.421l0.498,0.006l0.014-0.754c0.197-0.449,0.438-0.673,0.72-0.673c0.193,0,0.29,0.11,0.29,0.329
        c0,0.108-0.025,0.223-0.074,0.344l0.316,0.081c0.085-0.148,0.128-0.312,0.128-0.491c0-0.185-0.047-0.34-0.142-0.465
        C352.228,60.723,352.082,60.655,351.898,60.655z
      M350.572,62.816l-0.027,0.922l0.525-0.08l0.006-0.835L350.572,62.816z"
Mark Gardner
  • Sausages? I'm German and I appreciate sausages, especially _mit Senf_, but I have no idea what you are talking about. ;) – simbabque Jan 31 '19 at 08:36
  • Can you add a short SVG example file to show which bits you are actually trying to access? See [mcve]. – simbabque Jan 31 '19 at 08:38
  • Hmmmm, sausages!! Please provide a [mcve], thank you. – Stefan Becker Jan 31 '19 at 08:38
  • @simbabque added an example file – Mark Gardner Jan 31 '19 at 10:46
  • If you want those paths and nothing else, you don't even need to use an XML parser. This is a case where I think a simple regex is fine. – simbabque Jan 31 '19 at 11:00
  • @simbabque Might do that, but then I would have missed a learning opportunity. This is the first time I have ever used an XML parser, and I don't want to give up just because the documentation is complicated. – Mark Gardner Jan 31 '19 at 13:13
  • Even though a regex would probably be able to do this reasonably, you would still have to deal with things like XML decoding, and it wouldn't extend well to more complicated requirements. – Grinnz Jan 31 '19 at 18:31

3 Answers

3

Here is how I'd do it with Mojo::DOM:

use strict;
use warnings;
use Mojo::DOM;

my $svg = <<'SVG';
<svg>
[...]
<g id="something">
<path d="M350.41,62.567v0.135l1.118,0.04v-0.135L350.41,62.567z
    M351.898,60.655c-0.242,0-0.433,0.059-0.572,0.175c-0.089,0.104-0.179,0.207-0.269,0.311l-0.014-0.445l-0.875-0.013v0.31
        l0.458,0.067l-0.041,1.421l0.498,0.006l0.014-0.754c0.197-0.449,0.438-0.673,0.72-0.673c0.193,0,0.29,0.11,0.29,0.329
        c0,0.108-0.025,0.223-0.074,0.344l0.316,0.081c0.085-0.148,0.128-0.312,0.128-0.491c0-0.185-0.047-0.34-0.142-0.465
        C352.228,60.723,352.082,60.655,351.898,60.655z
      M350.572,62.816l-0.027,0.922l0.525-0.08l0.006-0.835L350.572,62.816z"/>
</g>
[...]
</svg>
SVG

my $dom = Mojo::DOM->new->xml(1)->parse($svg);
my $sausage = $dom->at('path')->{d};

Or if you want to get it from a specific <g> tag instead of the first one:

my $sausage = $dom->at('g#something path')->{d};
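If the group holds more than one path, `at` only returns the first match. A minimal sketch of collecting every `d` attribute with `find` instead, assuming a made-up two-path document (the path data here is hypothetical, not taken from the question's file):

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical two-path document, invented for illustration.
my $svg = <<'SVG';
<svg>
<g id="something">
<path d="M1,1l2,2z"/>
<path d="M3,3l4,4z"/>
</g>
</svg>
SVG

my $dom = Mojo::DOM->new->xml(1)->parse($svg);

# find() returns a Mojo::Collection of all matching elements;
# map(attr => 'd') calls ->attr('d') on each of them, and each()
# in list context flattens the collection into a plain list.
my @sausages = $dom->find('g#something path')->map(attr => 'd')->each;

print "$_\n" for @sausages;
```

Dropping the `g#something` prefix from the selector should collect the paths from the whole document rather than from one group.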
Grinnz
  • One of the reasons I chose XML::Easy is that it is in apt. I can't find Mojo::DOM in apt, so even though it looks easier, I'll accept the other answer. – Mark Gardner Feb 01 '19 at 12:27
  • @MarkGardner It would be part of the Mojolicious distribution. – Grinnz Feb 01 '19 at 21:20
  • @MarkGardner Another common option that would be nearly as simple would be [XML::LibXML](https://metacpan.org/pod/XML::LibXML). I'm not familiar enough with XPath to show an example. – Grinnz Feb 01 '19 at 21:25
2

After diving into XML::Easy's docs for a bit, I think this would work:

use strict;
use warnings;
use XML::Easy::Text 'xml10_read_document';
use XML::Easy::NodeBasics qw(xml_e_content_twine xml_e_type_name xml_e_attribute);
use List::Util 'first';

my $svg = <<'SVG';
<svg>
[...]
<g id="something">
<path d="M350.41,62.567v0.135l1.118,0.04v-0.135L350.41,62.567z
    M351.898,60.655c-0.242,0-0.433,0.059-0.572,0.175c-0.089,0.104-0.179,0.207-0.269,0.311l-0.014-0.445l-0.875-0.013v0.31
        l0.458,0.067l-0.041,1.421l0.498,0.006l0.014-0.754c0.197-0.449,0.438-0.673,0.72-0.673c0.193,0,0.29,0.11,0.29,0.329
        c0,0.108-0.025,0.223-0.074,0.344l0.316,0.081c0.085-0.148,0.128-0.312,0.128-0.491c0-0.185-0.047-0.34-0.142-0.465
        C352.228,60.723,352.082,60.655,351.898,60.655z
      M350.572,62.816l-0.027,0.922l0.525-0.08l0.006-0.835L350.572,62.816z"/>
</g>
[...]
</svg>
SVG

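# Parse the document and fetch the root element's content twine,
# which alternates text strings with child element objects.
my $root = xml10_read_document($svg);
my $contents = xml_e_content_twine($root);
# Only elements are references; default a missing id to '' to avoid an undef warning.
my $g = first { ref $_ and xml_e_type_name($_) eq 'g' and (xml_e_attribute($_, 'id') // '') eq 'something' } @$contents;
my $g_contents = xml_e_content_twine($g);
my $path = first { ref $_ and xml_e_type_name($_) eq 'path' } @$g_contents;
my $sausage = xml_e_attribute($path, 'd');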
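The lookup above only inspects direct children of the root and stops at the first path. If the paths may be nested deeper, or there are several of them, a recursive walk over the content twines might look like the sketch below. The helper name `collect_path_data` and the sample document are hypothetical, made up for illustration:

```perl
use strict;
use warnings;
use XML::Easy::Text 'xml10_read_document';
use XML::Easy::NodeBasics qw(xml_e_content_twine xml_e_type_name xml_e_attribute);

# Hypothetical helper: depth-first walk that collects the d attribute
# of every <path> element at or below the given element.
sub collect_path_data {
    my ($element) = @_;
    my @found;
    if (xml_e_type_name($element) eq 'path') {
        my $d = xml_e_attribute($element, 'd');
        push @found, $d if defined $d;
    }
    # A twine alternates text strings and element objects; only the
    # elements are references, so ref() filters out the text.
    for my $child (grep { ref } @{ xml_e_content_twine($element) }) {
        push @found, collect_path_data($child);
    }
    return @found;
}

# Hypothetical document with one path nested a level deeper than the other.
my $svg = <<'SVG';
<svg><g id="outer"><path d="M1,1l2,2z"/><g><path d="M3,3l4,4z"/></g></g></svg>
SVG

my @sausages = collect_path_data(xml10_read_document($svg));
print "$_\n" for @sausages;
```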

Hardly seems "easy" to me. I'd recommend any of the XPath or CSS equipped parsers instead.
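For comparison, here is a rough sketch of the XPath route using XML::LibXML. It assumes the file declares the standard SVG namespace (the sample in the question does not, in which case a plain `//path/@d` on the document should be enough). The prefix `svg` and the two-path document are arbitrary choices for illustration:

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical document; assumes the usual SVG namespace declaration.
my $svg = <<'SVG';
<svg xmlns="http://www.w3.org/2000/svg">
<g id="something">
<path d="M1,1l2,2z"/>
<path d="M3,3l4,4z"/>
</g>
</svg>
SVG

my $doc = XML::LibXML->load_xml(string => $svg);

# XPath has no notion of a default namespace, so bind a prefix to the
# SVG namespace and use it in the expression.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(svg => 'http://www.w3.org/2000/svg');

# Select the d attribute nodes directly and read their string values.
my @sausages = map { $_->value }
    $xpc->findnodes('//svg:g[@id="something"]/svg:path/@d');

print "$_\n" for @sausages;
```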

Grinnz
0

Since you don't remember the reason you chose XML::Easy, consider using XML::Simple instead :-) :

use strict;
use warnings;
use XML::Simple qw( XMLin ) ;

my $svg = q{
    <svg width="190" height="160" xmlns="http://www.w3.org/2000/svg">
      <path d="M10 10 C 20 20, 40 20, 50 10" stroke="black" fill="transparent"/>
      <path d="M70 10 C 70 20, 120 20, 120 10" stroke="black" fill="transparent"/>
      <path d="M70 10 C 70 20, 120 20, 120 10" stroke="black" fill="transparent"/>
    </svg>
};

my $decoded = XMLin( $svg, ForceArray => 1 ) ;

foreach my $path ( @{ $decoded->{ path } } ){
    my $data = $path->{d} ;
    my $stroke = $path->{stroke} ;
    my $fill = $path->{ fill } ;
    ...
}

Where $decoded->{ path } is:

 [
     {
         d      => "M10 10 C 20 20, 40 20, 50 10",
         fill   => "transparent",
         stroke => "black"
     },
     {
         d      => "M70 10 C 70 20, 120 20, 120 10",
         fill   => "transparent",
         stroke => "black"
     },
     {
         d      => "M70 10 C 70 20, 120 20, 120 10",
         fill   => "transparent",
         stroke => "black"
     }
 ]

BEWARE: As pointed out by Grinnz, don't use XML::Simple for anything more complex than a trivial SVG.

Hannibal
  • Please don't suggest XML::Simple for any reason. See [the answers here for why](https://stackoverflow.com/questions/33267765/why-is-xmlsimple-discouraged). – Grinnz Jan 31 '19 at 18:21
  • @Grinnz Really? For a simple SVG? Not one of the reasons in the linked post applies to this very trivial case. Not one. – Hannibal Jan 31 '19 at 18:46
  • Yes, because every other option (aside from XML::Easy, apparently) is simpler and won't break when trying to do something more complex. – Grinnz Jan 31 '19 at 18:53
  • You're right as always, of course... but just to philosophize a little: isn't that like saying you should use a chainsaw to cut bread, because one day you might have to cut something harder? :-) – Hannibal Jan 31 '19 at 19:07
  • In my opinion XML::Simple is the chainsaw, but it's the one without a handle. ;) – Grinnz Jan 31 '19 at 19:14
  • But what if I have multiple paths? – Mark Gardner Feb 01 '19 at 12:23