
I am new to transformations between different formats. My goal is to convert a notation from a toolkit, which is in a plain-text format, to SVG. As a simple example, take an orange ellipse, for which the notation looks like this (x and y are coordinates, so 0 and 0 mean the ellipse is in the middle):

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt

and my desired output would be SVG code something like this (the cx and cy coordinates are chosen arbitrarily for the example):

<svg width="400" height="400" xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg">
 <g>
  <ellipse fill="#ff7f00" stroke="#000000" stroke-width="2" stroke-dasharray="null" stroke-linejoin="null" stroke-linecap="null" cx="250" cy="250" id="svg_1" rx="114" ry="70"/>
 </g>
</svg>

I found these two threads, "Parse text file with XSLT" and "XSL transform on text to XML with unparsed-text: need more depth", where plain text is transformed to XML with XSLT 2.0, the unparsed-text() function and regex. In my example, how would I extract the commands such as ELLIPSE (is a regex that recognizes all-uppercase words possible?) and the parameters (can they be retrieved from plain text with XPath somehow?)? Is a good implementation doable in XSLT 2.0, or should I look for another method? Any help would be appreciated!

Randy Random

1 Answer

Below is an example of how you can load the text file using unparsed-text(), parse the content using xsl:analyze-string to produce an intermediate XML document, and then transform that XML using a "push"-style stylesheet.

It shows how to support ELLIPSE, CIRCLE and RECTANGLE text conversion. You may need to customize it a bit, but it should give you an idea of what is possible. With the addition of regex support and unparsed-text(), XSLT 2.0 and 3.0 make all sorts of text transformations possible that would have been extremely cumbersome or difficult in XSLT 1.0.
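
To make the core step concrete before the full stylesheet, here is a minimal sketch of turning a single line of the notation into an element with attributes. It is an illustration only: the line is inlined in a variable instead of being read with unparsed-text(), the template name `main` is an arbitrary choice, and the regexes are simplified relative to the ones used in the stylesheet further down.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>

    <!-- One line of the notation, inlined so the sketch needs no external file -->
    <xsl:variable name="line" select="'ELLIPSE x:0pt y:0pt rx:114pt ry:70pt'"/>

    <!-- "main" is an assumed name; the template can also be triggered by any source document -->
    <xsl:template name="main" match="/">
        <!-- group 1 = the upper-case command, group 2 = the rest of the line -->
        <xsl:analyze-string select="$line" regex="^([A-Z]+)\s+(.*)$">
            <xsl:matching-substring>
                <xsl:element name="{regex-group(1)}">
                    <!-- each name:value pair in group 2 becomes an attribute -->
                    <xsl:analyze-string select="regex-group(2)" regex="(\S+):(\S+)">
                        <xsl:matching-substring>
                            <xsl:attribute name="{regex-group(1)}" select="regex-group(2)"/>
                        </xsl:matching-substring>
                    </xsl:analyze-string>
                </xsl:element>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

So, to the two questions: yes, a regex such as [A-Z]+ can pick out the all-uppercase commands, and the parameters are reachable with XPath once xsl:analyze-string has turned them into attributes. The units are left untouched here; the stylesheet below strips them with translate().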

With a file called "drawing.txt" with the following content:

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
ELLIPSE x:0pt y:0pt rx:114pt ry:70pt

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
CIRCLE x:0pt y:0pt rx:114pt ry:70pt

GRAPHREP
PEN color:$000000 w:2pt
FILL color:$ff7f00
RECTANGLE x:0pt y:0pt width:114pt height:70pt

Executing the following XSLT in the same directory:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="local"
    exclude-result-prefixes="xs"
    version="2.0"
    xmlns:svg="http://www.w3.org/2000/svg">
    <xsl:output indent="yes"/>

    <!--matches sequences of UPPER-CASE letters -->
    <xsl:variable name="label-pattern" select="'[A-Z]+'"/>
    <!--matches the "attributes" in the line i.e. w:2pt,
        has two capture groups (1) => attribute name, (2) => attribute value -->
    <xsl:variable name="attribute-pattern" select="'\s?(\S+):(\S+)'"/> 
    <!--matches a line of data for the drawing text, 
        has two capture groups (1) => label, (2) attribute data-->
    <xsl:variable name="line-pattern" select="concat('(', $label-pattern, ')\s(.*)\n?')"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="/">
        <svg width="400" height="400">
            <g>
                <!-- Find the text patterns indicating the shape -->
                <xsl:analyze-string select="unparsed-text('drawing.txt')"
                    regex="{concat('(', $label-pattern, ')\n((', $line-pattern, ')+)\n?')}">
                    <xsl:matching-substring>
                        <!--Convert text to XML -->
                        <xsl:variable name="drawing-markup" as="element()">
                            <!--Create an element for this group, using first matched pattern as the element name 
                                (i.e. GRAPHREP => <GRAPHREP>) -->
                            <xsl:element name="{regex-group(1)}">
                                <!--split the second matched group for this shape into lines by breaking on newline-->
                                <xsl:variable name="lines" select="tokenize(regex-group(2), '\n')"/>
                                <xsl:for-each select="$lines">
                                    <!--for each line, run through this process to create an element with attributes
                                        (e.g. FILL color:$ff7f00 => <FILL color="#ff7f00"/>)
                                    -->
                                    <xsl:analyze-string select="." regex="{$line-pattern}">
                                        <xsl:matching-substring>
                                            <!--create an element using the UPPER-CASE label starting the line -->
                                            <xsl:element name="{regex-group(1)}">
                                                <!-- capture each of the attributes -->
                                                <xsl:analyze-string select="regex-group(2)" regex="{$attribute-pattern}">
                                                  <xsl:matching-substring>
                                                  <!--convert foo:bar into attribute foo="bar", 
                                                            translate $ => # 
                                                            and remove the letters 'p' and 't' by translating them into nothing-->
                                                  <xsl:attribute name="{regex-group(1)}" select="translate(regex-group(2), '$pt', '#')"/>
                                                  </xsl:matching-substring>
                                                  <xsl:non-matching-substring/>
                                                </xsl:analyze-string>
                                            </xsl:element>
                                        </xsl:matching-substring>
                                        <xsl:non-matching-substring/>
                                    </xsl:analyze-string>
                                </xsl:for-each>
                            </xsl:element>
                        </xsl:variable>
                        <!--Uncomment the copy-of below if you want to see the intermediate XML $drawing-markup-->
                        <!--<xsl:copy-of select="$drawing-markup"/>-->

                        <!-- Transform XML into SVG -->
                        <xsl:apply-templates select="$drawing-markup"/>

                    </xsl:matching-substring>
                    <xsl:non-matching-substring/>
                </xsl:analyze-string>
            </g>
        </svg>
    </xsl:template>

    <!--==========================================-->
    <!-- Templates to convert the $drawing-markup -->
    <!--==========================================-->

    <!--for supported shapes, create the element using
        lower-case value, and change rectangle to rect
        for the svg element name-->
    <xsl:template match="GRAPHREP[ELLIPSE | CIRCLE | RECTANGLE]">
        <xsl:element name="{replace(lower-case(local-name(ELLIPSE | CIRCLE | RECTANGLE)), 'rectangle', 'rect', 'i')}">
            <xsl:attribute name="id" select="concat('id_', generate-id())"/>
            <xsl:apply-templates />
        </xsl:element>
    </xsl:template>

    <xsl:template match="ELLIPSE | CIRCLE | RECTANGLE"/>

    <!-- Just process the content of GRAPHREP.
        If there are multiple shapes and you want a new 
        <svg><g></g></svg> for each shape, 
        then move it from the template for "/" into this template-->
    <xsl:template match="GRAPHREP/*">
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <xsl:template match="PEN" priority="1">
        <!--TODO: test if these attributes exist, if they do, do not create these defaults.
            Hard-coding for now, to match desired output, since I don't know what the text
            attributes would be, but could wrap each with <xsl:if test="not(@dasharray)">-->
        <xsl:attribute name="stroke-dasharray" select="'null'"/>
        <xsl:attribute name="stroke-linejoin" select="'null'"/>
        <xsl:attribute name="stroke-linecap" select="'null'"/>
        <xsl:apply-templates select="@*"/>
    </xsl:template>

    <!-- converts @color => @stroke -->
    <xsl:template match="PEN/@color">
        <xsl:attribute name="stroke" select="."/>
    </xsl:template>

    <!--converts @w => @stroke-width -->
    <xsl:template match="PEN/@w">
        <xsl:attribute name="stroke-width" select="."/>
    </xsl:template>

    <!--converts @color => @fill and replaces $ with # -->
    <xsl:template match="FILL/@color">
        <xsl:attribute name="fill" select="translate(., '$', '#')"/>
    </xsl:template>

    <!-- converts @x => @cx with hard-coded values. 
        May want to use value from text, but matching your example-->
    <xsl:template match="ELLIPSE/@x | ELLIPSE/@y">
        <!--not sure if there was a relationship between ELLIPSE x:0pt y:0pt, and why 0pt would be 250, 
            but just an example...-->
        <xsl:attribute name="c{name()}" select="250"/>
    </xsl:template>

</xsl:stylesheet>

Produces the following SVG output:

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns:local="local"
     xmlns:svg="http://www.w3.org/2000/svg"
     width="400"
     height="400">
   <g>
      <ellipse id="id_d2e0"
               stroke-dasharray="null"
               stroke-linejoin="null"
               stroke-linecap="null"
               stroke="#000000"
               stroke-width="2"
               fill="#ff7f00"
               cx="250"
               cy="250"
               rx="114"
               ry="70"/>
      <circle id="id_d3e0"
              stroke-dasharray="null"
              stroke-linejoin="null"
              stroke-linecap="null"
              stroke="#000000"
              stroke-width="2"
              fill="#ff7f00"
              x="0"
              y="0"
              rx="114"
              ry="70"/>
      <rect id="id_d4e0"
            stroke-dasharray="null"
            stroke-linejoin="null"
            stroke-linecap="null"
            stroke="#000000"
            stroke-width="2"
            fill="#ff7f00"
            x="0"
            y="0"
            width="114"
            height="70"/>
   </g>
</svg>
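
The question says that x:0pt y:0pt means the ellipse sits in the middle. If that holds, the hard-coded 250 in the stylesheet could be replaced by an offset from the centre of the canvas. The following is a minimal sketch under that assumption: the `canvas` variable, the `main` template name and the inlined `sample` markup are all hypothetical, and the direction of the y axis in the toolkit is not known from the question, so the sign for y may need flipping.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>

    <!-- Assumed canvas size, matching width="400" height="400" on the svg element -->
    <xsl:variable name="canvas" select="400"/>

    <!-- Stand-in for the intermediate markup, with the pt units already stripped -->
    <xsl:variable name="sample">
        <ELLIPSE x="0" y="0" rx="114" ry="70"/>
    </xsl:variable>

    <xsl:template name="main" match="/">
        <ellipse>
            <xsl:apply-templates select="$sample/ELLIPSE/@*"/>
        </ellipse>
    </xsl:template>

    <!-- Assumption: x and y are offsets from the centre of the canvas -->
    <xsl:template match="ELLIPSE/@x | ELLIPSE/@y">
        <xsl:attribute name="c{name()}" select="$canvas div 2 + number(.)"/>
    </xsl:template>

    <!-- All other attributes (rx, ry) are copied unchanged -->
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>
</xsl:stylesheet>
```

With the sample values, the ellipse is placed at the centre of the 400 by 400 canvas (cx and cy of 200) rather than at 250.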
Mads Hansen
  • Wow. Thanks a lot for your help. Now I have an idea how the transformation works. I already gave up hope on XSLT for the task. I will look into it. Thanks! – Randy Random Jan 10 '16 at 17:40
  • I have updated the XSLT with some inline comments, to try to explain what is going on. It could probably be refactored a little, but I wasn't sure what other text descriptions you expect to handle or whether there would be multiple sets of shapes in the text input. I just wanted to show that it is possible to do in XSLT. There are lots of text matching and manipulation capabilities added in XSLT 2.0 and beyond that make this sort of work possible. – Mads Hansen Jan 10 '16 at 18:16
  • Thanks so much for the update. I have one more question. I used the same drawing.txt and the XSL, but when I tried to run the transformation I got this error: SXXP0003: Error reported by XML parser: content is not allowed in Prolog. org.xml.sax.SAXParseException; systemId: file:/C:/Saxon/drawing.txt; lineNumber: 1; columnNumber: 1; content is not allowed in prolog. The drawing.txt is in UTF-8 without BOM; I also looked in a hex editor and there are no invisible chars at the beginning. Thanks again for your time, it helps me a lot to understand the transformations! – Randy Random Jan 10 '16 at 20:49
  • The text file is loaded with the unparsed-text() function within the XSLT. It cannot be used as the source to transform, since it is not a well-formed XML document. Use the XSLT as your input XML, since it is an XML file. That will allow you to execute the XSLT, and then the template for the root node loads the text file. – Mads Hansen Jan 10 '16 at 20:53
  • You could also make the file name a parameter passed into the XSLT – Mads Hansen Jan 10 '16 at 20:54
  • Oops, I totally overlooked that. Do I understand correctly that I run the transformation with the XSLT as both the source and the stylesheet? When I do this it runs perfectly fine, but the output is only the xml and svg headers and a . – Randy Random Jan 10 '16 at 21:19
  • Yes, but then make sure it is loading the text file. Either put it in the same directory as your XSLT, or adjust the path to the text file. – Mads Hansen Jan 10 '16 at 21:22
  • To debug, echo out the content of the text file in the root template prior to using `xsl:analyze-string` with `` – Mads Hansen Jan 10 '16 at 21:32
  • The files are in the same folder and it seems to find the files, I don't get any errors. It is weird to me that in the output file there isn't even the opening element. Edit: I just saw the new comment I will try that. – Randy Random Jan 10 '16 at 21:38
  • The regex may not be matching. Try with the sample input I had used, to verify. You may want to experiment with the regex patterns if it isn't working exactly with your doc. – Mads Hansen Jan 10 '16 at 22:09
  • I only tested the transformation with the drawing.txt you posted. When I debug it with the posted command, I get the same output file but with the drawing.txt content between elements. So it seems there is no problem there. I ran the transformation on my other computer with Linux, but I get the same result: the empty svg with one element. Thanks a lot for your patience, I really appreciate it. – Randy Random Jan 10 '16 at 22:43
  • If you are using the same input and XSLT, and you are seeing that txt file is being read, then...which processor and version are you using? I have been running this with Saxon 9.6 through oXygen. – Mads Hansen Jan 10 '16 at 23:03
  • I use SaxonHE 9.7 and just run it in the windows console with the command: java -jar saxon9he.jar transformation.xsl transformation.xsl -o:output.txt. – Randy Random Jan 10 '16 at 23:07
  • Here is an online working example (using a local string variable instead of `unparsed-text()`) http://xsltransform.net/gWvjQfE – Mads Hansen Jan 10 '16 at 23:11
  • Thanks for this example, I tested some transformations with this tool and it works really well. I will still try to get everything to run locally but meanwhile I will use this for testing. Thanks! – Randy Random Jan 11 '16 at 00:36
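
Two suggestions from the comments, passing the file name as a parameter and echoing the text file to check that it is being read, can be combined in one small stylesheet. This is a minimal sketch and not part of the original answer: the parameter name `input-file` and the template name `main` are hypothetical, and a relative file name is resolved against the location of the stylesheet.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Hypothetical parameter name; defaults to the file used in the answer -->
    <xsl:param name="input-file" select="'drawing.txt'"/>

    <!-- Named entry point, so no source XML document is needed -->
    <xsl:template name="main">
        <xsl:choose>
            <xsl:when test="unparsed-text-available($input-file)">
                <!-- Echo the raw text to confirm the file is found and readable -->
                <xsl:value-of select="unparsed-text($input-file)"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:message terminate="yes">
                    <xsl:text>Cannot read </xsl:text>
                    <xsl:value-of select="$input-file"/>
                </xsl:message>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

With Saxon on the command line, a named template can be selected with the -it option and a parameter passed as name=value, for example: java -jar saxon9he.jar -xsl:transformation.xsl -it:main input-file=drawing.txt -o:output.txt. Starting from a named template avoids having to supply the stylesheet itself as a dummy source document.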