0

I have XML form data coming in from a vendor app that allows me to select values for fields on the form by using this format.

<?xml version="1.0" encoding="UTF-8" standalone="no"?> 
<xsl:stylesheet version="1.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<xsl:output method="text" /><xsl:template match="/"><xsl:apply-templates/></xsl:template>
<xsl:template match="DataSet/diffgr:diffgram/Forms">

<xsl:for-each select="Form">
  <xsl:value-of select="FieldName1"/>
  <xsl:value-of select="FieldName2"/>
</xsl:for-each>

</xsl:template>
</xsl:stylesheet>

I can't see the XML, but I can surmise from this that all of the fields are siblings within the <form> and looks something like this:

<form>
<CosignerName1 value="John Doe"/>
<CosignerDate1 value="1/1/1970"/>
<CosignerTime1 value="8:46 PM"/>
<CosignerName2 value="Jane Smith"/>
<CosignerDate2 value="2/2/1972"/>
<CosignerTime2 value="11:46 AM"/>
...
<CosignerName12 value="Will Hunting"/>
<CosignerDate12 value="12/12/1982"/>
<CosignerTime12 value="1:00 AM"/>
</form>
<form>
<CosignerName1 value="Bill Thomas"/>
<CosignerDate1 value="5/5/2020"/>
<CosignerTime1 value="8:46 PM"/>
<CosignerName2 value="Bev Poole"/>
<CosignerDate2 value="6/6/2022"/>
<CosignerTime2 value="11:46 AM"/>
...
<CosignerName12 value="Bob Ross"/>
<CosignerDate12 value="12/12/1982"/>
<CosignerTime12 value="1:00 AM"/>
</form>

Known factors:

  • Each form has 12 cosigners, each with numbered field names i.e. CosignerName1, CosignerName2, etc.
  • Each named cosigner also has a CosignerDate and a CosignerTime field associated by number. i.e. CosignerName1 has CosignerDate1 and CosignerTime1 fields that, when combined, show when that user signed the form.
  • I am using the word 'idnum' in my code and in this question to refer to the number at the end of each label that shows they are part of a set i.e CosignerName1, CosignerDate1, and CosignerTime1 all share idnum=1

Goal: Identify and select only the most recent CosignerName, based on CosignerDate and CosignerTime.

To do this, I figure I need to:

  1. Find the most recent CosignerDate/CosignerTime combination
  2. Read the label of either of the identified fields to extract the idnum from the end
  3. Use the idnum to to identify the correct CosignerName and output it

I have some pieces of the solution, but no way to bring them together:

Create a Timestamp I can combine a set of CosignerDate[idnum] and CosignerTime[idnum] fields, then translate them into a timestamp. If I could create an array of these timestamps, then I could sort and find the most recent one, but as I understand it XSLT 1.0 doesn't have arrays. So I am not sure what to do with this.

<xsl:variable name="mm" select="substring-before(CosignerDate1,'/')" />
<xsl:variable name="mmyyyy" select="substring-after(CosignerDate1,'/')"/>
<xsl:variable name="dd" select="substring-before($mmyyyy,'/')" />
<xsl:variable name="yyyy" select="substring-after($mmyyyy,'/')" />
<xsl:variable name="ampm" select="substring-after(CosignerTime1,' ')" />
<xsl:variable name="time"> 
  <xsl:choose>
    <xsl:when test="$ampm = 'AM'">
      <xsl:value-of select="number(translate(substring-before(CosignerTime1,' '),':',''))" />
    </xsl:when>
    <xsl:when test="$ampm = 'PM'">
      <xsl:value-of select="number(translate(substring-before(CosignerTime1,' '),':',''))+number('1200')" />
    </xsl:when>
  </xsl:choose>
</xsl:variable>
<xsl:value-of select="concat('Timestamp:',$yyyy,$mm,$dd,$time)"/>

The code above outputs something like: Timestamp:202209191128

Get idnums I can loop through all of the Cosigners and extract the identifying number shared by their labels.

<xsl:for-each select="*[starts-with(local-name(), 'CosignerName')]">
  <xsl:variable name="idnum">
    <xsl:value-of select="translate(local-name(),'CosignerName','')" />
  </xsl:variable>
  <xsl:value-of select="$idnum"/>
<xsl:value-of select="','"/>
</xsl:for-each>

This code above outputs something like: 1,4,7,12

But, this is where I get lost.

Failed Attempt I tried something like the code below. The logic was to loop through each Cosigner name (12 iterations), get the idnum from the label, use that to get the CosignerDate and CosignerTime, create a timestamp, then sort the timestamps, if the timestamp we are looking at is the top one (most recent) then use the idnum to select the correct CosignerName and output it.

<?xml version="1.0" encoding="UTF-8" standalone="no"?> 
<xsl:stylesheet version="1.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<xsl:output method="text" /><xsl:template match="/"><xsl:apply-templates/></xsl:template>
<xsl:template match="DataSet/diffgr:diffgram/Forms">
<xsl:for-each select="Form">

<xsl:for-each select="*[starts-with(local-name(), 'CosignerName')]">
  <xsl:variable name="idnum">
    <xsl:value-of select="translate(local-name(),'CosignerName','')" />
  </xsl:variable>
  <xsl:variable name="mm" select="substring-before(concat(CosignerDate,$idnum),'/')" />
  <xsl:variable name="mmyyyy" select="substring-after(concat(CosignerDate,$idnum),'/')"/>
  <xsl:variable name="dd" select="substring-before($mmyyyy,'/')" />
  <xsl:variable name="yyyy" select="substring-after($mmyyyy,'/')" />
  <xsl:variable name="ampm" select="substring-after(concat(CosignerTime,$idnum),' ')" />
  <xsl:variable name="time"> 
    <xsl:choose>
      <xsl:when test="$ampm = 'AM'">
        <xsl:value-of select="number(translate(substring-before(concat(CosignerTime,$idnum),' '),':',''))" />
      </xsl:when>
      <xsl:when test="$ampm = 'PM'">
        <xsl:value-of select="number(translate(substring-before(concat(CosignerTime,$idnum),' '),':',''))+number('1200')" />
      </xsl:when>
    </xsl:choose>
  </xsl:variable>
  <xsl:variable name="timestamp" select="concat('Timestamp: ',$yyyy,$mm,$dd,$time)"/>
  <xsl:sort select="$timestamp" data-type="number" order="descending"/>
  <xsl:if test="position() = 1"><xsl:value-of select="."/></xsl:if>
</xsl:for-each>

<xsl:text>&#xD;&#xA;</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

This of course fails for several reasons:

  1. the concat with the $idnum doesn't actually pull values from fields
  2. the sort has to be the first thing in the for-each
  3. the sort has to happen after all of the timestamps are created
  4. the sorted list of timestamps still needs to reference which idnum they were generated from

Without being able to create an array of timestamps and idnums, I don't know how you can accomplish this.

Does anyone know how can I get at the CosignerName[idnum] that is correlated with the most recent CosignerDate[idnum]/CosignerTime[idnum]?

Jeramy
  • 450
  • 1
  • 5
  • 19
  • See if this helps: https://stackoverflow.com/a/59288030/3016153. If not, please edit your question and add an example of the input - see: [mcve]. – michael.hor257k Sep 28 '22 at 22:51
  • And also: https://stackoverflow.com/a/30631073/3016153 – michael.hor257k Sep 28 '22 at 23:41
  • Thanks. I have read through those, but they really don't address the issues I am facing. I have updated my question to include more detail, and hopefully to add clarity to the problem I am trying to address. Essentially I am trying to identify 1 name per form that has a label containing the same number as the pair of name and date fields that containing the most recent date. – Jeramy Sep 29 '22 at 16:27
  • "*I can't see the XML*" Why not? What happens if you apply a stylesheet with only the identity transform template? It would also be useful to identify the processor - see here how: https://stackoverflow.com/a/25245033/3016153 – michael.hor257k Sep 29 '22 at 17:01
  • Microsft XSL version 1.0 – Jeramy Sep 29 '22 at 18:17

1 Answers1

1

I believe you could just sort the CosignerNameX nodes by the individual components of their associated dates and times, as shown in the two other answers I linked to:

However, it might be more efficient - and certainly more readable - to do this in two steps:

  1. Construct a dateTime value for each cosigner;
  2. Find the cosigner with the most recent value.

Consider the following example:

XML

<DataSet xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
    <diffgr:diffgram>
        <Forms>
            <form>
                <CosignerName1 value="John Doe"/>
                <CosignerDate1 value="1/1/1970"/>
                <CosignerTime1 value="8:46 PM"/>
                <CosignerName2 value="Jane Smith"/>
                <CosignerDate2 value="2/2/1972"/>
                <CosignerTime2 value="11:46 AM"/>
                <CosignerName12 value="Will Hunting"/>
                <CosignerDate12 value="12/12/1982"/>
                <CosignerTime12 value="1:00 AM"/>
            </form>
            <form>
                <CosignerName1 value="Bill Thomas"/>
                <CosignerDate1 value="5/5/2020"/>
                <CosignerTime1 value="8:46 PM"/>
                <CosignerName2 value="Bev Poole"/>
                <CosignerDate2 value="6/6/2022"/>
                <CosignerTime2 value="11:46 AM"/>
                <CosignerName12 value="Bob Ross"/>
                <CosignerDate12 value="12/12/1982"/>
                <CosignerTime12 value="1:00 PM"/>
            </form>
        </Forms>
    </diffgr:diffgram>
</DataSet>

XSLT 1.0 + EXSLT node-set()

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="diffgr exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/DataSet">
    <output>
        <xsl:for-each select="diffgr:diffgram/Forms/form">
            <xsl:variable name="datetimes">
                <xsl:for-each select="*[starts-with(local-name(), 'CosignerName')]">
                    <!-- identify values -->
                    <xsl:variable name="i" select="substring-after(local-name(), 'CosignerName')" />
                    <xsl:variable name="date" select="../*[local-name()=concat('CosignerDate', $i)]/@value" />
                    <xsl:variable name="time" select="../*[local-name()=concat('CosignerTime', $i)]/@value" />
                    <!-- extract date components -->
                    <xsl:variable name="year" select="substring-after(substring-after($date, '/'), '/')" />
                    <xsl:variable name="month" select="substring-before($date, '/')" />
                    <xsl:variable name="day" select="substring-before(substring-after($date, '/'), '/')" />
                    <!-- extract time components -->
                    <xsl:variable name="hour12" select="substring-before($time, ':')" />
                    <xsl:variable name="minute" select="substring-before(substring-after($time, ':'), ' ')" />
                    <xsl:variable name="pm" select="contains($time,'PM')" />
                    <xsl:variable name="hour" select="$hour12 mod 12 + 12*$pm"/>
                    <!-- construct dateTime  -->
                    <datetime cosigner="{@value}" index="{$i}">
                        <xsl:value-of select="$year"/>
                        <xsl:value-of select="format-number($month, '-00')"/>
                        <xsl:value-of select="format-number($day, '-00')"/>
                        <xsl:value-of select="format-number($hour, 'T00')"/>
                        <xsl:value-of select="format-number($minute, ':00')"/>
                        <xsl:text>:00</xsl:text>
                    </datetime>
                </xsl:for-each>
            </xsl:variable>
            <!-- output -->
            <form>
                <xsl:for-each select="exsl:node-set($datetimes)/datetime">
                    <xsl:sort select="." data-type="text" order="descending"/>
                    <xsl:if test="position()=1">
                        <last-cosigner index="{@index}" dateTime="{.}" >
                            <xsl:value-of select="@cosigner"/>
                        </last-cosigner>
                    </xsl:if>
                </xsl:for-each>
            </form>
        </xsl:for-each>
    </output>
</xsl:template>

</xsl:stylesheet>

Result

<?xml version="1.0" encoding="UTF-8"?>
<output>
   <form>
      <last-cosigner index="12" dateTime="1982-12-12T01:00:00">Will Hunting</last-cosigner>
   </form>
   <form>
      <last-cosigner index="2" dateTime="2022-06-06T11:46:00">Bev Poole</last-cosigner>
   </form>
</output>

P.S. With a Microsoft processor you may need to change the namespace declaration:

xmlns:exsl="http://exslt.org/common"

to:

xmlns:msxsl="urn:schemas-microsoft-com:xslt"

and then change the exsl prefix in lines #5 and #39 to msxsl.

michael.hor257k
  • 113,275
  • 6
  • 33
  • 51
  • This looks promising. I see what I was missing to get values in the ```local-name()=concat('CosignerDate', $i)``` and it looks like you are building key/value pairs and adding them to the $datetimes variable. I am a little unclear on how the `````` part works but it looks like a property of $datetimes? My output is a text index file, so I don't actually need the xml structure, just csv. Do I still need to use node-set()? I will give it a go. Thanks! This is a huge help. – Jeramy Sep 29 '22 at 18:39
  • Running it I get 'datetimes' cannot be a child of the 'xsl:for-each' element. – Jeramy Sep 29 '22 at 18:52
  • fwiw the vendor-provided header for the xlst is: ``` ``` – Jeramy Sep 29 '22 at 18:55
  • You do need the `node-set()` extension function in order to process (i.e. sort) the contents of the `$datetimes` variable. --- I am afraid I cannot help you further because I do not see the entire picture. The code above works in every XSLT 1.0 processor I know (you can see it working with a Microsoft processor here: https://xsltfiddle.liberty-development.net/jypqQ9c). I do not understand the error message you get and I don't know what you mean by "vendor-provided header for the xslt". – michael.hor257k Sep 29 '22 at 20:22
  • Thanks for your help. This has me pointed in the right direction. The comment about the header was just that the vendor provided us with a template xslt sheet containing the structure up to the `````` so I have only ever worked within that loop. Things like the xmlns:msxsl settings were provided. I also updated my example in the question to show these settings in case they are important. Thank you! – Jeramy Sep 29 '22 at 20:26
  • Okay, Thanks to your working example I think I am seeing the problem, your header contains ```xmlns:msxsl="urn:schemas-microsoft-com:xslt"``` vs mine which contains ```xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"``` My template also includes `````` which wont work with the header you are using, and I assume is required by the vendor. This extends outside of my original question, which you have answered. Thanks! – Jeramy Sep 29 '22 at 21:01
  • I am afraid you are mixing two separate issues. Your header declares the `diffgr` namespace because apparently a `diffgram` element in that namespace is the parent of `form`. You need to **keep** this namespace declaration in order to use any path that goes through this element. You need to **add** the `exsl` or the `msxsl` namespace declaration in order to use the `node-set()` extension function. – michael.hor257k Sep 29 '22 at 21:10
  • P.S. I have edited my example to be closer to what seems to be your real input (I would still prefer to see an actual example). – michael.hor257k Sep 29 '22 at 21:20
  • I went to the tool you were using and updated to what I am using https://xsltfiddle.liberty-development.net/jypqQ9c/1 The only difference in my actual environment is that instead of `````` I have `````` In the tool, this outputs exactly as I expect. In my environment, the output I get is ```
    ``` in a text file. I am assuming at this point it is some weirdness from my vendor messing up the results.
    – Jeramy Sep 29 '22 at 21:56
  • Again, that's something I cannot comment on because I know nothing about the tool you're using. In general, having a template matching `Forms` as the only template is considered bad practice, because then the ancestor nodes will be processed by the built-in templates that will copy all their text content to the output. – michael.hor257k Sep 29 '22 at 22:09