In your environment you can use XSLT 1.0 to transform the document and generate IDs during the process. See: DBMS_XSLPROCESSOR.
With an XSLT stylesheet you can copy the nodes from your XML source to a result tree, creating unique IDs in the process. The IDs will not be sequential numbers, but unique strings generated by the generate-id() function. You can't control what they look like, but they are guaranteed to be unique within a single transformation. (XSLT also lets you get rid of duplicate nodes using a key, if that's your intention, but from your example I understood that a duplicate ID doesn't actually mean the node is a duplicate, since you want to generate a new ID for it.)
The stylesheet below has two templates. The second one is an identity transform: it simply copies nodes and attributes to the result tree unchanged. The first template matches book elements and creates an attribute named id containing a unique ID.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes"/>

    <xsl:template match="book">
        <xsl:copy>
            <xsl:attribute name="id">
                <xsl:value-of select="generate-id(.)"/>
            </xsl:attribute>
            <xsl:apply-templates select="node()|@*[name() != 'id']"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
Inside the book template, <xsl:apply-templates ...> processes all child nodes and all attributes except the existing id attribute, handing them to the other templates (in this case only the identity template). The result is a copy of your original XML file with generated unique IDs for the book elements.
If you had an XML document such as this one:
<bookstore>
    <books>
        <book id="1" other="123"/>
        <book id="2"/>
        <book id="2"/>
        <book id="3">
            <chapter number="123" id="ch1">Text</chapter>
        </book>
        <book id="10"/>
    </books>
    <magazines>
        <mag id="non-book-id"></mag>
    </magazines>
</bookstore>
the XSLT above would transform it into this XML:
<bookstore>
    <books>
        <book id="d2e3" other="123"/>
        <book id="d2e4"/>
        <book id="d2e5"/>
        <book id="d2e6">
            <chapter number="123" id="ch1">Text</chapter>
        </book>
        <book id="d2e9"/>
    </books>
    <magazines>
        <mag id="non-book-id"/>
    </magazines>
</bookstore>
(the string sequences are arbitrary, and might be different in your implementation).
For creating ID/IDREF links, the generated strings are better than numbers since you can use them anywhere (numbers and identifiers that start with a digit are not valid XML ID values). But if strings are not acceptable and you need sequential numbers, you can use the XPath position() function in XQuery or XSLT. It returns the position of the element in the list of nodes currently being processed, not its position in the whole document, so the result is only unique within that list. If all books are siblings in the same context and their parent contains nothing else, you can simply replace generate-id(.) in the stylesheet above with position():
<xsl:template match="book">
    <xsl:copy>
        <xsl:attribute name="id">
            <xsl:value-of select="position()"/>
        </xsl:attribute>
        <xsl:apply-templates select="node()|@*[name() != 'id']"/>
    </xsl:copy>
</xsl:template>
(if the books are not siblings, you will need to do it in a slightly different way, using a variable).
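For the non-sibling case, one option that avoids a variable altogether is xsl:number with level="any". This is a different technique from the variable-based one mentioned above, offered only as a minimal sketch: it assumes you want a single sequence running across every book in the document, and it is meant to replace the book template in the stylesheet.

```xml
<xsl:template match="book">
    <xsl:copy>
        <xsl:attribute name="id">
            <!-- level="any" counts every book in the document, at any depth,
                 up to and including the current one, in document order -->
            <xsl:number level="any" count="book"/>
        </xsl:attribute>
        <!-- copy everything except the old id attribute -->
        <xsl:apply-templates select="node()|@*[name() != 'id']"/>
    </xsl:copy>
</xsl:template>
```

Unlike position(), this does not depend on which nodes happen to be processed alongside the book, so comments or other elements between books do not shift the numbering.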
If you want to retain the existing IDs and only generate sequential ones for the duplicates, it will be a bit more complicated, but you can achieve that with keys (or with XQuery instead of XSLT). The maximum id can be obtained in XPath 2.0 using the max() function:
max(//book/@id)
That function does not exist in XPath 1.0, but you can obtain the maximum ID by using:
//book[not(@id < //book/@id)]/@id
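Putting the key and the XPath 1.0 maximum together, the "keep existing IDs, renumber only the duplicates" idea might look roughly like the sketch below. It rests on several assumptions: all book ids are numeric, book elements are not nested inside one another, and the names book-by-id and max-id are made up for this illustration. Note that inside an XSLT attribute the < in the expression has to be written as &lt;.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:strip-space elements="*"/>
    <xsl:output indent="yes"/>

    <!-- Index books by their current id (hypothetical key name) -->
    <xsl:key name="book-by-id" match="book" use="@id"/>

    <!-- Largest existing id, using the XPath 1.0 expression above;
         assumes every book id is numeric -->
    <xsl:variable name="max-id" select="//book[not(@id &lt; //book/@id)]/@id"/>

    <!-- Matches only books that are NOT the first one carrying their id -->
    <xsl:template match="book[generate-id() != generate-id(key('book-by-id', @id)[1])]">
        <xsl:copy>
            <xsl:attribute name="id">
                <!-- max + 1 + number of earlier duplicates;
                     preceding:: skips ancestors, hence the no-nesting assumption -->
                <xsl:value-of select="$max-id + 1 + count(preceding::book[generate-id() != generate-id(key('book-by-id', @id)[1])])"/>
            </xsl:attribute>
            <xsl:apply-templates select="node()|@*[name() != 'id']"/>
        </xsl:copy>
    </xsl:template>

    <!-- Identity template: everything else, including first occurrences, is copied as-is -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With the sample document above, the first book with id 2 should keep its ID, and the second one should receive 11, since the largest existing ID is 10 and no duplicate precedes it.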