I am using Solr 8.8. I have XML data that I want to import into a Solr index. I am using the following command to import it:

curl "http://<host_name>:8983/solr/<core_name>/update?commit=true&tr=updateXSLDatasource1.xsl" -H "Content-Type: text/xml" --data-binary @data1.xml

Here is my data:

<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2021-06-29T14:39:46Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc" set="p15869coll19" from="2020-09-01" until="2020-09-30">http://digital.americanancestors.org/oai/oai.php</request>
<ListRecords>
    <record>
        <header>
            <identifier>oai:digital.americanancestors.org:p15869coll19/17</identifier>
            <datestamp>2020-09-21</datestamp>
            <setSpec>p15869coll19</setSpec>
        </header>
        <metadata>
            <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
                <dc:title>Papers of Abraham C. Ratshesky</dc:title>
                <dc:identifier>P-586</dc:identifier>
                <dc:relation>01</dc:relation>
                <dc:relation>01</dc:relation>
                <dc:subject>Halifax Explosion, Halifax, N.S. 1917; Massachusetts, General Court. Senate; American Red Cross. Boston Metropolitan Chapter</dc:subject>
                <dc:description>Photographs; Morse Family</dc:description>
                <dc:rights>Open access</dc:rights>
                <dc:rights>User has an obligation to determine copyright or other use restrictions prior to publication or distribution. Please contact the archives at jhcreference@nehgs.org or 617-226-1245 for more information.</dc:rights>
                <dc:language>English</dc:language>
                <dc:source>Wyner Family Jewish Heritage Center, New England Historic Genealogical Society</dc:source>
                <dc:identifier>http://digital.americanancestors.org/cdm/ref/collection/p15869coll19/id/17</dc:identifier>
            </oai_dc:dc>
        </metadata>
    </record>
</ListRecords>
</OAI-PMH>

Here is my XSLT file (generated with the Saxon 9.5.1.6 HE engine):

<?xml version="1.0" encoding="UTF-8" ?>
 <xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0"
xpath-default-namespace="http://www.openarchives.org/OAI/2.0/"
xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
exclude-result-prefixes="oai_dc dc">

<xsl:output method="xml" doctype-public="XSLT-compat" omit-xml-declaration="yes" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <xsl:element name="add">
        <xsl:apply-templates select="OAI-PMH/ListRecords/record"/>
    </xsl:element>
</xsl:template>

<xsl:template match="OAI-PMH/ListRecords/record">
    <xsl:element name="doc">
        <xsl:element name="field">
            <xsl:attribute name="name">id</xsl:attribute>
            <xsl:value-of select="header/identifier"/>
        </xsl:element>
        <xsl:element name="field">
            <xsl:attribute name="name">tcngrams_title</xsl:attribute>
            <xsl:value-of select="metadata/oai_dc:dc/dc:title"/>
        </xsl:element>
        <!--  <field name="tcngrams_content">
              <xsl:variable name="description" select="metadata/oai_dc:dc/dc:description"/>
              <xsl:variable name="subject" select="metadata/oai_dc:dc/dc:subject"/>
              <xsl:variable name="coverage" select="metadata/oai_dc:dc/dc:coverage"/>
              <xsl:value-of
                      select="string-join(($description,$subject,$coverage),' ')"/>
          </field>-->

        <xsl:for-each select="metadata/oai_dc:dc/dc:identifier">
            <xsl:choose>
                <xsl:when test="starts-with(., 'http://')">
                    <xsl:element name="field">
                        <xsl:attribute name="name">sm_url</xsl:attribute>
                        <xsl:value-of select="replace(., 'http://', 'https://')"/>
                    </xsl:element>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:if test="starts-with(., 'https://')">
                        <xsl:element name="field">
                            <xsl:attribute name="name">sm_url</xsl:attribute>
                            <xsl:value-of select="."/>
                        </xsl:element>
                    </xsl:if>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each>
        <xsl:element name="field">
            <xsl:attribute name="name">ds_date_created</xsl:attribute>
            <xsl:value-of select="concat(header/datestamp,'T00:00:00Z')"/>
        </xsl:element>
        <xsl:element name="field">
            <xsl:attribute name="name">ds_date_updated</xsl:attribute>
            <xsl:value-of select="concat(header/datestamp,'T00:00:00Z')"/>
        </xsl:element>
        <xsl:if test="metadata/oai_dc:dc/dc:type">
            <xsl:element name="field">
                <xsl:attribute name="name">ss_category</xsl:attribute>
                <xsl:value-of select="metadata/oai_dc:dc/dc:type"/>
            </xsl:element>
        </xsl:if>
        <xsl:element name="field">
            <xsl:attribute name="name">ss_topic</xsl:attribute>
            <xsl:for-each select="metadata/oai_dc:dc/dc:subject">
                <xsl:variable name="subject" select="normalize-space(substring-before(.,'--'))"/>

                <xsl:choose>
                    <xsl:when test="$subject">
                        <xsl:value-of select="$subject"/>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="normalize-space(.)"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each>
        </xsl:element>
        <xsl:if test="metadata/oai_dc:dc/dc:creator">
            <xsl:element name="field">
                <xsl:attribute name="name">tcngrams_author_name</xsl:attribute>
                <xsl:value-of select="metadata/oai_dc:dc/dc:creator"/>
            </xsl:element>
        </xsl:if>
        <xsl:for-each select="metadata/oai_dc:dc/dc:source">
            <xsl:element name="field">
                <xsl:attribute name="name">sm_source</xsl:attribute>
                <xsl:value-of select="."/>
            </xsl:element>
        </xsl:for-each>
        <xsl:for-each select="metadata/oai_dc:dc/dc:relation">
            <xsl:element name="field">
                <xsl:attribute name="name">sm_source</xsl:attribute>
                <xsl:value-of select="."/>
            </xsl:element>
        </xsl:for-each>
    </xsl:element>
</xsl:template>
</xsl:stylesheet>

When I run the above command, I get this error message:

Caused by: javax.xml.transform.TransformerConfigurationException: line 8: Illegal attribute 'xpath-default-namespace'.
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.passWarningsToListener(TransformerFactoryImpl.java:841)
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:1037)
    ... 49 more

What is the reason for this error message?

As Vard
  • You are trying to run XSLT 2 code with Xalan, an XSLT 1 processor. – Martin Honnen Jun 30 '21 at 16:18
  • Do you mean that Solr uses Xalan instead of Saxon? – As Vard Jun 30 '21 at 16:20
  • The built-in XSLT processor/Transformer in Java is usually Xalan; in your case you see `java.xml/com.sun.org.apache.xalan.` in your stack trace, indicating that Xalan is used to execute your XSLT. In the Java world it is usually sufficient to put Saxon 9 or 10 on the classpath to get XSLT 2 or 3 support; I don't know how easy that is in the context of Solr. – Martin Honnen Jun 30 '21 at 16:24
  • What do you think about this? https://cwiki.apache.org/confluence/display/SOLR/XsltResponseWriter – As Vard Jun 30 '21 at 16:27
  • Well, it has a section "Using Saxon for XSLT" that hopefully helps you set up Solr to use Saxon. Even if that section talks about Saxon B, the main approach should work with Saxon HE 9.9 or 10, the current versions of Saxon, as well. – Martin Honnen Jun 30 '21 at 16:34
  • So this is the right way to solve my problem? – As Vard Jun 30 '21 at 16:36
  • @AsVard If you want, you could "downgrade" your stylesheet to XSLT 1.0 by declaring a prefix instead of a default namespace - see: https://stackoverflow.com/a/34762628/3016153 – michael.hor257k Jun 30 '21 at 16:39
  • BTW, you could shorten it significantly by using literal result elements and attribute value templates. – michael.hor257k Jun 30 '21 at 16:41
  • If you want to use the stylesheet you have posted, you need to use an XSLT 2 processor; with Java that is usually done by putting the selected Saxon 9 or 10 HE jar on the classpath. – Martin Honnen Jun 30 '21 at 16:55
  • Thank you for the suggested solutions. I have solved this problem by adding Saxon 10 to Solr. – As Vard Jul 01 '21 at 11:59
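
For anyone who cannot add Saxon to Solr, the XSLT 1.0 "downgrade" suggested in the comments can be tried outside Solr first. The following is only a minimal sketch, not a drop-in replacement for the stylesheet above. It assumes the JVM's default TransformerFactory (normally the built-in Xalan), binds the OAI-PMH namespace to a prefix `oai` chosen here for illustration, uses literal result elements, and maps only the id and tcngrams_title fields. The class name Xslt1PrefixDemo and the trimmed sample record are likewise made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: runs an XSLT 1.0 variant of the stylesheet
// with whatever XSLT processor the JVM selects by default.
class Xslt1PrefixDemo {

    // XSLT 1.0 stylesheet. The OAI-PMH namespace is bound to the prefix "oai"
    // (an illustrative choice) instead of using xpath-default-namespace,
    // which only exists in XSLT 2.0 and later.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
        "    xmlns:oai=\"http://www.openarchives.org/OAI/2.0/\"",
        "    xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\"",
        "    xmlns:dc=\"http://purl.org/dc/elements/1.1/\"",
        "    exclude-result-prefixes=\"oai oai_dc dc\">",
        "  <xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>",
        "  <xsl:template match=\"/\">",
        "    <add><xsl:apply-templates select=\"oai:OAI-PMH/oai:ListRecords/oai:record\"/></add>",
        "  </xsl:template>",
        "  <xsl:template match=\"oai:record\">",
        "    <doc>",
        "      <field name=\"id\"><xsl:value-of select=\"oai:header/oai:identifier\"/></field>",
        "      <field name=\"tcngrams_title\"><xsl:value-of select=\"oai:metadata/oai_dc:dc/dc:title\"/></field>",
        "    </doc>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Trimmed copy of the record from the question (only the fields used above).
    static final String SAMPLE = String.join("\n",
        "<OAI-PMH xmlns=\"http://www.openarchives.org/OAI/2.0/\">",
        "  <ListRecords>",
        "    <record>",
        "      <header>",
        "        <identifier>oai:digital.americanancestors.org:p15869coll19/17</identifier>",
        "      </header>",
        "      <metadata>",
        "        <oai_dc:dc xmlns:oai_dc=\"http://www.openarchives.org/OAI/2.0/oai_dc/\"",
        "                   xmlns:dc=\"http://purl.org/dc/elements/1.1/\">",
        "          <dc:title>Papers of Abraham C. Ratshesky</dc:title>",
        "        </oai_dc:dc>",
        "      </metadata>",
        "    </record>",
        "  </ListRecords>",
        "</OAI-PMH>");

    // Name of the TransformerFactory implementation the JVM picked.
    static String processorName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Applies the given stylesheet to the given XML and returns the result as text.
    static String transform(String xslt, String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("XSLT processor in use: " + processorName());
        System.out.println(transform(STYLESHEET, SAMPLE));
    }
}
```

If the output contains the two field elements, the same prefix approach can be applied to the remaining paths in the full stylesheet. Note that replace() and string-join() are XPath 2.0 functions, so those parts would still need XSLT 1.0 equivalents.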
