
I'm looking for a way to find a vendor SKU inside a product name. I need to match the value of <vendor-sku>HT900C</vendor-sku> from the retailer's feed against <vendor-product-name>Ventilateur TurboForce&#7481;&#7472; HT900C Honeywell</vendor-product-name> in the vendor's feed.

Vendor's Feed:

<?xml version="1.0" encoding="UTF-8"?>
<products module-id="kazfanscafr">
<product type="product" wcpc="1562772927361"><gtin>00092926109004</gtin><vendor-product-name>Ventilateur TurboForce&#7481;&#7472; HT900C Honeywell</vendor-product-name><provided-by>Kaz</provided-by></product>
<product type="product" wcpc="1562774715788"><gtin>00092926310905</gtin><vendor-product-name>Ventilateur Turbo&#7481;&#7472; On the GO! HTF090BC Honeywell</vendor-product-name><vendor-clean-product-name>Ventilateur Turbo&#7481;&#7472; On the GO HTF090BC Honeywell</vendor-clean-product-name><provided-by>Kaz</provided-by></product>
</products>

Retailer's feed:

<product><vendor>KAZ CANADA INC</vendor><vendor-sku>HT900C</vendor-sku><channel-product-name>Fan, High Performance, 8", Black</channel-product-name><channel-product-id>KAZHT900C</channel-product-id><on-sale>true</on-sale><product-url>https://www.eway.ca/Eway/Product/KAZHT900C.aspx</product-url></product>
<product><vendor>KAZ CANADA INC</vendor><vendor-sku>HTF090BC</vendor-sku><channel-product-name>Honeywell Turbo on the Go, portable fan</channel-product-name><channel-product-id>KAZHTF090BC</channel-product-id><on-sale>true</on-sale><product-url>https://www.eway.ca/Eway/Product/KAZHTF090BC.aspx</product-url></product>
<product><vendor>KAZ CANADA INC</vendor><vendor-sku>HTF1220C</vendor-sku><channel-product-name>HONEYWELL 12" Portable Table Fan</channel-product-name><channel-product-id>KAZHTF1220C</channel-product-id><on-sale>true</on-sale><product-url>https://www.eway.ca/Eway/Product/KAZHTF1220C.aspx</product-url></product>
<product><vendor>KAZ CANADA INC</vendor><vendor-sku>HTF210BC</vendor-sku><channel-product-name>Quietset table fan</channel-product-name><channel-product-id>KAZHTF210BC</channel-product-id><on-sale>true</on-sale><product-url>https://www.eway.ca/Eway/Product/KAZHTF210BC.aspx</product-url></product>

My task is to match products between these two feeds: the vendor's SKU/GTIN has to be matched to the SKU/GTIN of the product listed on the retailer's site/feed. I inject enriched content into the products, so these IDs act as the bridge between the two feeds. I'm asking for help in this case because the vendor's feed only carries the SKU inside the product name.

Usually, I can use my default operation to search for their IDs:

<xsl:call-template name="search-feeds-by-sku">
   <xsl:with-param name="vendor-data-feed-field-to-compare" select="'gtin'" wcmt:editorDisplay="hidden"/>
   <xsl:with-param name="product-data-feed-field-to-compare" select="'gtin'" wcmt:editorDisplay="hidden"/>
</xsl:call-template>

but in this instance I need a substring or regex operation to extract the SKU from the product name.

I have already tried different substring functions. I couldn't make substring-after and substring-before work because the format of the product name is inconsistent.

<method confidence="0.9" display-name="map-feed-by-name" xsi:type="map-by-virtual-feed"><product-data-matcher>/products/product[contains(vendor-sku, '{concat('vendor-product-name', " ")}')]</product-data-matcher>
            </method>

So I expect to find the vendor-sku (HT900C) inside the product-name since I concatenate by " " (whitespace).

Output should be:

Ventilateur

TurboForce&#7481;&#7472;

HT900C

Honeywell

At that point I should get a match on HT900C, but it returns nothing. I'm wondering whether I missed something or whether this whole approach is not recommended at all. I'm using XPath 1.0 and the processor is XSLT 2.0. Thanks in advance for your help!

Here's my current solution

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
   <xsl:import href="eway-fr-ca-fr/map-common.xml" xml:base="{$common-folder-uri}/"/>
   <xsl:template match="/">
      <map-operation xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" channel-id="eway-fr-ca-fr" module-id="kazfanscafr">
         <skip-if-no-new-channel-product-found ttl-hours="720"/>
         <allow-multiple-mappings/>
         <methods>
            <xsl:call-template name="search-feeds-by-sku"/>
            <xsl:call-template name="search-feeds-by-sku">
               <xsl:with-param name="vendor-data-feed-field-to-compare" select="'gtin'" wcmt:editorDisplay="hidden"/>
               <xsl:with-param name="product-data-feed-field-to-compare" select="'gtin'" wcmt:editorDisplay="hidden"/>
            </xsl:call-template>
            <method confidence="0.9" display-name="map-feed-by-name" xsi:type="map-by-virtual-feed">
               <product-data-matcher>/products/product[contains(vendor-sku, '{concat(vendor-product-name, " ")}')]</product-data-matcher>
            </method>
         </methods>
      </map-operation>
    </xsl:template>
</xsl:stylesheet>
  • Please say which version of XPath you are using. Many people are still using XPath 1.0, but this kind of problem is much easier with later versions. – Michael Kay Sep 03 '19 at 10:58
  • If the input format is inconsistent, it cannot be parsed reliably. In the given example, it seems you could simply *tokenize* the input using space as the delimiter. Tokenizing is done differently in XSLT 1.0 and XSLT 2.0 - please tell us which processor you're using. – michael.hor257k Sep 03 '19 at 11:00
  • Hi. I'm using XPath 1.0 and processor is XSLT 2.0 – Luigi Pamiloza Sep 03 '19 at 14:46
  • That makes no sense. If the processor supports XSLT 2.0, then it also supports XPath 2.0. Please identify your processor as explained here: https://stackoverflow.com/questions/25244370/how-can-i-check-which-xslt-processor-is-being-used-in-solr/25245033#25245033 – michael.hor257k Sep 04 '19 at 04:43
  • Apologies! I'm using a legacy system so I just had to ask the R&D team and that's what they have provided me. But upon checking: Version: 1.0 Vendor: libxslt – Luigi Pamiloza Sep 04 '19 at 07:16

2 Answers


For an exact solution, you should share the XML surrounding that vendor-product-name element.

If the XML is as below:

<vendor-sku>HT900C</vendor-sku>
 <vendor-product-name>Ventilateur TurboForce&#7481;&#7472; HT900C Honeywell</vendor-product-name>

Based on the data you have shared, I have created the XPath below for the case where <vendor-product-name> is a sibling, not a child:

//vendor-sku[contains(.,'HT900C')]//following-sibling::vendor-product-name

If <vendor-product-name> is a child

 //vendor-sku[contains(.,'HT900C')]//vendor-product-name

If <vendor-product-name> is a parent

//vendor-sku[contains(.,'HT900C')]//../self::vendor-product-name
Shubham Jain
  • Shubham, thank you for your solution. To me, this seems to be a more specific approach. What I'm trying to look for is the vendor-sku inside the vendor-product-name – Luigi Pamiloza Sep 04 '19 at 07:23
  • Answer updated. XPath: //vendor-sku[contains(.,'HT900C')]//../self::vendor-product-name – Shubham Jain Sep 04 '19 at 07:33
  • Shubham, I applied your algorithm on my operation: /products/product['{vendor-product-name}'[contains(., vendor-sku)]] I wouldn't want to set a specific string to search and map, i'm using the fields '<>' to create matches – Luigi Pamiloza Sep 04 '19 at 07:53

This part of your question is not quite clear:

I'm trying to find some ways to search a vendor-sku inside a product name.

If you have several vendor-product-name nodes, you can select the one that contains a known value as shown in the following example:

XML

<input>
    <vendor-product-name>Gadget Cornballer100 CBL0100 Acme</vendor-product-name>
    <vendor-product-name>Widget Sabor5000 SBRX5 Roxxon</vendor-product-name>
    <vendor-product-name>Ventilateur TurboForce&#7481;&#7472; HT900C Honeywell</vendor-product-name>
    <vendor-product-name>Thingy Opti-Grab OPG-45A Zaibatsu</vendor-product-name>
</input>

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:param name="sku">HT900C</xsl:param>

<xsl:template match="/input">
    <xsl:variable name="my-product" select="vendor-product-name[contains(concat(' ', ., ' '), concat(' ', $sku, ' '))]" />
    <xsl:value-of select="translate($my-product, ' ', '&#10;')"/>
</xsl:template>

</xsl:stylesheet>

Result

Ventilateur
TurboForceᴹᴰ
HT900C
Honeywell
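
The space padding in the predicate is what keeps a shorter code from matching inside a longer one: a plain contains(., 'HT900') would also select the HT900C product. Note also that contains() takes the string to search first and the string to look for second. Below is a minimal sketch of the difference, assuming the same input document as above and a hypothetical sku value of HT900 that does not exist in the feed; normalize-space() is added here only as a guard against tabs, line breaks or doubled spaces in the name:

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- hypothetical partial code, chosen to provoke a false positive -->
<xsl:param name="sku">HT900</xsl:param>

<xsl:template match="/input">
    <!-- plain substring test: true for any name that contains the characters of $sku -->
    <xsl:text>plain: </xsl:text>
    <xsl:value-of select="count(vendor-product-name[contains(., $sku)])"/>
    <xsl:text>&#10;</xsl:text>
    <!-- whole-token test: name and code are both padded with single spaces -->
    <xsl:text>padded: </xsl:text>
    <xsl:value-of select="count(vendor-product-name[contains(concat(' ', normalize-space(.), ' '), concat(' ', $sku, ' '))])"/>
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

With the sample input, the plain test should count one product (the HT900C name) and the padded test none.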

If you are using the libxslt processor, you can reduce the chances of getting a false positive by targeting specifically the 3rd token in vendor-product-name:

XSLT 1.0 + EXSLT str:tokenize() function

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:str="http://exslt.org/strings"
extension-element-prefixes="str">
<xsl:output method="text"/>

<xsl:param name="sku">HT900C</xsl:param>
<xsl:key name="product-by-sku" match="vendor-product-name" use="str:tokenize(., ' ')[3]" />

<xsl:template match="/input">
    <xsl:variable name="my-product" select="key('product-by-sku', $sku)" />
    <xsl:value-of select="translate($my-product, ' ', '&#10;')"/>
</xsl:template>

</xsl:stylesheet>
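
To apply the same whole-token test across two files, one option in XSLT 1.0 is to load the second file with document() and compare each vendor product name against every retailer SKU. This is only a rough sketch under several assumptions: the vendor's feed is the source document, the retailer's feed is a well-formed file with a single <products> root, its location is passed in a hypothetical retailer-feed parameter, and the <matches>/<match> output elements are made up for illustration. It is plain XSLT and does not use the product-data-matcher mechanism from the question.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<!-- hypothetical parameter: location of the retailer's feed;
     a relative path is resolved against the stylesheet -->
<xsl:param name="retailer-feed" select="'retailer-feed.xml'"/>

<xsl:template match="/products">
    <matches>
        <xsl:apply-templates select="product"/>
    </matches>
</xsl:template>

<xsl:template match="product">
    <!-- keep the vendor values before the context changes inside for-each -->
    <xsl:variable name="gtin" select="gtin"/>
    <!-- pad the normalized name so that only whole tokens can match -->
    <xsl:variable name="name" select="concat(' ', normalize-space(vendor-product-name), ' ')"/>
    <!-- inside the predicates the context node is a retailer product;
         empty SKUs are skipped so they cannot match by accident -->
    <xsl:for-each select="document($retailer-feed)/products/product[vendor-sku != ''][contains($name, concat(' ', vendor-sku, ' '))]">
        <match gtin="{$gtin}" vendor-sku="{vendor-sku}" channel-product-id="{channel-product-id}"/>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Every vendor product is compared against every retailer product, so the cost grows with the product of the two feed sizes; for large feeds a key on the tokenized name, as in the EXSLT example above, may be the better fit.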
michael.hor257k
  • We are almost there! So, I have two feeds with me: one is what we called the vendor's feed(Samsung, P&G, mattel, etc.) and the other one is the retailer's feed (Walmart, CVS, Target, Newegg) So my work is basically to find a match between these two feeds, I need to match the vendor's SKU/GTIN to the product posted on retailer's site/feed. I am injecting enriched content to the products, so to do that, I need to match these IDs between these two feeds as a channel or a bridge. But since on this case, I asked help because the SKU was inserted on the product name. – Luigi Pamiloza Sep 04 '19 at 10:16
  • Please don't post code in comments. Edit your question and add **all** the relevant information there - see: [mcve]. -- I have no idea what you mean by "feed". XSLT processes a single XML document. You can add more information by passing a parameter at runtime and/or you can point it to another XML document by using the `document()` function. – michael.hor257k Sep 04 '19 at 10:23
  • Hi Michael, thank you for your patience! I made some updates on my query. What I mean by feeds, they are .xml files that contain the information such as products of the vendor and retailer – Luigi Pamiloza Sep 04 '19 at 10:51
  • I am afraid I cannot follow all of that. – michael.hor257k Sep 04 '19 at 15:38
  • That is fine, Michael. Thank you very much for helping! :) – Luigi Pamiloza Sep 05 '19 at 07:47