
I'm having trouble figuring out how to properly convert a list of product data from XML into CSV format.

My source is an XML file containing a list of products with attributes like color, size, material etc. with the following structure:

<?xml version="1.0" encoding="utf-8" ?>
<store>
    <products>
        <product>
            <name>T-Shirt</name>
            <price>19.00</price>
            <attributes>
                <attribute>
                    <name>Color</name>
                    <options>
                        <option>
                            <name>White</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>Black</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>Blue</name>
                            <price>0.00</price>
                        </option>
                    </options>
                </attribute>
                <attribute>
                    <name>Size</name>
                    <options>
                        <option>
                            <name>XS</name>
                            <price>-5.00</price>
                        </option>
                        <option>
                            <name>S</name>
                            <price>-5.00</price>
                        </option>
                        <option>
                            <name>M</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>L</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>XL</name>
                            <price>5.00</price>
                        </option>
                    </options>
                </attribute>
            </attributes>
        </product>
        <product>
            <name>Sweatshirt</name>
            <price>49.00</price>
            <attributes>
                <attribute>
                    <name>Color</name>
                    <options>
                        <option>
                            <name>White</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>Black</name>
                            <price>0.00</price>
                        </option>
                    </options>
                </attribute>
                <attribute>
                    <name>Size</name>
                    <options>
                        <option>
                            <name>XS</name>
                            <price>-10.00</price>
                        </option>
                        <option>
                            <name>M</name>
                            <price>0.00</price>
                        </option>
                        <option>
                            <name>XL</name>
                            <price>10.00</price>
                        </option>
                    </options>
                </attribute>
                <attribute>
                    <name>Material</name>
                    <options>
                        <option>
                            <name>Cotton</name>
                            <price>10.00</price>
                        </option>
                        <option>
                            <name>Polyester</name>
                            <price>0.00</price>
                        </option>
                    </options>
                </attribute>                
            </attributes>
        </product>
        <product>
            <name>Earrings</name>
            <price>29.00</price>
        </product>
    </products>
</store>

Each product has a number of elements like name, price etc., but also a variable number of attributes (like color, size, material etc.), each of which has a variable number of options. Each option can affect the price of the product, so ordering an XS sized T-Shirt can be cheaper than ordering an XL sized T-Shirt.
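
To illustrate that structure, here is a minimal sketch of reading one product's attributes into a nested array with SimpleXML. The helper name `read_attributes` and the array layout (attribute name => option name => surcharge) are assumptions made for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: returns ['Color' => ['White' => 0.0, ...], 'Size' => [...]]
// for one <product> element, or an empty array if it has no <attributes>.
function read_attributes(SimpleXMLElement $product): array
{
    $result = [];
    if (!isset($product->attributes)) {
        return $result; // simple product such as the Earrings
    }
    foreach ($product->attributes->attribute as $attribute) {
        $options = [];
        foreach ($attribute->options->option as $option) {
            // Cast explicitly: SimpleXML yields objects, not strings or floats.
            $options[(string) $option->name] = (float) $option->price;
        }
        $result[(string) $attribute->name] = $options;
    }
    return $result;
}
```

Called on the Earrings product this would return an empty array, which makes the "no attributes" case easy to detect.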

I would like to end up with a CSV representing one attribute combination on each line.

In my example, that would result in 3 colors x 5 sizes = 15 lines for the T-Shirt, 2 colors x 3 sizes x 2 materials = 12 lines for the Sweatshirt, and 1 line for the Earrings, which have no attributes:

name,price,color,size,material
T-Shirt,14.00,White,XS,
T-Shirt,14.00,Black,XS,
T-Shirt,14.00,Blue,XS,
T-Shirt,14.00,White,S,
T-Shirt,14.00,Black,S,
T-Shirt,14.00,Blue,S,
T-Shirt,19.00,White,M,
T-Shirt,19.00,Black,M,
T-Shirt,19.00,Blue,M,
T-Shirt,19.00,White,L,
T-Shirt,19.00,Black,L,
T-Shirt,19.00,Blue,L,
T-Shirt,24.00,White,XL,
T-Shirt,24.00,Black,XL,
T-Shirt,24.00,Blue,XL,
Sweatshirt,49.00,White,XS,Cotton
Sweatshirt,49.00,Black,XS,Cotton
Sweatshirt,59.00,White,M,Cotton
Sweatshirt,59.00,Black,M,Cotton
Sweatshirt,69.00,White,XL,Cotton
Sweatshirt,69.00,Black,XL,Cotton
Sweatshirt,39.00,White,XS,Polyester
Sweatshirt,39.00,Black,XS,Polyester
Sweatshirt,49.00,White,M,Polyester
Sweatshirt,49.00,Black,M,Polyester
Sweatshirt,59.00,White,XL,Polyester
Sweatshirt,59.00,Black,XL,Polyester
Earrings,29.00,,,
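
The line counts and prices in this listing follow two simple rules: the number of lines per product is the product of its option counts, and each price is the base price plus the surcharge of every chosen option. A minimal sketch of that arithmetic, using the hypothetical helper names `variant_count` and `variant_price`:

```php
<?php
// Number of CSV lines for one product: the product of its option counts.
// array_product() of an empty array is 1, which matches the single Earrings line.
function variant_count(array $optionCounts): int
{
    return (int) array_product($optionCounts);
}

// Price of one combination: base price plus the surcharge of each chosen option.
function variant_price(float $base, array $surcharges): string
{
    return number_format($base + array_sum($surcharges), 2, '.', '');
}

echo variant_count([3, 5]), "\n";                       // T-Shirt: 3 colors x 5 sizes, prints 15
echo variant_price(49.00, [0.00, 10.00, 10.00]), "\n";  // Sweatshirt, White, XL, Cotton, prints 69.00
```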

I've already managed to generate the CSV output for simple products like the Earrings and for products with just one attribute, but I'm struggling to come up with a way to generate all possible attribute combinations for products with more than one attribute.

My miserable attempts at this so far have produced the following code:

<?php
mb_internal_encoding("UTF-8");
header('Content-Type: text/html; charset=utf-8');

$source = "example.xml";
$handle = fopen($source, "r");
$fp = fopen('export.csv', 'w');
$xml = simplexml_load_file($source);

// Generate list of attributes (for csv header etc.)
$header_attributes = array();
foreach ($xml->products->product as $product) {
    if(isset($product->attributes)) {
        foreach($product->attributes->attribute as $attribute) {
            array_push($header_attributes, (string) $attribute->name); // cast the SimpleXMLElement to a plain string
        }
    }
}
$header_attributes = array_values(array_unique($header_attributes)); // reindex so the keys match the column positions

$csvheader = array(
    'name','price' // these exist for all products, could also include weight, image, description, special price etc...
);

$static_csvheadercount = count($csvheader);

foreach($header_attributes as $attribute) {
    array_push($csvheader, $attribute); // add variable number of attribute fields to csv header
}

fputcsv($fp, $csvheader);

foreach ($xml->products->product as $product) {  // loop through each product
    if(isset($product->attributes)) $simple = 0;
    else $simple = 1;
    if($simple == 1) { // if product is a simple product with no attributes
        $output=array();
        array_push($output,(string)$product->name);
        array_push($output,(string)$product->price);
        for($i = $static_csvheadercount; $i < count($csvheader); $i++) { // pad every attribute column ($attribute_position is not defined here)
            array_push($output, '');
        }
        fputcsv($fp, $output);
    }
    else { // is a configurable product with attributes
        $json = json_encode($product->attributes);
        $attributes = json_decode($json, TRUE);
        $attributes_number = count($product->attributes->attribute);
        if($attributes_number > 1) { // if product has more than 1 attributes so we have to generate each attribute combination
            //
            //  I'm trying to figure out what should happen here
            //
        }       
        else { // if product has only one attribute
            $attributename =  (string)$product->attributes->attribute->name;
            $attribute_position = array_search($attributename, $header_attributes);
            $options_number = count($product->attributes->attribute->options->option); // options live under attribute, not directly under attributes
            $pos = 1;
            foreach($attributes['attribute']['options']['option'] as $option) { 
                $output=array();
                array_push($output,(string)$product->name);
                array_push($output,(string)$product->price);
                for($i = $static_csvheadercount - 1; $i < ($static_csvheadercount + $attribute_position); $i++) {
                    array_push($output, '');
                }

                $output[$static_csvheadercount + $attribute_position] = $option['name'];
                for($i = $static_csvheadercount + $attribute_position; $i < count($csvheader) - 1 ; $i++) {
                    array_push($output, '');
                }
                fputcsv($fp, $output);
                $pos++;
            }
            $output=array();
            array_push($output,(string)$product->name);
            array_push($output,(string)$product->price);
            for($i = $static_csvheadercount; $i < count($csvheader); $i++) {
                array_push($output, '');
            }
            fputcsv($fp, $output);
        }       
    }
}

?>

I've been stuck on this problem for hours, unable to figure out a solution.

Can someone give me a few tips or pointers on how to achieve this output for products with multiple attributes?

2 Answers

Here are useful links: link 1 or link 2

If those didn't help: each attribute has x options. For example, with 2 color options and 3 size options you end up with 2 x 3 = 6 products in total. In a first loop you can count how many products you will have in the end and how many attributes each one has. That lets you define what a product looks like, for example:

$products = array();
$product = array("size" => null, "color" => null); // 2 attributes in total
$products[] = $product;

This means that each of your products will have exactly 2 attributes. You then need to walk over all attribute options and collect every possible variation. For example, number the options like this:

Size: XS, S (numbered 1, 2)
Color: Red, Blue, Green (numbered 3, 4, 5)

Pairing them column by column then gives something like this:

3 4 5 3 4 5

1 2 1 2 1 2
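
The grid above can be read as cycling through each option list by index. A minimal sketch of that reading, with a hypothetical helper `cyclic_pairs`; note that plain cycling only reaches every combination when the list lengths share no common factor (2 and 3 here), so it should be treated as an illustration rather than a general solution.

```php
<?php
// Cycle through each option list with the modulo operator, as in the grid above.
// Caution: for lengths with a common factor (e.g. 2 sizes and 4 colors)
// some pairs repeat and others are never produced.
function cyclic_pairs(array $sizes, array $colors): array
{
    $total = count($sizes) * count($colors);
    $pairs = [];
    for ($i = 0; $i < $total; $i++) {
        $pairs[] = [$sizes[$i % count($sizes)], $colors[$i % count($colors)]];
    }
    return $pairs;
}

// Six distinct pairs, starting with XS/Red, S/Blue, XS/Green.
$pairs = cyclic_pairs(['XS', 'S'], ['Red', 'Blue', 'Green']);
```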

My solution may not be the best, and I haven't tested whether it works, but it might give you some useful ideas.

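// $attributes is assumed to be the <attributes> element of one product
$variant_count = 1;
$opts = array();
foreach ($attributes->attribute as $attr) {
    $variant_count *= count($attr->options->option);
    $opts[] = $attr->options->option;
}
$products = array();
for ($i = 0; $i < $variant_count; $i++) {
    $product = array();
    $rest = $i; // treat $i as a mixed-radix number, one digit per attribute
    for ($x = 0; $x < count($opts); $x++) {
        $n = count($opts[$x]);
        $y = $rest % $n; // one possible way to compute the proper $y
        $rest = intdiv($rest, $n);
        $product[$x] = $opts[$x][$y];
    }
    $products[] = $product; // keep the finished combination
}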
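
For comparison, here is a minimal sketch that builds the combinations iteratively as a Cartesian product and applies the option surcharges to the base price. It assumes the attributes have already been read into a nested array of attribute name => option name => surcharge; the helper names `combinations` and `product_rows` are hypothetical.

```php
<?php
// Hypothetical helper: Cartesian product of named option lists.
// Input:  ['Color' => ['White' => 0.0, 'Black' => 0.0], 'Size' => ['XS' => -5.0]]
// Output: one entry per combination: ['choice' => [attribute => option], 'delta' => float]
function combinations(array $attributes): array
{
    $result = [['choice' => [], 'delta' => 0.0]]; // start with one empty combination
    foreach ($attributes as $attributeName => $options) {
        $next = [];
        foreach ($options as $optionName => $surcharge) {
            foreach ($result as $partial) {
                // Extend every partial combination with the current option.
                $partial['choice'][$attributeName] = (string) $optionName;
                $partial['delta'] += $surcharge;
                $next[] = $partial;
            }
        }
        $result = $next;
    }
    return $result;
}

// Builds the CSV rows for one product. $columns lists the attribute columns in
// header order; attributes the product does not have are left empty.
function product_rows(string $name, float $price, array $attributes, array $columns): array
{
    $rows = [];
    foreach (combinations($attributes) as $combo) {
        $row = [$name, number_format($price + $combo['delta'], 2, '.', '')];
        foreach ($columns as $column) {
            $row[] = $combo['choice'][$column] ?? '';
        }
        $rows[] = $row;
    }
    return $rows;
}

$rows = product_rows(
    'T-Shirt',
    19.00,
    ['Color' => ['White' => 0.0, 'Black' => 0.0], 'Size' => ['XS' => -5.0, 'XL' => 5.0]],
    ['Color', 'Size', 'Material']
);
$out = fopen('php://output', 'w');
foreach ($rows as $row) {
    fputcsv($out, $row, ',', '"', '\\'); // all arguments passed explicitly
}
```

A product without attributes yields exactly one row with empty attribute columns, so the simple and configurable cases do not need separate branches.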
Jevgeni Smirnov

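This is pretty straightforward in XSL. Here's a stylesheet that generates the attribute combinations you're looking for (it prints the base price and does not apply the option surcharges).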

Just a couple XSL tricks to consider.

  • Output line breaks with <xsl:text>&#xA;</xsl:text>

  • Set the output method to text, with no indentation

Good Luck!

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" > 

<xsl:output method="text" indent="no"/>

<xsl:template match="/">
    <xsl:for-each select='store/products/product'>
        <!-- Store off the current product name and price -->
        <xsl:variable name='name' select='name'/>
        <xsl:variable name='price' select='price'/>
        <!-- Store off the material, colors and sizes-->
        <xsl:variable name='materials' select='attributes/attribute[name="Material"]/options/option/name'/>
        <xsl:variable name='colors' select='attributes/attribute[name="Color"]/options/option/name'/>
        <xsl:variable name='sizes' select='attributes/attribute[name="Size"]/options/option/name'/>

        <!-- If no colors (earrings), jump ahead -->
        <xsl:if test='$colors'>
            <!-- Iterate the colors -->
            <xsl:for-each select='$colors'>
                <!-- Store off the current color -->
                <xsl:variable name='color' select='.'/>
                <!-- Iterate the sizes -->
                <xsl:for-each select='$sizes'>
                    <xsl:variable name='size' select='.'/>
                    <!-- If materials, iterate through them too -->
                    <xsl:if test='$materials'>
                        <xsl:for-each select='$materials'>
                            <xsl:variable name='material' select='.'/>
                            <xsl:value-of select='$name'/>,<xsl:value-of select='$price'/>,<xsl:value-of select='$color'/>,<xsl:value-of select='$size'/>,<xsl:value-of select='$material'/>
                            <xsl:text>&#xA;</xsl:text>
                        </xsl:for-each>
                    </xsl:if>
                    <!-- If not materials, output what we've got -->
                    <xsl:if test='not($materials)'>
                        <xsl:value-of select='$name'/>,<xsl:value-of select='$price'/>,<xsl:value-of select='$color'/>,<xsl:value-of select='$size'/>
                        <xsl:text>,&#xA;</xsl:text>
                    </xsl:if>
                </xsl:for-each>
            </xsl:for-each>
        </xsl:if>
        <xsl:if test='not($colors)'>
            <xsl:value-of select='$name'/>,<xsl:value-of select='$price'/>
            <xsl:text>,,,&#xA;</xsl:text>
        </xsl:if>
    </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

If you don't know how to use XSL, here's a quick little script to get you started.

<?php

    try {
        $xml = new DOMDocument();
        $strFileName = "att.xml";
        $xml->load($strFileName);

        $xsl = new DOMDocument();
        $strFileName = "att.xsl";
        $xsl->load($strFileName);

        $proc = new XSLTProcessor();
        $proc->importStylesheet($xsl);

        echo $proc->transformToXML($xml);
    } catch( Exception $e ) {
        echo $e->getMessage();
    }

?>
William Walseth