I have a performance problem in PHP code. An external API that I cannot change returns a JSON array like this one:

[{"name": "Name 1", "code": "Code 1", "attribute1": "Black", "attribute2": "32", "price": "10"},
 {"name": "Name 2", "code": "Code 2", "attribute1": "Yellow", "attribute2": "", "price": "15"},
{"name": "Name 1", "code": "Code 3", "attribute1": "Yellow", "attribute2": "32", "price": "20"},....
]

I want to group this by name and reformat it to a JSON array like this:

[{
   "name": "Name 1",
   "available_attributes": [ "size", "color" ],
   "variations": [ 
       { "attributes": { "size": "32", "color": "Black" }, "price": "10", "code": "Code 1"},
       { "attributes": { "size": "32", "color": "Yellow" }, "price": "20", "code": "Code 3"}
   ]
}, {
   "name": "Name 2",
   "available_attributes": [  "color" ],
   "variations": [ { "attributes": { "color": "Yellow" }, "price": "15", "code": "Code 2"}]
}]

My solution is ugly and slow: I use simple brute force, iterating over the response and then, for every item, iterating again over the array I have already built in order to update it.

So, I am looking for a solution focused on performance and speed.

Edit: This is my code. The only difference is that when both attributes are empty, the product has only the price and the SKU instead of the variations and available_attributes arrays.

function cmp( $a, $b ) {
    if ( $a['name'] == $b['name'] ) {
        return 0;
    }
    return ( $a['name'] < $b['name'] ) ? - 1 : 1;
}

function format_products_array($products) {
    usort( $products, "cmp" );
    $formatted_products = array();
    $new = true;
    $obj = array();

    for ( $i = 0; $i < count( $products ); $i++ ) {

        if ( $new ) {
            $obj = array();
            $attr = array();
            $obj['available_attributes'] = array();
            $obj['variations'] = array();

            $obj['name'] = $products[$i]['name'];
            if ( $products[$i]['attribute1'] != '' ) {
                array_push( $obj['available_attributes'], 'color' );
                $attr['color'] = $products[$i]['attribute1'];
            }
            if ( $products[$i]['attribute2'] != '' ) {
                array_push( $obj['available_attributes'], 'size' );
                $attr['size'] = $products[$i]['attribute2'];
            }   
        }

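        // Guard the look-ahead so the last element does not read past the end of the array.
        if ( isset( $products[ $i + 1 ] ) && $products[ $i ]['name'] == $products[ $i + 1 ]['name'] ) {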
            $new = false;
            $attr['size'] = $products[$i]['attribute2'];            
            $attr['color'] = $products[$i]['attribute1'];
            if ( empty($obj['available_attributes']) ) {
                $obj['price'] = $products[$i]['price'];
            } else {
                $var = array();
                $var['price'] = $products[$i]['price'];
                $var['code'] = $products[$i]['code'];
                $var['attributes'] = $attr;
                array_push($obj['variations'], $var);
            }
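        } else {
            // Last row of the current group: emit the group and start a new one next.
            $new = true;
            if ( empty($obj['available_attributes']) ) {
                $obj['price'] = $products[$i]['price'];
            }
            $attr['size'] = $products[$i]['attribute2'];
            $attr['color'] = $products[$i]['attribute1'];
            // Build a fresh variation so price and code are not carried over from the previous row.
            $var = array();
            $var['price'] = $products[$i]['price'];
            $var['code'] = $products[$i]['code'];
            $var['attributes'] = $attr;
            array_push($obj['variations'], $var);
            array_push($formatted_products, $obj);
        }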
    }
    return $formatted_products;
}
Tasos

1 Answer

A faster solution is to key each object by a unique identifier while generating the array, e.g. to generate:

{
  "Name1": {
   "name": "Name 1",
   "code": "Code 1",
   "available_attributes": [ "size", "color" ],
   "variations": [ 
       { "attributes": { "size": "32", "color": "Black" }, "price": "10"},
       { "attributes": { "size": "32", "color": "Yellow" }, "price": "20"}
   ]
  },
  "Name2": {
   "name": "Name 2",
   "code": "Code 2",
   "available_attributes": [ "color" ],
   "variations": [ { "attributes": { "color": "Yellow" }, "price": "15"}]
  }
}

OR

{
  "Code 1": {
   "name": "Name 1",
   "code": "Code 1",
   "available_attributes": [ "size", "color" ],
   "variations": [ 
       { "attributes": { "size": "32", "color": "Black" }, "price": "10"},
       { "attributes": { "size": "32", "color": "Yellow" }, "price": "20"}
   ]
  },
  "Code 2": {
   "name": "Name 2",
   "code": "Code 2",
   "available_attributes": [ "color" ],
   "variations": [ { "attributes": { "color": "Yellow" }, "price": "15"}]
  }
}

Afterwards, optionally remove the associative keys (for example with array_values()).
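
The keyed approach above could be sketched roughly as follows. This is only an illustration under assumptions: the function name group_products_by_name and the ATTRIBUTE_NAMES constant are made up for this example, the attribute1/attribute2 to color/size mapping is taken from the code in the question, and the special case where both attributes are empty is not handled. It keeps each variation's code as in the question's desired output.

```php
<?php
// Assumed mapping from the API's generic fields to readable names
// (taken from the question's code; adjust if the API differs).
const ATTRIBUTE_NAMES = ['attribute1' => 'color', 'attribute2' => 'size'];

/**
 * Group flat product rows by name in a single pass.
 * The intermediate array is keyed by name, so finding an existing
 * group is a hash lookup instead of a scan or a sort.
 */
function group_products_by_name(array $products): array
{
    $grouped = [];

    foreach ($products as $row) {
        $name = $row['name'];

        // First time this name is seen: create its group.
        if (!isset($grouped[$name])) {
            $grouped[$name] = [
                'name' => $name,
                'available_attributes' => [],
                'variations' => [],
            ];
        }

        // Collect only the non-empty attributes of this row.
        $attributes = [];
        foreach (ATTRIBUTE_NAMES as $field => $label) {
            if (($row[$field] ?? '') !== '') {
                $attributes[$label] = $row[$field];
                // Keyed by label so repeated attributes collapse into one entry.
                $grouped[$name]['available_attributes'][$label] = true;
            }
        }

        $grouped[$name]['variations'][] = [
            'attributes' => $attributes,
            'price' => $row['price'],
            'code' => $row['code'],
        ];
    }

    // Remove the association: turn the keyed sets back into plain lists.
    $result = [];
    foreach ($grouped as $group) {
        $group['available_attributes'] = array_keys($group['available_attributes']);
        $result[] = $group;
    }

    return $result;
}
```

Decode the API response with json_decode($response, true), pass the resulting array to the function, and json_encode() the return value. Each row is visited once and no sorting is needed, so the work grows linearly with the number of rows.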

You may then store the result in Memcached/Redis; when you need the same data again, just look in Redis/Memcached first.

So it may be time-consuming the first time, but afterwards the result is ready, and only one "unlucky" guy/girl has to wait for it to be computed.
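
The "look in the cache first" idea can be illustrated with a minimal cache-aside sketch. To stay self-contained, a plain PHP array stands in for Memcached/Redis here, which is an assumption of this example; in a real setup the lookup and store would be replaced by the get/set calls of whichever cache client is in use. The function name get_or_build is hypothetical.

```php
<?php
/**
 * Cache-aside lookup: return the cached value for $key,
 * or build it once with $build and store it for later requests.
 * $cache is a plain array standing in for Memcached/Redis (assumption).
 */
function get_or_build(array &$cache, string $key, callable $build)
{
    if (array_key_exists($key, $cache)) {
        // Cache hit: the expensive regrouping is skipped.
        return $cache[$key];
    }

    // Cache miss: only this first, "unlucky" request pays the cost.
    $value = $build();
    $cache[$key] = $value;

    return $value;
}
```

The $build callable would typically fetch the API response and run the grouping; how long an entry stays valid depends on how often the external API's data changes.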

If the loops are extremely time-consuming, use a worker to generate this data and store it in a document-based store such as MongoDB/CouchDB; the site then only has to read the ready-made document.

Dimitrios Desyllas