0

I have following array where I am trying to merge the elements which has shelf and weight value as duplicate and sum the value of piece key.

Array
(
    [0] => Array
        (
            [shelf] => Left
            [weight] => 10.000
            [piece] => 1
        )

    [1] => Array
        (
            [shelf] => Right
            [weight] => 04.000
            [piece] => 12
        )

    [2] => Array
        (
            [shelf] => Right
            [weight] => 04.000
            [piece] => 4
        )

    [3] => Array
        (
            [shelf] => Right
            [weight] => 07.000
            [piece] => 8
        )
)

Currently I am getting following desired output with help of following SQL statement by creating the temporary table with following fields shelf, weight and piece and inserting all the four values and parsing the result in PHP with following select query

SELECT shelf, weight, SUM(piece) FROM temp GROUP BY CONCAT(shelf, weight)

Array
(
    [0] => Array
        (
            [shelf] => Left
            [weight] => 10.000
            [piece] => 1
        )

    [1] => Array
        (
            [shelf] => Right
            [weight] => 04.000
            [piece] => 16
        )

    [3] => Array
        (
            [shelf] => Right
            [weight] => 07.000
            [piece] => 8
        )
 )

However I am believe that this can be simply achieved by PHP, but can't get my head around. Can somebody please point what I might be missing ?

Note to Moderators and SO Vigilante

Please don't take it personal but before marking or even saying this is duplicate, read my question thoroughly and understand what I am trying to achieve and only then if you agree kindly explain in detail why do you think its duplicate, rather than simply arguing on base of question with similar title

I have gone through these questions, but they don't work in my scenario, as they try to merge array and sum value based on one specific element which is usually either ID, but in my case its uniqueness is judged on combination of elements (not one)

1, 2, 3

mickmackusa
  • 43,625
  • 12
  • 83
  • 136
Abhishek Madhani
  • 1,155
  • 4
  • 16
  • 26
  • Why not simply `GROUP BY shelf, weight`? There's no need to concatenate anything – Mark Baker Jun 10 '14 at 20:21
  • 1
    But why do you want to do it in PHP; it will certainly be more efficient doing it in SQL – Mark Baker Jun 10 '14 at 20:22
  • As I said, ___why___ do you want to do it slowly and inefficiently in PHP, when it can be done quickly and efficiently in SQL? – Mark Baker Jun 10 '14 at 20:24
  • @MarkBaker, thanks will do that. However I am looking for PHP solution, because this code runs in loop more than 24k times, and parsing via SQL increases the network data transfer time, and also MySQL insert and select which makes it 24k*2 = 48k times query to database, which is increasing my speed execution time, hence. – Abhishek Madhani Jun 10 '14 at 20:27
  • Doing this in SQL ___reduces___ the data transfer time, because you're returning fewer data rows... and then why are you running this 24k times? That sounds like a bad idea... I suspect you could achieve your final objective a lot easier without this mass of inserts and selects – Mark Baker Jun 10 '14 at 20:29
  • @MarkBaker, If I manage it to do in PHP it simply happens in memory, where as If i use MySQL it has to write data to disk. Reading and writing data from Memory is any day faster than disk. http://blog.codinghorror.com/content/images/2014/May/storage-latency-how-far-away-is-the-data.png – Abhishek Madhani Jun 10 '14 at 20:32
  • If you're getting this data from a database, then you should be able to get exactly what you need __without__ the need for mass inserts of 24k rows into temporary tables then reselecting it in the first place – Mark Baker Jun 10 '14 at 20:38
  • @MarkBaker, it did sounded a bad idea to me as well, but there was no another way, as for each record from 24k, this array is then saved in serialized format in DB, to be further used by other programs. I am doing sort of pre-calculations to save the overhead of joins on run time. – Abhishek Madhani Jun 10 '14 at 20:40
  • But if you're selecting this data from a database in the first place, then do all your aggregations on the database (which has built-in and highly optimised methods for doing precisely that).... it's faster than selecting larger volumes of data, then aggregating it in PHP – Mark Baker Jun 10 '14 at 20:40
  • Your approach does that select from the database in every case (so you're always reading from disk), but moves the aggregations to PHP, which isn't optimised for such tasks, and needs to work with large volumes of data (which requires time-expensive memory allocations) – Mark Baker Jun 10 '14 at 20:42
  • @MarkBaker, I could do all this directly in database which I agree is precisely built for that, but If I go this way or that way, I have to pass the data trough PHP, as I need to serialize it before saving in it DB. – Abhishek Madhani Jun 10 '14 at 20:45
  • @MarkBaker, and comparing expensiveness I think memory allocation is affordable to me, than DB read and write just for this task if anyways it would be possible because it can drastically reduce the script execution time, hence freeing up the disk IO much quicker, where then they can quickly serve to more important jobs. – Abhishek Madhani Jun 10 '14 at 20:48
  • You're not freeing up disk IO any quicker, you're simply adding extra overhead to PHP... and storing serialized data in a database is always a really bad idea that simply leads to more and more problems in the future – Mark Baker Jun 10 '14 at 20:57

1 Answers1

2

If you absolutely have to do this in PHP, then something like:

$data = array(
    array(
        'shelf' => 'Left',
        'weight' => '10.000',
        'piece' => 1,
    ),
    array(
        'shelf' => 'Right',
        'weight' => '04.000',
        'piece' => 12,
    ),
    array(
        'shelf' => 'Right',
        'weight' => '04.000',
        'piece' => 4,
    ),
    array(
        'shelf' => 'Right',
        'weight' => '07.000',
        'piece' => 8,
    ),
);


$newData = array();
$result = array_reduce(
    $data,
    function($newData, $value) {
        $key = $value['shelf'].$value['weight'];
        if (!isset($newData[$key])) {
            $newData[$key] = $value;
        } else {
            $newData[$key]['piece'] += $value['piece'];
        }
        return $newData;
    },
    $newData
);
var_dump($result);

will work, but I really do believe that you're creating major performance problems for yourself with your whole approach to this problem

Mark Baker
  • 209,507
  • 32
  • 346
  • 385
  • I really appreciate your inputs in the questions comment section and thanks for the above code. With my MySQL code the scripts completes in 400-450 seconds, whereas with your PHP suggestion it completes in 200-230 seconds. Which approach would you still suggest ? – Abhishek Madhani Jun 10 '14 at 21:48
  • I still believe that your whole approach with the temporary table is flawed, and that the overhead of creating, populating and then selecting from that is your major overhead, and one that could probably be eliminated completely by the right initial SQL query – Mark Baker Jun 10 '14 at 22:01
  • I completely understand your point, and given the chance I would use MySQL any day over PHP to achieve this. However I am required to serialized the data (I know its not good idea), hence I insisted with PHP. Though will give a re-look over my whole approach and will try with pure MySQL, will keep you updated. – Abhishek Madhani Jun 10 '14 at 22:06