Structure of $data->products (there are about 10,000 products). Each product has parameters.

Array
(
    [0] => Array
        (
            [id] => 440
            [name] => Product1
            [parameters] => Array
                (
                    [0] => Array
                        (
                            [id] => 1
                            [name] => Parameter1
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 1
                                            [name] => ValueParameter1
                                        )

                                )

                        )
                    [1] => Array
                        (
                            [id] => 2
                            [name] => Parameter2
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 2
                                            [name] => ValueParameter2
                                        )

                                )

                        )
                    [2] => Array
                        (
                            [id] => 3
                            [name] => Parameter3
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 3
                                            [name] => ValueParameter3
                                        )

                                )

                        )
                    [3] => .........
                        ...
                )
        )

    [1] => Array
        (
            [id] => 14
            [name] => Product2
            [parameters] => Array
                (
                    [0] => Array
                        (
                            [id] => 2
                            [name] => Parameter2
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 2
                                            [name] => ValueParameter2
                                        )

                                )

                        )

                    [1] => Array
                        (
                            [id] => 3
                            [name] => Parameter3
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 3
                                            [name] => ValueParameter3
                                        )

                                )

                        )
                    [2] => Array
                        (
                            [id] => 35
                            [name] => Parameter35
                            [value] => Array
                                (
                                    [0] => Array
                                        (
                                            [id] => 64
                                            [name] => ValueParameter35
                                        )

                                )

                        )
                    [3] => .........
                        ...

                )

        )
    [2] => ....
    .....

What do I want to get?

  • $data->products is the array to be filtered.
  • $filterContains is an array of parameter IDs: keep products that contain parameters with these IDs.
  • $filterExclude is an array of parameter IDs: drop products that contain parameters with these IDs.

I want to get an array of products, taking into account the parameter IDs specified in the filter arrays ($filterContains and $filterExclude).

That is, products that contain the parameters from $filterContains and none of the parameters from $filterExclude.

Code:

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    $keyCount = count($filterContains);
    foreach ($data->products as $product) {
        if (isset($product['parameters'])) {
            $match = 0;
            $product['parameters'] = array_values($product['parameters']);
            foreach ($product['parameters'] as $parameter) {
                foreach ($filterContains as $value) {
                    if ($parameter['id'] == $value && !in_array($parameter['id'], $filterExclude)) {
                        $match++;
                    }
                }
                if ($match == $keyCount) {
                    $result[] = $product;
                }
            }
        }
    }
    $unique_array = [];
    foreach ($result as $element) {
        $hash = $element['id'];
        $unique_array[$hash] = $element;
    }
    $result = array_values($unique_array);
    return $result;
}

It seems to me that it only works for selecting products that contain the given parameters; it doesn't exclude products with the parameters given in $filterExclude.

I hope I have described the problem quite clearly. Regards.

nxx

1 Answer

Problems

I believe that the implementation is overcomplicated, making the code harder to follow and slower than needed. For instance:

  • You don't need to filter unique products.
  • You don't need to call array_values.
  • Once you find a parameter ID in $filterExclude, you don't need to keep checking whether there's a match.

But why is it not working?

The code checks whether a parameter ID is in $filterExclude only when that ID is also in $filterContains:

[Diagram: the condition only checks IDs that are in $filterContains]

So the function only seems to work when the IDs in $filterExclude also exist in $filterContains, for instance:

getFilteredData($data, [1, 2], [1]);
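
To see the failure concretely, here is a minimal reproduction. It assumes a cut-down copy of the two sample products from the question (parameter values omitted), and it renames the question's function to getFilteredDataOriginal, a hypothetical name used only for this sketch. Product2 has parameter 35, yet excluding 35 does not drop it, because 35 is never in $filterContains and so the exclude check never runs for it.

```php
<?php
// The question's function, copied as-is; only the name is changed (hypothetical).
function getFilteredDataOriginal($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    $keyCount = count($filterContains);
    foreach ($data->products as $product) {
        if (isset($product['parameters'])) {
            $match = 0;
            $product['parameters'] = array_values($product['parameters']);
            foreach ($product['parameters'] as $parameter) {
                foreach ($filterContains as $value) {
                    // The exclude check only runs for IDs that are also in $filterContains.
                    if ($parameter['id'] == $value && !in_array($parameter['id'], $filterExclude)) {
                        $match++;
                    }
                }
                if ($match == $keyCount) {
                    $result[] = $product;
                }
            }
        }
    }
    $unique_array = [];
    foreach ($result as $element) {
        $hash = $element['id'];
        $unique_array[$hash] = $element;
    }
    $result = array_values($unique_array);
    return $result;
}

// Cut-down sample data: only the fields the filter reads.
$data = new stdClass();
$data->products = [
    ['id' => 440, 'name' => 'Product1', 'parameters' => [['id' => 1], ['id' => 2], ['id' => 3]]],
    ['id' => 14, 'name' => 'Product2', 'parameters' => [['id' => 2], ['id' => 3], ['id' => 35]]],
];

// Ask for parameter 2, excluding parameter 35.
$ids = array_column(getFilteredDataOriginal($data, [2], [35]), 'id');
print_r($ids); // contains 440 and 14: Product2 is returned although it has parameter 35
```

The expected result would have been Product1 only.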

Simpler solution

function array_some(array $array, callable $function): bool
{
    foreach ($array as $item) {
        if ($function($item)) {
            return true;
        }
    }
    return false;
}

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    foreach ($data->products as $product) {
        if (
            isset($product['parameters'])
            && !array_some($product['parameters'], function ($parameter) use ($filterExclude) { return in_array($parameter['id'], $filterExclude); })
            && array_some($product['parameters'], function ($parameter) use ($filterContains) { return in_array($parameter['id'], $filterContains); })
        ) {
            $result []= $product;
        }
    }
    return $result;
}

Or this one, if each product must have all the parameter IDs from $filterContains:

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        $ids = array_map(function ($parameter) { return $parameter['id']; }, $product['parameters']);
        if (
            !array_some($ids, function ($id) use ($filterExclude) { return in_array($id, $filterExclude); })
            && !array_some($filterContains, function ($id) use ($ids) { return !in_array($id, $ids); })
        ) {
            $result []= $product;
        }
    }
    return $result;
}

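Before PHP 8.4, PHP has no built-in functions like array_some and array_all (PHP 8.4 adds array_any and array_all). Also, anonymous functions are verbose (they read better in JavaScript), especially when use is needed. PHP has had arrow functions since PHP 7.4, though, and they capture outer variables automatically.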
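
As a rough sketch of how arrow functions tidy this up, here is the "at least one ID" variant rewritten with fn, assuming PHP 7.4 or later. The names anyMatch and filterProductsAny are hypothetical and stand in for array_some and getFilteredData; the sample data is a cut-down copy of the question's products plus an invented Product3.

```php
<?php
// Hypothetical helper: true if $predicate holds for at least one item.
function anyMatch(array $items, callable $predicate): bool
{
    foreach ($items as $item) {
        if ($predicate($item)) {
            return true;
        }
    }
    return false;
}

// Keeps products with at least one ID from $filterContains and none from $filterExclude.
function filterProductsAny($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        $parameters = $product['parameters'];
        // Arrow functions capture $filterExclude and $filterContains without "use".
        if (
            !anyMatch($parameters, fn($p) => in_array($p['id'], $filterExclude))
            && anyMatch($parameters, fn($p) => in_array($p['id'], $filterContains))
        ) {
            $result[] = $product;
        }
    }
    return $result;
}

$data = new stdClass();
$data->products = [
    ['id' => 440, 'name' => 'Product1', 'parameters' => [['id' => 1], ['id' => 2], ['id' => 3]]],
    ['id' => 14, 'name' => 'Product2', 'parameters' => [['id' => 2], ['id' => 3], ['id' => 35]]],
    ['id' => 7, 'name' => 'Product3', 'parameters' => [['id' => 1]]], // invented for illustration
];

$anyIds = array_column(filterProductsAny($data, [2], [35]), 'id');
print_r($anyIds); // only 440: Product2 has parameter 35, Product3 lacks parameter 2
```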

Maybe with better performance

Calling array_some twice implies traversing the array twice. We can check whether each parameter ID is in $filterContains and not in $filterExclude while traversing the array once.

You don't need to count the parameters to check if the conditions are matched. Use a flag (a boolean) to check if a parameter ID in $filterContains was found. If a parameter ID is in $filterExclude, set that flag to false and break the loop. If that flag is true, there's a match.

Also, in_array traverses the filters several times. That is not required when a set is used. The class Ds\Set might not be available, but array_flip gives you an associative array that you can use as a set with isset. However, building the associative array and hashing the keys on each access might be slower than using in_array if there are few products and the filters have few items.

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    $filterContainsSet = array_flip($filterContains);
    $filterExcludeSet = array_flip($filterExclude);
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        $found = false;
        foreach ($product['parameters'] as $parameter) {
            $id = $parameter['id'];
            if (isset($filterExcludeSet[$id])) {
                $found = false;
                break;
            } elseif (isset($filterContainsSet[$id])) {
                $found = true;
            }
        }
        if ($found) {
            $result []= $product;
        }
    }
    return $result;
}

Note that, despite the nested loop, there are only two loops in that function.
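
The array_flip trick can be looked at in isolation. The sketch below uses made-up filter values; it only shows that the flipped array has the IDs as keys, so isset can replace the in_array scan. Note that array_flip only accepts integer and string values, and that duplicate IDs collapse into a single key.

```php
<?php
$filterExclude = [35, 64];

// array_flip swaps keys and values, so the IDs become keys.
$filterExcludeSet = array_flip($filterExclude);
print_r($filterExcludeSet); // keys 35 and 64, values 0 and 1

// isset() is a hash lookup; in_array() scans the list item by item.
var_dump(isset($filterExcludeSet[35]));  // bool(true)
var_dump(isset($filterExcludeSet[2]));   // bool(false)
var_dump(in_array(35, $filterExclude));  // bool(true)
```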

If each product must have all the parameter IDs in $filterContains:

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    $filterContainsSet = array_flip($filterContains);
    $filterExcludeSet = array_flip($filterExclude);
    // Count distinct IDs, so duplicates in $filterContains do not inflate the target.
    $required = count($filterContainsSet);
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        $found = [];
        $excluded = false;
        foreach ($product['parameters'] as $parameter) {
            $id = $parameter['id'];
            if (isset($filterExcludeSet[$id])) {
                // An excluded ID rejects the product, even when $filterContains is empty.
                $excluded = true;
                break;
            } elseif (isset($filterContainsSet[$id])) {
                $found[$id] = true;
            }
        }
        if (!$excluded && count($found) === $required) {
            $result[] = $product;
        }
    }
    return $result;
}

Note: count is cheap here, because PHP stores the number of elements with the array instead of counting them on each call.

If you would rather have an implementation that does not use count:

function getFilteredData($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        $idSet = array_flip(array_map(function($parameter) { return $parameter['id']; }, $product['parameters']));
        if (
            !array_some($filterExclude, function($id) use ($idSet) { return isset($idSet[$id]); })
            && !array_some($filterContains, function($id) use ($idSet) { return !isset($idSet[$id]); })
        ) {
            $result []= $product;
        }
    }
    return $result;
}
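
A quick way to check the "all IDs" behaviour is to run it on a few products. The sketch below is an assumption-laden copy of the function above: someItem and filterProductsAll are hypothetical names, array_column is swapped in for the array_map call, and the sample data is a cut-down copy of the question's products plus an invented Product3.

```php
<?php
// Hypothetical stand-in for the array_some helper.
function someItem(array $items, callable $predicate): bool
{
    foreach ($items as $item) {
        if ($predicate($item)) {
            return true;
        }
    }
    return false;
}

// Keeps products that have every ID in $filterContains and no ID in $filterExclude.
function filterProductsAll($data, array $filterContains = [], array $filterExclude = []): array
{
    $result = [];
    foreach ($data->products as $product) {
        if (!isset($product['parameters'])) {
            continue;
        }
        // Parameter IDs of this product, flipped so they can be tested with isset().
        $idSet = array_flip(array_column($product['parameters'], 'id'));
        if (
            !someItem($filterExclude, function ($id) use ($idSet) { return isset($idSet[$id]); })
            && !someItem($filterContains, function ($id) use ($idSet) { return !isset($idSet[$id]); })
        ) {
            $result[] = $product;
        }
    }
    return $result;
}

$data = new stdClass();
$data->products = [
    ['id' => 440, 'name' => 'Product1', 'parameters' => [['id' => 1], ['id' => 2], ['id' => 3]]],
    ['id' => 14, 'name' => 'Product2', 'parameters' => [['id' => 2], ['id' => 3], ['id' => 35]]],
    ['id' => 7, 'name' => 'Product3', 'parameters' => [['id' => 2]]], // invented for illustration
];

$withExclude = array_column(filterProductsAll($data, [2, 3], [35]), 'id');
$withoutExclude = array_column(filterProductsAll($data, [2, 3], []), 'id');
print_r($withExclude);    // only 440: Product2 has parameter 35, Product3 lacks parameter 3
print_r($withoutExclude); // 440 and 14
```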
Pedro Amaral Couto