
I used NumPy for large-scale data analysis involving many matrix operations (e.g., dot, count_nonzero, linalg.svd). After running %prun in a Jupyter notebook, I found that numpy.core._multiarray_umath.implement_array_function takes a lot of time: 38 s of the roughly 250 s total cumtime, with a large number of calls (ncalls 67139/66979). I know the other functions should be optimized too, but is it possible to reduce this cost as well, and what is this function used for?

Here is my %prun output:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 1848  203.845    0.110  242.582    0.131 stacking.py:130(_rda_cv)
 67139/66979   27.980    0.000   38.901    0.001 {built-in method numpy.core._multiarray_umath.implement_array_function}
    4    8.181    2.045  251.415   62.854 stacking.py:192(_model_selection)
14883    7.942    0.001    7.942    0.001 {method 'reduce' of 'numpy.ufunc' objects}
11096    2.107    0.000    2.353    0.000 linalg.py:1468(svd)
    4    0.154    0.038    0.188    0.047 stacking.py:20(_get_qvalues)
    1    0.149    0.149  251.887  251.887 stacking.py:255(fit)
   16    0.149    0.009    0.508    0.032 stacking.py:70(_construct_cov)
26341    0.140    0.000    0.140    0.000 {built-in method numpy.array}
    4    0.132    0.033    0.609    0.152 stacking.py:89(_construct_cov_cv)
11164    0.114    0.000    0.367    0.000 _methods.py:134(_mean)
 1919    0.102    0.000    0.102    0.000 {built-in method numpy.empty}
36989    0.073    0.000    0.073    0.000 {method 'astype' of 'numpy.ndarray' objects}
11132    0.052    0.000    0.383    0.000 fromnumeric.py:3153(mean)
   32    0.052    0.002    0.302    0.009 function_base.py:2245(cov)
38870    0.052    0.000   27.967    0.001 <__array_function__ internals>:2(dot)
11164    0.051    0.000    0.054    0.000 _methods.py:50(_count_reduce_items)
11096    0.043    0.000    0.070    0.000 linalg.py:144(_commonType)
   13    0.036    0.003    0.036    0.003 {method 'argsort' of 'numpy.ndarray' objects}
 3696    0.035    0.000    7.909    0.002 numeric.py:409(count_nonzero)
11096    0.033    0.000    0.064    0.000 linalg.py:116(_makearray)
66728    0.031    0.000    0.031    0.000 {built-in method builtins.issubclass}
11096    0.027    0.000    2.407    0.000 <__array_function__ internals>:2(svd)
11145    0.026    0.000    0.026    0.000 {method 'flatten' of 'numpy.ndarray' objects}
11096    0.024    0.000    0.024    0.000 linalg.py:111(get_linalg_error_extobj)
348583    0.023    0.000    0.023    0.000 {method 'append' of 'list' objects}
11132    0.021    0.000    0.421    0.000 <__array_function__ internals>:2(mean)
 7408    0.018    0.000    0.034    0.000 numerictypes.py:293(issubclass_)
 3696    0.017    0.000    7.940    0.002 <__array_function__ internals>:2(count_nonzero)
 3704    0.017    0.000    0.053    0.000 numerictypes.py:365(issubdtype)
 5544    0.017    0.000    0.017    0.000 stacking.py:146(<dictcomp>)
22192    0.016    0.000    0.025    0.000 linalg.py:134(_realType)
   40    0.016    0.000    0.016    0.000 {method 'sort' of 'numpy.ndarray' objects}
 3702    0.013    0.000    7.795    0.002 {method 'sum' of 'numpy.ndarray' objects}
15009    0.012    0.000    0.028    0.000 _asarray.py:88(asanyarray)
    5    0.012    0.002    0.053    0.011 _split.py:628(_make_test_folds)
22192    0.010    0.000    0.013    0.000 linalg.py:121(isComplexType)
22602    0.010    0.000    0.010    0.000 {built-in method builtins.isinstance}
13199    0.010    0.000    0.010    0.000 {built-in method builtins.getattr}
11264    0.010    0.000    0.025    0.000 _asarray.py:16(asarray)
11096    0.009    0.000    0.009    0.000 linalg.py:203(_assertRankAtLeast2)
22196    0.009    0.000    0.009    0.000 {method 'get' of 'dict' objects}
 1964    0.009    0.000    0.009    0.000 {method 'argmax' of 'numpy.ndarray' objects}
11132    0.008    0.000    0.008    0.000 {built-in method __new__ of type object at 0x00007FF847CE9BA0}
38870    0.008    0.000    0.008    0.000 multiarray.py:707(dot)
11625    0.008    0.000    0.008    0.000 {built-in method builtins.hasattr}
   45    0.007    0.000    0.038    0.001 arraysetops.py:297(_unique1d)
60/20    0.006    0.000    0.059    0.003 _split.py:74(split)
 1964    0.006    0.000    0.034    0.000 <__array_function__ internals>:2(argmax)
 1964    0.006    0.000    0.023    0.000 fromnumeric.py:1091(argmax)
 3702    0.005    0.000    7.782    0.002 _methods.py:36(_sum)
    4    0.005    0.001    0.221    0.055 stacking.py:317(_normalizer)
 1982    0.004    0.000    0.044    0.000 fromnumeric.py:55(_wrapfunc)
22192    0.004    0.000    0.004    0.000 {method '__array_prepare__' of 'numpy.ndarray' objects}
11096    0.004    0.000    0.004    0.000 linalg.py:1464(_svd_dispatcher)
   40    0.003    0.000    0.004    0.000 _split.py:107(_iter_test_masks)
11132    0.003    0.000    0.003    0.000 fromnumeric.py:3149(_mean_dispatcher)
 3696    0.003    0.000    0.003    0.000 numeric.py:405(_count_nonzero_dispatcher)
    3    0.003    0.001    0.005    0.002 stacking.py:243(_rda_prediction)
   20    0.002    0.000    0.055    0.003 _split.py:680(_iter_test_masks)
    1    0.002    0.002  251.889  251.889 <string>:1(<module>)
   48    0.002    0.000    0.002    0.000 {built-in method numpy.zeros}
   25    0.002    0.000    0.002    0.000 {built-in method numpy.arange}
    4    0.001    0.000    0.001    0.000 {method 'partition' of 'numpy.ndarray' objects}
    5    0.001    0.000    0.001    0.000 {method 'cumsum' of 'numpy.ndarray' objects}
   45    0.001    0.000    0.039    0.001 arraysetops.py:151(unique)
 1964    0.001    0.000    0.001    0.000 fromnumeric.py:1087(_argmax_dispatcher)
    5    0.001    0.000    0.011    0.002 multiclass.py:174(type_of_target)
  116    0.001    0.000    0.002    0.000 fromnumeric.py:42(_wrapit)
   32    0.001    0.000    0.001    0.000 stride_tricks.py:116(_broadcast_to)
   32    0.000    0.000    0.038    0.001 function_base.py:293(average)
    4    0.000    0.000    0.001    0.000 stacking.py:107(_calculate_weights)
  120    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(where)
  115    0.000    0.000    0.001    0.000 validation.py:127(_num_samples)
   40    0.000    0.000    0.001    0.000 _split.py:430(_iter_test_indices)
  135    0.000    0.000    0.000    0.000 {built-in method _abc._abc_instancecheck}
60/20    0.000    0.000    0.060    0.003 _split.py:299(split)
   30    0.000    0.000    0.001    0.000 validation.py:238(indexable)
    5    0.000    0.000    0.001    0.000 validation.py:362(check_array)
    1    0.000    0.000  251.889  251.889 {built-in method builtins.exec}
    5    0.000    0.000    0.000    0.000 {method 'nonzero' of 'numpy.ndarray' objects}
    4    0.000    0.000    0.002    0.001 function_base.py:3508(_median)
  130    0.000    0.000    0.000    0.000 {built-in method _abc._abc_subclasscheck}
    5    0.000    0.000    0.000    0.000 function_base.py:1147(diff)
    1    0.000    0.000    0.003    0.003 stacking.py:350(_check_y)
   32    0.000    0.000    0.321    0.010 <__array_function__ internals>:2(cov)
    4    0.000    0.000    0.000    0.000 utils.py:1142(_median_nancheck)
    5    0.000    0.000    0.001    0.000 _split.py:661(<listcomp>)
   32    0.000    0.000    0.038    0.001 <__array_function__ internals>:2(average)
   32    0.000    0.000    0.036    0.001 {method 'mean' of 'numpy.ndarray' objects}
   30    0.000    0.000    0.001    0.000 validation.py:220(check_consistent_length)
   32    0.000    0.000    0.000    0.000 {method 'copy' of 'numpy.ndarray' objects}
   32    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(broadcast_to)
   15    0.000    0.000    0.000    0.000 fromnumeric.py:73(_wrapreduction)
    5    0.000    0.000    0.001    0.000 validation.py:40(_assert_all_finite)
   15    0.000    0.000    0.000    0.000 _split.py:277(__init__)
   45    0.000    0.000    0.040    0.001 <__array_function__ internals>:2(unique)
   32    0.000    0.000    0.001    0.000 stride_tricks.py:143(broadcast_to)
    4    0.000    0.000    0.002    0.001 function_base.py:3359(_ureduce)
   32    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(result_type)
   32    0.000    0.000    0.000    0.000 <string>:1(__new__)
  135    0.000    0.000    0.000    0.000 abc.py:137(__instancecheck__)
    8    0.000    0.000    0.000    0.000 numeric.py:1273(normalize_axis_tuple)
   32    0.000    0.000    0.000    0.000 {built-in method builtins.any}
    4    0.000    0.000    0.000    0.000 numeric.py:1336(moveaxis)
  130    0.000    0.000    0.000    0.000 abc.py:141(__subclasscheck__)
   32    0.000    0.000    0.000    0.000 function_base.py:257(iterable)
  269    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    5    0.000    0.000    0.000    0.000 validation.py:153(_shape_repr)
  120    0.000    0.000    0.000    0.000 multiarray.py:312(where)
   18    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(copyto)
   32    0.000    0.000    0.000    0.000 {method 'conj' of 'numpy.ndarray' objects}
   95    0.000    0.000    0.000    0.000 base.py:1189(isspmatrix)
   45    0.000    0.000    0.000    0.000 arraysetops.py:138(_unpack_tuple)
    5    0.000    0.000    0.000    0.000 _split.py:622(__init__)
    5    0.000    0.000    0.000    0.000 warnings.py:474(__enter__)
   32    0.000    0.000    0.000    0.000 {method 'squeeze' of 'numpy.ndarray' objects}
   30    0.000    0.000    0.000    0.000 validation.py:231(<listcomp>)
   10    0.000    0.000    0.000    0.000 numeric.py:290(full)
   10    0.000    0.000    0.000    0.000 _split.py:423(__init__)
    8    0.000    0.000    0.026    0.003 fromnumeric.py:978(argsort)
    8    0.000    0.000    0.000    0.000 numeric.py:166(ones)
   64    0.000    0.000    0.000    0.000 stride_tricks.py:121(<genexpr>)
   32    0.000    0.000    0.000    0.000 stride_tricks.py:26(_maybe_view_as_subclass)
    5    0.000    0.000    0.000    0.000 warnings.py:181(_add_filter)
    4    0.000    0.000    0.000    0.000 {built-in method _bisect.bisect_left}
    5    0.000    0.000    0.001    0.000 _split.py:685(split)
    8    0.000    0.000    0.026    0.003 <__array_function__ internals>:2(argsort)
    5    0.000    0.000    0.000    0.000 _internal.py:865(npy_ctypes_check)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1648(ravel)
    4    0.000    0.000    0.002    0.000 fromnumeric.py:657(partition)
   10    0.000    0.000    0.000    0.000 validation.py:180(<genexpr>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2629(amin)
    4    0.000    0.000    0.002    0.001 function_base.py:3419(median)
   32    0.000    0.000    0.000    0.000 {built-in method builtins.iter}
   10    0.000    0.000    0.000    0.000 {built-in method builtins.max}
    5    0.000    0.000    0.000    0.000 warnings.py:453(__init__)
    5    0.000    0.000    0.000    0.000 warnings.py:165(simplefilter)
   32    0.000    0.000    0.000    0.000 function_base.py:2240(_cov_dispatcher)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(nonzero)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2189(any)
    5    0.000    0.000    0.000    0.000 validation.py:771(column_or_1d)
    5    0.000    0.000    0.000    0.000 {method 'remove' of 'list' objects}
   15    0.000    0.000    0.000    0.000 fromnumeric.py:74(<dictcomp>)
   32    0.000    0.000    0.000    0.000 function_base.py:289(_average_dispatcher)
    5    0.000    0.000    0.001    0.000 fromnumeric.py:2358(cumsum)
    4    0.000    0.000    0.002    0.001 <__array_function__ internals>:2(median)
    5    0.000    0.000    0.000    0.000 {method 'ravel' of 'numpy.ndarray' objects}
   13    0.000    0.000    0.000    0.000 {built-in method numpy.core._multiarray_umath.normalize_axis_index}
    4    0.000    0.000    0.002    0.000 <__array_function__ internals>:2(partition)
    5    0.000    0.000    0.001    0.000 <__array_function__ internals>:2(bincount)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(concatenate)
    4    0.000    0.000    0.000    0.000 core.py:6251(isMaskedArray)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(any)
    9    0.000    0.000    0.000    0.000 {method 'insert' of 'list' objects}
    5    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
    5    0.000    0.000    0.002    0.000 <__array_function__ internals>:2(cumsum)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(diff)
    4    0.000    0.000    0.000    0.000 {built-in method builtins.sorted}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1759(nonzero)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(amin)
   32    0.000    0.000    0.000    0.000 stride_tricks.py:139(_broadcast_to_dispatcher)
   45    0.000    0.000    0.000    0.000 arraysetops.py:146(_unique_dispatcher)
    4    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(moveaxis)
    5    0.000    0.000    0.000    0.000 _config.py:12(get_config)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(shape)
    5    0.000    0.000    0.000    0.000 multiclass.py:111(is_multilabel)
    5    0.000    0.000    0.000    0.000 warnings.py:493(__exit__)
   32    0.000    0.000    0.000    0.000 multiarray.py:635(result_type)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2277(all)
    5    0.000    0.000    0.000    0.000 validation.py:355(_ensure_no_complex_data)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(all)
    5    0.000    0.000    0.000    0.000 <__array_function__ internals>:2(ravel)
   18    0.000    0.000    0.000    0.000 multiarray.py:1043(copyto)
    8    0.000    0.000    0.000    0.000 numeric.py:1323(<listcomp>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1755(_nonzero_dispatcher)
    4    0.000    0.000    0.000    0.000 {method 'transpose' of 'numpy.ndarray' objects}
    5    0.000    0.000    0.000    0.000 {method 'copy' of 'dict' objects}
   15    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
    8    0.000    0.000    0.000    0.000 fromnumeric.py:974(_argsort_dispatcher)
    1    0.000    0.000    0.000    0.000 _methods.py:32(_amin)
    8    0.000    0.000    0.000    0.000 {built-in method _operator.index}
   15    0.000    0.000    0.000    0.000 {built-in method _warnings._filters_mutated}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1856(shape)
    5    0.000    0.000    0.000    0.000 multiarray.py:145(concatenate)
    4    0.000    0.000    0.000    0.000 function_base.py:3414(_median_dispatcher)
    1    0.000    0.000    0.000    0.000 {method 'min' of 'numpy.ndarray' objects}
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2185(_any_dispatcher)
    5    0.000    0.000    0.000    0.000 multiarray.py:853(bincount)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1852(_shape_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2354(_cumsum_dispatcher)
    5    0.000    0.000    0.000    0.000 function_base.py:1143(_diff_dispatcher)
    1    0.000    0.000    0.000    0.000 {method 'max' of 'numpy.ndarray' objects}
    4    0.000    0.000    0.000    0.000 numeric.py:1399(<listcomp>)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2273(_all_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:2624(_amin_dispatcher)
    4    0.000    0.000    0.000    0.000 fromnumeric.py:653(_partition_dispatcher)
    5    0.000    0.000    0.000    0.000 fromnumeric.py:1644(_ravel_dispatcher)
    4    0.000    0.000    0.000    0.000 numeric.py:1332(_moveaxis_dispatcher)
    1    0.000    0.000    0.000    0.000 _methods.py:28(_amax)
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Elkan

1 Answer

Recent versions of NumPy support an __array_function__ hook that objects can implement to customize what arbitrary NumPy callables do when called on them. Support is disabled by default in 1.16, enabled by default in 1.17, and expected to eventually be enabled unconditionally.
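As a rough illustration of what such a hook looks like, here is a minimal sketch assuming NumPy 1.17 or later. LoggedArray is a hypothetical class invented for this example; it is not part of NumPy. It records the name of each public NumPy function called on it, then unwraps itself and defers to the default implementation.

```python
import numpy as np


class LoggedArray:
    """Hypothetical wrapper type, used only to illustrate the hook."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # NumPy passes the public function that was called (e.g. np.sum).
        self.calls.append(func.__name__)
        # Unwrap LoggedArray positional arguments and fall back to
        # NumPy's default implementation on plain ndarrays.
        unwrapped = tuple(
            a.data if isinstance(a, LoggedArray) else a for a in args
        )
        return func(*unwrapped, **kwargs)


x = LoggedArray([[1.0, 2.0], [3.0, 4.0]])
total = np.sum(x)        # routed through x.__array_function__
product = np.dot(x, x)   # likewise
print(x.calls)
```

For plain ndarray inputs no override is found, so the same dispatch step simply ends up calling the default implementation.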

implement_array_function is the dispatcher that calls either a default implementation or an __array_function__ hook, to implement __array_function__ support. As designed, it is intended to be called once for literally every single call to a public NumPy callable, including calls happening within NumPy, and it has to do a lot of method lookups. Hopefully future optimization work will reduce some of this overhead.
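One way to gauge the per-call cost is to time a cheap operation on tiny arrays, where the fixed overhead dominates the arithmetic. The sketch below compares the public function np.dot(a, b) with the ndarray method a.dot(b); it assumes that ndarray methods are not routed through the __array_function__ dispatcher, which matches NEP 18 as far as I know. The array size and repeat count are arbitrary choices, and the numbers will vary by machine and NumPy version. Note also that a profiler may attribute the run time of the underlying C implementation to implement_array_function, so the figure in a profile is not necessarily pure dispatch overhead.

```python
import timeit

import numpy as np

# Tiny inputs, so the fixed per-call overhead dominates the actual math.
a = np.ones((3, 3))
b = np.ones((3, 3))

n = 100_000  # arbitrary repeat count for this sketch

# Public function: goes through the __array_function__ dispatch step.
via_function = timeit.timeit(lambda: np.dot(a, b), number=n)

# ndarray method: assumed to skip that dispatch step.
via_method = timeit.timeit(lambda: a.dot(b), number=n)

print(f"np.dot(a, b): {via_function / n * 1e6:.2f} us per call")
print(f"a.dot(b):     {via_method / n * 1e6:.2f} us per call")
```

If the two timings are close, the dispatcher is probably not the real bottleneck, and the time is more likely spent in the matrix products themselves.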

You can see additional details in NEP 18, and you can check the function's docstring with help(numpy.core._multiarray_umath.implement_array_function):

Help on built-in function implement_array_function in module numpy.core._multiarray_umath:

implement_array_function(...)
    Implement a function with checks for __array_function__ overrides.

    All arguments are required, and can only be passed by position.

    Arguments
    ---------
    implementation : function
        Function that implements the operation on NumPy array without
        overrides when called like ``implementation(*args, **kwargs)``.
    public_api : function
        Function exposed by NumPy's public API originally called like
        ``public_api(*args, **kwargs)`` on which arguments are now being
        checked.
    relevant_args : iterable
        Iterable of arguments to check for __array_function__ methods.
    args : tuple
        Arbitrary positional arguments originally passed into ``public_api``.
    kwargs : dict
        Arbitrary keyword arguments originally passed into ``public_api``.

    Returns
    -------
    Result from calling ``implementation()`` or an ``__array_function__``
    method, as appropriate.

    Raises
    ------
    TypeError : if no implementation is found.
user2357112
  • Thanks. Any way to speed it up? – Elkan Nov 18 '19 at 07:55
  • @Elkan: Contribute optimizations to the NumPy project and push for them to get merged? Roll back to an old NumPy version and see if that helps? Those are my two primary ideas. – user2357112 Nov 18 '19 at 08:00