
I am using Python 3 with NumPy. NumPy operations seem to use all my cores well, but when I wrap a function with np.vectorize, for instance:

f = lambda x: (x*1000) / 20 * 15 + 3
v_func = np.vectorize(f) 
v_func(arr) 

it uses only one core for a long time (according to the CPU utilization chart).

How can I make it use NumPy's multi-core capabilities?

yatu
thebeancounter
  • Neither lambda nor `vectorize` are actually vectorized in the numpy sense. `vectorize` is a confusingly-named convenience method that's basically a python `for` loop. – roganjosh Feb 07 '19 at 07:16
  • `arr = (arr * 1000) / 25 * 15 + 3` is vectorized. Treat the array as though it was a scalar. – roganjosh Feb 07 '19 at 07:20
  • For some reason many people stop reading the `np.vectorize` documentation half way through, and miss the disclaimer about performance. `The implementation is essentially a for loop.` – hpaulj Feb 07 '19 at 07:42
  • OK, so is there a way to implement something that is more effective using the existing numpy tools? – thebeancounter Feb 07 '19 at 07:48
  • Look into `numexpr` module. – Divakar Feb 07 '19 at 08:39
  • @Divakar would you like to post an answer with example and I will accept it? – thebeancounter Feb 07 '19 at 09:10
  • Should be pretty straight-forward. Would encourage you to post your own answer on this. [`Related post`](https://stackoverflow.com/a/49901875/3293881) on how to control multi-core functionality. – Divakar Feb 07 '19 at 09:18
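
The point made in the comments above can be sketched as follows. This is a minimal illustration, assuming a float array of 100,000 elements (the size and dtype are my own choice, not from the question). It applies the question's formula once through np.vectorize and once directly to the whole array, then checks that both give the same values. Note that the direct form removes the per-element Python call, which is usually the dominant cost; it is not by itself a multi-core computation.

```python
import numpy as np

# Illustrative input; the size and dtype are assumptions, not from the question.
arr = np.arange(100_000, dtype=np.float64)

# The question's function wrapped with np.vectorize: one Python call per element.
f = lambda x: (x * 1000) / 20 * 15 + 3
looped = np.vectorize(f)(arr)

# The same arithmetic applied to the whole array at once.
direct = (arr * 1000) / 20 * 15 + 3

# Both approaches should produce the same values.
same = np.allclose(looped, direct)
print(same)
```

Timing the two lines (for example with timeit) is the easiest way to see the difference on a given machine.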

1 Answer


This can be done with the numexpr library, using the following code:

import numexpr as ne
import numpy as np

# Input array: the integers 0..99999
arr = np.arange(100000)
# numexpr evaluates the expression string, looking up `arr` in the calling scope
b = ne.evaluate("(arr * 1000) / 25 * 15 + 3")
print(b)

The library evaluates the expression in a vectorized way and does make use of multithreading, as explained here
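
For controlling how many threads numexpr uses, a rough sketch along these lines may help. It assumes numexpr's `set_num_threads` and `detect_number_of_cores` helpers; the cap of 4 threads and the float input array are arbitrary choices for illustration. The result is checked against plain NumPy arithmetic.

```python
import numexpr as ne
import numpy as np

# Illustrative input; the dtype is an assumption made for this sketch.
arr = np.arange(100_000, dtype=np.float64)

# Use at most 4 threads (an arbitrary cap), never more than the detected cores.
n_threads = min(4, ne.detect_number_of_cores())
previous = ne.set_num_threads(n_threads)  # returns the previous setting

# Pass the array explicitly instead of relying on lookup in the calling frame.
result = ne.evaluate("(arr * 1000) / 25 * 15 + 3", local_dict={"arr": arr})

# Cross-check against plain NumPy arithmetic.
expected = (arr * 1000) / 25 * 15 + 3
matches = np.allclose(result, expected)
print(matches)
```

For an expression this cheap, the benefit of extra threads tends to show up only on large arrays, so it is worth timing on your own data.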

Mark Setchell
thebeancounter