
I tested the code for both numpy.take and slicing as follows:

import numpy as np
import time


a = np.random.randn(4000000, 500)
b = np.arange(0, len(a))
t1 = time.time()
for i in range(10):
    a[b != 2]                # boolean-mask (advanced) indexing
t2 = time.time()
print(t2 - t1)

t1 = time.time()
for i in range(10):
    a.take(b != 2, axis=0)   # np.take with the same boolean mask
t2 = time.time()
print(t2 - t1)

I checked my CPU usage and most of the cores are idle; only one core is used. As a result, the timings are very slow:

65.91494154930115
47.01117730140686

It seems to me that slicing is a parallelizable operation. Why is numpy not parallelizing it? Is it that numpy doesn't support parallelized slicing, or do I need to use some special function in numpy?
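
For reference, np.take expects integer indices rather than a boolean mask, so a like-for-like comparison would convert the mask first. A minimal sketch (it mirrors the variable names in the snippet above and assumes np.flatnonzero as the conversion; the array is ~16 GB of float64, as in the original benchmark):

import numpy as np
import time

a = np.random.randn(4000000, 500)
b = np.arange(0, len(a))
idx = np.flatnonzero(b != 2)     # integer indices of the rows to keep

t1 = time.time()
for i in range(10):
    a[b != 2]                    # boolean-mask (advanced) indexing: copies the selected rows
t2 = time.time()
print('boolean mask:', t2 - t1)

t1 = time.time()
for i in range(10):
    a.take(idx, axis=0)          # np.take with integer indices: also copies the selected rows
t2 = time.time()
print('np.take     :', t2 - t1)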

user10024395
  • Related: Why isn't numpy.mean multithreaded? https://stackoverflow.com/questions/16617973/why-isnt-numpy-mean-multithreaded – NPE Jun 05 '18 at 04:27
  • I wouldn't call this `slicing`. `slicing` is basic indexing, using notation like `a[1:100,:]`, which returns a `view` and is quite fast, since it doesn't have to copy any data. With `b!=2`, you are doing advanced indexing, with a boolean mask. It does copy data, in this case all elements except one row. – hpaulj Jun 05 '18 at 04:44
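
A minimal sketch of the view-versus-copy distinction described in the comment above, using np.shares_memory to check whether the result aliases the original buffer (a much smaller array is used here to keep it cheap):

import numpy as np

a = np.random.randn(1000, 500)

view = a[1:100, :]                    # basic slicing: returns a view, no data copied
print(np.shares_memory(view, a))      # True: shares the original buffer

mask = np.arange(len(a)) != 2
selected = a[mask]                    # boolean (advanced) indexing: allocates and copies
print(np.shares_memory(selected, a))  # False: its own buffer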

0 Answers