
I have a generator that yields numpy arrays. For the sake of example, let it be:

import numpy as np
a = np.arange(9).reshape(3,3)
gen = (x for x in a)

Calling:

np.sum(gen)

On NumPy 1.17.4, this gives:

DeprecationWarning: Calling np.sum(generator) is deprecated, and in the future will give a different result. Use np.sum(np.fromiter(generator)) or the python sum builtin instead.

Trying to refactor the above:

np.sum(np.fromiter(gen, dtype=np.ndarray))

I get:

ValueError: cannot create object arrays from iterator

What is wrong in the above statement?

T81
  • For the given example, what's the expected result? Is it 36 (summing all the elements) or is it `[ 3, 12, 21]` (summing row-wise)? – Andreas K. Feb 10 '20 at 12:31

3 Answers


The problem is the second argument, np.ndarray, passed to np.fromiter(). np.fromiter expects an iterable of scalars and returns a 1-D array. From the documentation:

Create a new 1-dimensional array from an iterable object.

Therefore, you cannot create object arrays from an iterator. Furthermore, keeping the .reshape(3,3) will also raise an error, for the reason stated in the first line: the generator would then yield rows rather than scalars. All in all, this works:

import numpy as np
a = np.arange(9)
gen = (x for x in a)
print(np.sum(np.fromiter(gen,float)))

Output:

36.0
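
If the generator has to keep yielding rows, as in the question, one possible way to still go through np.fromiter is to flatten the rows first. This is only a sketch: it assumes the total over all elements is what is wanted, and the use of itertools.chain.from_iterable is my own choice, not something from the question.

```python
import itertools

import numpy as np

a = np.arange(9).reshape(3, 3)
gen = (x for x in a)  # yields the rows of a, one 1-D array at a time

# Chain the rows into a single stream of scalars, which is what
# np.fromiter expects, and use a numeric dtype instead of np.ndarray.
flat = np.fromiter(itertools.chain.from_iterable(gen), dtype=a.dtype)

total = flat.sum()
print(total)
```

Note that this discards the row structure, so it cannot give a column-wise or row-wise sum.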
Celius Stingher
  • The error in the OP has nothing to do with the dimensionality of the input. You just can't create object arrays from iterators (because it's not implemented), e.g. `np.fromiter(iter(range(3)), dtype=object)`. The reason is probably that object arrays need to deal with reference counting rather than converting the items to an internal data type. Only if you specified `dtype=float` would you run into the dimension problem. However, I don't see how your answer addresses the OP's question: the OP states that they have a generator that returns arrays, while yours yields numbers (this is just `a.sum()`). – a_guest Feb 10 '20 at 15:39
  • I think we're addressing different issues. The title of the question refers to `np.sum(np.fromiter())`, and that's why I'm answering with the same tools the OP uses. Judging by the accepted answer, you understood the intent of the OP's question perfectly; however, that answer bears no relation to the title, nor does it make the functions the OP asks about work. – Celius Stingher Feb 10 '20 at 15:50

Since you're summing instances of arrays you can just use the built-in sum:

result = sum(gen)
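
A minimal sketch of what this does with the example from the question: the built-in sum adds the yielded rows element-wise, so the result is the column-wise sum, and a further .sum() gives the grand total. The variable names here are just for illustration.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
gen = (x for x in a)  # yields the rows of a

# Built-in sum starts from 0 and adds each row in turn; the rows
# are added element-wise, so the result is itself a 1-D array.
col_sums = sum(gen)
print(col_sums)

# Summing that array once more gives the total over all elements.
total = col_sums.sum()
print(total)
```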
a_guest
  • Note that this gives the column-wise sum of the array a, i.e. the result is `[ 9, 12, 15]`. P.s. I didn't downvote your answer. – Andreas K. Feb 10 '20 at 12:47
  • @AndreasK. This gives exactly the same result as the OP's code which raised a warning: `np.sum(gen)`. And I cannot imagine what other result you would want to have (except of course summing everything together which is trivial by doing `sum(gen).sum()`). Since the generator yields *rows* of `a` I cannot imagine that the OP expected a row-wise sum, since that's just another generator: `(x.sum() for x in gen)`. – a_guest Feb 10 '20 at 15:15

What about simply converting your generator to a list and then passing it to np.sum? Build the list once and reuse it:

import numpy as np
a = np.arange(9).reshape(3,3)
gen = (x for x in a)
rows = list(gen)  # materialize once; gen is exhausted afterwards

Summing all the elements:

>>> np.sum(rows)
36

Summing column-wise:

>>> np.sum(rows, axis=0)
array([ 9, 12, 15])

Summing row-wise:

>>> np.sum(rows, axis=1)
array([ 3, 12, 21])
Andreas K.