import numpy as np
def gen_c():
    c = np.ones(5, dtype=int)
    j = 0
    t = 10
    while j < t:
        c[0] = j
        yield c.tolist()
        j += 1 

# What I did:
# res = np.array(list(gen_c())) <-- useless allocation of memory

# this line is what I'd like to do and it's killing me
res = np.fromiter(gen_c(), dtype=int) # dtype=list ?

This raises: ValueError: setting an array element with a sequence.

This is a very simple piece of code; what I want is to create an array of lists (ultimately a 2D array) from a generator.

Although I searched everywhere, I still cannot figure out how to make it work.

hpaulj
XXXXXL

2 Answers


In NumPy versions before 1.23, numpy.fromiter() can only create 1-dimensional arrays, not 2-D ones, as its documentation states:

numpy.fromiter(iterable, dtype, count=-1)

Create a new 1-dimensional array from an iterable object.

One option is to change the generator so that it yields the individual values of c one at a time, build a 1D array from them, and then reshape that array to (-1, 5). Example:

import numpy as np
def gen_c():
    c = np.ones(5, dtype=int)
    j = 0
    t = 10
    while j < t:
        c[0] = j
        for i in c:
            yield i
        j += 1

np.fromiter(gen_c(),dtype=int).reshape((-1,5))

Demo -

In [5]: %paste
import numpy as np
def gen_c():
    c = np.ones(5, dtype=int)
    j = 0
    t = 10
    while j < t:
        c[0] = j
        for i in c:
            yield i
        j += 1

np.fromiter(gen_c(),dtype=int).reshape((-1,5))

## -- End pasted text --
Out[5]:
array([[0, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [2, 1, 1, 1, 1],
       [3, 1, 1, 1, 1],
       [4, 1, 1, 1, 1],
       [5, 1, 1, 1, 1],
       [6, 1, 1, 1, 1],
       [7, 1, 1, 1, 1],
       [8, 1, 1, 1, 1],
       [9, 1, 1, 1, 1]])
Anand S Kumar
  • PS: in fact, this solution (I mean mine, not yours) does not improve the performance in my case... – XXXXXL Oct 07 '15 at 16:29
  • Yes, because `c.tolist()` is muuuch faster than looping through `c` and yielding each value – Anand S Kumar Oct 07 '15 at 16:38
  • @XXXXXL Since numpy 1.23, [`fromiter()`](https://numpy.org/doc/stable/reference/generated/numpy.fromiter.html) can generate 2d arrays. For this example: `np.fromiter(gen_c(), np.dtype((int, 5)))`. However it is still slower than the alternative given by the OP. – Javier TG Nov 24 '22 at 15:20
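The subarray-dtype approach from the last comment can be sketched as follows. This is only a minimal illustration and assumes NumPy 1.23 or newer; on older versions the call is expected to fail. The generator is the one from the question, with the while loop written as a for loop.

```python
import numpy as np

def gen_c():
    # Same rows as in the question: first element counts up, the rest stay 1.
    c = np.ones(5, dtype=int)
    for j in range(10):
        c[0] = j
        yield c.tolist()

# Assumption: NumPy >= 1.23, where fromiter() accepts a subarray dtype.
# Each yielded list of 5 ints becomes one row of the result.
res = np.fromiter(gen_c(), dtype=np.dtype((int, 5)))
print(res.shape)
```

As the comment notes, this may still be slower than np.array(list(gen_c())), so it is worth timing both on real data.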

As the docs say, np.fromiter() only builds 1-dimensional arrays. You can flatten the iterable first with itertools.chain.from_iterable(), then restore the second dimension with np.reshape():

import itertools
import numpy as np

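def fromiter2d(it, dtype):
    # clone the iterator so one copy can be consumed to count the rows
    it, it2 = itertools.tee(it)
    length = sum(1 for _ in it2)

    # flatten the rows into one stream of scalars for np.fromiter()
    flattened = itertools.chain.from_iterable(it)
    array_1d = np.fromiter(flattened, dtype)
    # restore the row dimension; the column count is inferred
    array_2d = np.reshape(array_1d, (length, -1))
    return array_2d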

Demo:

>>> iter2d = (range(i, i + 4) for i in range(0, 12, 4))

>>> fromiter2d(iter2d, int)
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Only tested on Python 3.6, but should also work with Python 2.
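One caveat with the function above: itertools.tee() has to buffer every row while the second copy is consumed for counting, so the whole input is held in memory anyway. A possible variant, shown here only as a sketch, takes the row width from the first row instead. The name fromiter2d_peek is hypothetical, and the code assumes all rows have the same nonzero length.

```python
import itertools
import numpy as np

def fromiter2d_peek(rows, dtype):
    # Hypothetical helper: infer the column count from the first row
    # instead of counting rows with itertools.tee().
    rows = iter(rows)
    try:
        first = list(next(rows))
    except StopIteration:
        # No rows at all: return an empty 2D array.
        return np.empty((0, 0), dtype=dtype)
    width = len(first)
    # Put the first row back in front of the remaining flattened rows.
    flat = itertools.chain(first, itertools.chain.from_iterable(rows))
    # Assumes every row has `width` elements and width > 0.
    return np.fromiter(flat, dtype=dtype).reshape(-1, width)

iter2d = (range(i, i + 4) for i in range(0, 12, 4))
result = fromiter2d_peek(iter2d, int)
print(result)
```

With the demo input above this should produce the same 3x4 array, while consuming the rows only once.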

Arnie97