2

I am getting Numpy ValueError: setting an array element with a sequence when attempting to turn mysql data into a numpy array (eventually a list). Originally I wanted to have multiple fields but I've simplified the code to only making a list with one field (integer) for troubleshooting purposes.

I'm new to numpy so I'm not exactly sure what i4 represents (although docs say it means an integer.... ok). As for the count, that seems to refer to the number of fields. However, something on that fromiter line is still causing it to hit an exception.

import MySQLdb
import numpy

conn = MySQLdb.connect(host="localhost", user="x", passwd="x", db="x")
curs = conn.cursor() 
numrows = curs.execute("select id from table")

A = numpy.fromiter(curs.fetchall(), count=numrows, dtype=('i4'))

print A 
ids = A['f0'] 

Traceback:

A = numpy.fromiter(curs.fetchall(), count=1, dtype=('i4'))
ValueError: setting an array element with a sequence.
derpy
  • 67
  • 1
  • 6
  • Please add the full error traceback to your question. – cel Sep 22 '15 at 17:04
  • I added the relevant lines. It really doesn't show anything more than the errors I already quoted. – derpy Sep 22 '15 at 17:08
  • It does show the line numbers among other useful information. Just copy the complete output to your question. – cel Sep 22 '15 at 17:11
  • There are only 8 lines in this entire script. The fromiter one is the only exception raised. – derpy Sep 22 '15 at 17:26

1 Answers1

4

correction - to work with a list of tuples from fetchall, the dtype should produce a structured array

Looking at the documentation I see that fetch_all returns a list of tuples, not a generator. But that isn't the issue here. Both are iterables. The problem is with the dtype. To make a 1d array from a list of tuples, fromiter requires a structured, compound, dtype.

This dtype works with a 1 element tuple:

In [355]: np.fromiter([(1,)],dtype=[('f0','i4')])
Out[355]: 
array([(1,)], dtype=[('f0', '<i4')])

This works with 2 fields (columns)

In [356]: np.fromiter([(1,1)],dtype=('i4,i4'))
Out[356]: 
array([(1, 1)],   dtype=[('f0', '<i4'), ('f1', '<i4')])

But these are the same - a plain, non-structured array.

np.fromiter([(1,)],dtype=('i4'))
np.fromiter([(1,)],dtype=int)

[(1,)] is handled the same as [[1]], the input for a 2d array, not the 1d iterable that fromiter expects.


np.array works the same as fromiter for the structured cases:

 np.array([(1,)],dtype=[('f0','i4')])
 np.array([(1,1)],dtype=('i4,i4'))

It also works with int (or i4), but the result is a 2d array:

In [366]: np.array([(1,)],dtype=('i4'))
Out[366]: array([[1]])

(earlier version)

I can reproduce your error message by giving fromiter a [(1,)] iterable.

In [288]: np.fromiter([(1,)],dtype=int)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-288-ba24373a9489> in <module>()
----> 1 np.fromiter([(1,)],dtype=int)

ValueError: setting an array element with a sequence.

fromiter wants a 1d input, e.g. [1,2,3] (or the generator equivalent).

It's been a while since I worked with sql, but my guess is that curs.fetchall() gives a iterable of tuples, not an iterable of single numbers.

You need to display (print) curs.fetchall() or list(curs.fetchall()) to see what what is being passed to fromiter.

Why are you using fromiter? Have you tried np.array(curs.fetchall())?

Let's try a generator expression to better simulate a fetchall that generates a tuple:

In [298]: np.fromiter((i for i in [(1,2,3)]),dtype=int)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-298-f8fbf106b4d1> in <module>()
----> 1 np.fromiter((i for i in [(1,2,3)]),dtype=int)

ValueError: setting an array element with a sequence.
In [299]: np.array((i for i in [(1,2,3)]),dtype=int)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-299-71dd7463b539> in <module>()
----> 1 np.array((i for i in [(1,2,3)]),dtype=int)

TypeError: int() argument must be a string or a number, not 'generator'

This works:

In [300]: np.array(list(i for i in [(1,2,3)]),dtype=int)
Out[300]: array([[1, 2, 3]])

In [301]: list(i for i in [(1,2,3)])
Out[301]: [(1, 2, 3)]

The simplest way to create a numpy array is with a list - it can be a list of numbers, a list of lists (all the same size) or a list of tuples.

What's the most efficient way to convert a MySQL result set to a NumPy array? is an earlier discussion of using fetchall and fromiter.


Community
  • 1
  • 1
hpaulj
  • 221,503
  • 14
  • 230
  • 353
  • 1
    Fromiter is what I found on google. All I want to do is move mysql data into some lists so I can graph the data. – derpy Sep 22 '15 at 17:44
  • In any, let's be clear about what you get from `cur`. Try to create a basic Python list and display that. Then we can address making an `np.array`. – hpaulj Sep 22 '15 at 17:47
  • Thanks. I'm still looking into it but it would seem as though this would relate to the fact I was only using one field (as a test), but it was actually expecting at least 2 or something. I created a new table with 2 fields of an integer each, to test, and was then able to run this code without an exception. – derpy Sep 22 '15 at 18:00
  • I figured out why 2 fields works, but 1 doesn't - `dtype=('i4')` does not produce a structured array, while `dtype=('i4,i4')` does. – hpaulj Sep 22 '15 at 21:29