Creating numpy array from dict

Question

Assume I have a dict, call it coeffs:

coeffs = {'X1': 0.1, 'X2':0.2, 'X3':0.4, ..., 'Xn':0.09}

How can I convert the values into a 1 x n ndarray?

Into a n x m ndarray?

We are stumped as well, because you don't explain what you want to do. It's easy to get a list of values from the dictionary, and an array from that. But that does nothing with the dictionary keys. — hpaulj, Nov 11 '15 at 20:01
@hpaulj "How can I convert the values into a 1 x n array" is not an explanation of what I want to do? Not all of us on this website are python experts... — GPB, Nov 12 '15 at 00:54
I had to guess that you wanted `coeffs['X1']` to be the 1st item in the list, `coeffs['Xn']` the nth. But I'd prefer that you made that kind of detail explicit. I'm still at a loss as to what an `nxm` array of this data would look like. — hpaulj, Nov 12 '15 at 03:00
@GPB: I, for one, was not aware that the numbers after the `X` were supposed to be indices. It should have been clear, but ... — serv-inc, Nov 12 '15 at 09:08
@user - point taken. But the way I notated it, I didn't think it mattered. — GPB, Nov 12 '15 at 20:34

score 2 · Answer 1 · edited Sep 20 '17 at 21:41

Here's an example of using your coeffs to fill in an array, with value indices derived from the dictionary keys:

In [591]: coeffs = {'X1': 0.1, 'X2':0.2, 'X3':0.4, 'X4':0.09}
In [592]: alist = [[int(k[1:]),v] for k,v in coeffs.items()]
In [593]: alist
Out[593]: [[4, 0.09], [3, 0.4], [1, 0.1], [2, 0.2]]

Here I stripped off the initial character and converted the rest to an integer. You could do your own conversion.

Now just initialize a zero-filled array, and fill in the values:

In [594]: X = np.zeros((5,))
In [595]: for k,v in alist: X[k] = v
In [596]: X
Out[596]: array([ 0.  ,  0.1 ,  0.2 ,  0.4 ,  0.09])

Obviously I could have used X = np.zeros((1,5)). An (n,m) array doesn't make sense unless there's a basis for choosing n for each dictionary item.
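The fill-by-index idea above can be wrapped in a small helper. This is only a sketch, not part of the original answer: the name `dict_to_vector` is hypothetical, and it assumes every key is one prefix character followed by a 1-based integer with no gaps, so that 'X1' lands in column 0 of a 1 x n array.

```python
import numpy as np

def dict_to_vector(coeffs):
    """Return a 1 x n array with coeffs['Xk'] stored in column k-1.

    Assumes keys look like 'X1', 'X2', ... (one prefix character, then a
    1-based integer index) and that no index is missing.
    """
    n = len(coeffs)
    out = np.zeros((1, n))
    for key, value in coeffs.items():
        # Strip the prefix character and shift the index to 0-based.
        out[0, int(key[1:]) - 1] = value
    return out

coeffs = {'X1': 0.1, 'X2': 0.2, 'X3': 0.4, 'X4': 0.09}
row = dict_to_vector(coeffs)
print(row.shape)
```

Because the position comes from the key rather than from iteration order, the result does not depend on how the dict happens to be ordered.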

Just for laughs, here's another way of making an array from a dictionary, putting the keys and values into the fields of a structured array:

In [613]: X = np.zeros(len(coeffs),dtype=[('keys','S3'),('values',float)])
In [614]: X
Out[614]: 
array([(b'', 0.0), (b'', 0.0), (b'', 0.0), (b'', 0.0)], 
      dtype=[('keys', 'S3'), ('values', '<f8')])
In [615]: for i,(k,v) in enumerate(coeffs.items()):
   .....:     X[i]=(k,v)
   .....:     
In [616]: X
Out[616]: 
array([(b'X4', 0.09), (b'X3', 0.4), (b'X1', 0.1), (b'X2', 0.2)], 
      dtype=[('keys', 'S3'), ('values', '<f8')])
In [617]: X['keys']
Out[617]: 
array([b'X4', b'X3', b'X1', b'X2'], 
      dtype='|S3')
In [618]: X['values']
Out[618]: array([ 0.09,  0.4 ,  0.1 ,  0.2 ])
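The same structured array can probably be built in one call instead of a loop. The sketch below is an assumption-laden variant, not the answer's own code: it uses the 'U3' (unicode) dtype instead of 'S3' so the keys come back as str, and it sorts on the 'keys' field to get a reproducible order.

```python
import numpy as np

coeffs = {'X1': 0.1, 'X2': 0.2, 'X3': 0.4, 'X4': 0.09}

# A list of (key, value) tuples maps directly onto the two fields.
# 'U3' holds up to 3 unicode characters; longer keys would be truncated.
dt = [('keys', 'U3'), ('values', float)]
X = np.array(list(coeffs.items()), dtype=dt)

# Sorting on the 'keys' field gives an order that does not depend on the dict.
X_sorted = np.sort(X, order='keys')
print(X_sorted['keys'])
print(X_sorted['values'])
```

Note that sorting string keys is lexicographic, so this ordering is only numeric while the indices have the same number of digits.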

The scipy sparse module has a sparse matrix format (dok, "dictionary of keys") that stores its values in a dictionary; in the SciPy version used here it is a subclass of dict. The keys in this dictionary are (i,j) tuples, the indices of the nonzero elements. Sparse has tools for quickly converting such a matrix into other, more computationally friendly sparse formats, and into regular dense arrays.

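I learned in other SO questions that a fast way to build such a matrix is to use the regular dictionary update method to copy values from another dictionary. This relies on the SciPy version in use when this was written; later releases may not allow calling update directly on a dok matrix.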

Inspired by @user's 2d version of this problem, here's how such a sparse matrix could be created.

Start with @user's sample coeffs:

In [24]: coeffs
Out[24]: 
{'Y8': 22,
 'Y2': 16,
 'Y6': 20,
 'X5': 20,
 'Y9': 23,
 'X2': 17,
  ...
 'Y1': 15,
 'X4': 19}

Define a little function that converts the X3 style of key to the (0,3) style:

In [25]: def decodekey(akey):
   ....:     pt1,pt2 = akey[0],akey[1:]
   ....:     i = {'X':0, 'Y':1}[pt1]
   ....:     j = int(pt2)
   ....:     return i,j
   ....:

Apply it with a dictionary comprehension to coeffs (or use a regular loop in earlier Python versions):

In [26]: coeffs1 = {decodekey(k):v for k,v in coeffs.items()}
In [27]: coeffs1
Out[27]: 
{(1, 2): 16,
 (0, 1): 16,
 (0, 0): 15,
 (1, 4): 18,
 (1, 5): 19,
 ...
 (0, 8): 23,
 (0, 2): 17}

Import sparse and define an empty dok matrix:

In [28]: from scipy import sparse
In [29]: M=sparse.dok_matrix((2,10),dtype=int)
In [30]: M.A
Out[30]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

Fill it with the coeffs1 dictionary values:

In [31]: M.update(coeffs1)
In [33]: M.A   # convert to dense array
Out[33]: 
array([[15, 16, 17, 18, 19, 20, 21, 22, 23, 24],
       [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
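A sparse matrix can also be built from an (i,j)-keyed dictionary without going through the dok format. The sketch below is one alternative, not the method used above: it passes the values and the row/column indices to `coo_matrix`, which accepts `(data, (row, col))`. The small `coeffs1` here is a made-up sample in the same form.

```python
import numpy as np
from scipy import sparse

# Hypothetical sample in the (row, column) -> value form shown above.
coeffs1 = {(0, 0): 15, (0, 1): 16, (1, 0): 14, (1, 2): 16}

# keys() and values() iterate in the same order, so the triples stay aligned.
rows, cols = zip(*coeffs1.keys())
data = list(coeffs1.values())

M = sparse.coo_matrix((data, (rows, cols)), shape=(2, 3), dtype=int)
dense = M.toarray()   # convert to a regular dense array
print(dense)
```

Positions that have no entry in the dictionary stay at zero in the dense result.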

Actually, I don't need to use sparse to convert coeffs1 into an array. The (i,j) tuple can index an array directly: A[(i,j)] is the same as A[i,j].

In [34]: A=np.zeros((2,10),int)
In [35]: for k,v in coeffs1.items():
   ....:     A[k] = v
   ....:     
In [36]: A
Out[36]: 
array([[15, 16, 17, 18, 19, 20, 21, 22, 23, 24],
       [14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
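The loop over items can likely be replaced by a single indexed assignment. This sketch assumes the keys are already (i,j) tuples, as in coeffs1, and uses a small made-up sample; it splits the keys into a row-index array and a column-index array and assigns all values at once.

```python
import numpy as np

# Hypothetical sample in the (row, column) -> value form.
coeffs1 = {(0, 0): 15, (0, 1): 16, (1, 0): 14, (1, 2): 16}

idx = np.array(list(coeffs1.keys()))     # shape (n_items, 2)
vals = np.array(list(coeffs1.values()))  # same order as the keys

A = np.zeros((2, 3), dtype=int)
# Integer-array indexing: pairs idx[k, 0], idx[k, 1] select one cell each.
A[idx[:, 0], idx[:, 1]] = vals
print(A)
```

For a dictionary of this size the loop is just as good; the indexed form mainly helps when there are many entries.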

The interpretation of the numbers as indices was nice. See below for a nxm-solution. (with further interpretation) — serv-inc, Nov 12 '15 at 09:08
I found a couple of ways of creating your 2 row array. Once a key like 'X3' is converted to a tuple like `(0,3)`, making an array is almost trivial. — hpaulj, Nov 12 '15 at 19:36

score 1 · Answer 2 · edited Jun 20 '20 at 09:12


Concerning a `n` x `m` array

@hpaulj's answer assumed (rightly) that the numbers after the X were supposed to be positions. If you had data like

coeffs = {'X1': 3, 'X2' : 5, ..., 'Xn' : 34, 'Y1': 5, 'Y2' : -3, ..., 'Yn': 32}

You could do as follows. Given sample data like

{'Y3': 17, 'Y2': 16, 'Y8': 22, 'Y5': 19, 'Y6': 20, 'Y4': 18, 'Y9': 23, 'Y1': 15, 'X8': 23, 'X9': 24, 'Y7': 21, 'Y0': 14, 'X2': 17, 'X3': 18, 'X0': 15, 'X1': 16, 'X6': 21, 'X7': 22, 'X4': 19, 'X5': 20}

created by

a = {}
for i in range(10):
    a['X'+str(i)] = 15 + i
for i in range(10):
    a['Y'+str(i)] = 14 + i

Put it in a nested dictionary, keyed first by letter and then by integer index (inefficient, but easy)

b = {}
for k, v in a.items():
    letter = k[0]
    index = int(k[1:])  # integer, so it can be used as an array index
    if letter not in b:
        b[letter] = {}
    b[letter][index] = v

gives

>>> b
{'Y': {0: 14, 1: 15, 2: 16, 3: 17, 4: 18, 5: 19, 6: 20, 7: 21, 8: 22, 9: 23}, 'X': {0: 15, 1: 16, 2: 17, 3: 18, 4: 19, 5: 20, 6: 21, 7: 22, 8: 23, 9: 24}}

Find out the target dimensions of the array. (This assumes all rows have the same length and no values are missing.)

row_indices = sorted(b.keys())
# indices start at 0, so the row length is the largest index plus one
row_length = int(max(b[row_indices[0]])) + 1

Create the array via

X = np.empty((len(row_indices), row_length))

and insert the data:

for i, row in enumerate(row_indices):
    for j in range(row_length):
        X[i, j] = b[row][j]

Result

>>> X
array([[ 15.,  16.,  17.,  18.,  19.,  20.,  21.,  22.,  23.,  24.],
       [ 14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,  22.,  23.]])
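The steps above assume that every index is present. A possible variant that tolerates gaps is sketched below; the function name `dict_to_2d` and the choice of NaN as the fill value are assumptions for illustration, and keys are still expected to be one letter followed by a 0-based integer.

```python
import numpy as np

def dict_to_2d(coeffs, fill=np.nan):
    """Build an n x m float array from keys like 'X0', 'Y3'.

    Rows follow the sorted prefix letters; columns follow the integer part.
    Cells with no matching key keep the fill value.
    """
    letters = sorted({k[0] for k in coeffs})
    row_of = {letter: i for i, letter in enumerate(letters)}
    n_cols = max(int(k[1:]) for k in coeffs) + 1   # indices start at 0
    out = np.full((len(letters), n_cols), fill, dtype=float)
    for key, value in coeffs.items():
        out[row_of[key[0]], int(key[1:])] = value
    return out

a = {'X0': 15, 'X1': 16, 'X2': 17, 'Y0': 14, 'Y2': 16}   # 'Y1' is missing
X = dict_to_2d(a)
print(X)
```

Writing each value straight into its cell avoids the intermediate nested dictionary, at the cost of parsing each key twice.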

Old answer

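coeffs.values() holds the dict's values (a list in Python 2, a view object in Python 3). Wrap it in list() so that NumPy sees a sequence, and create the array with

np.array(list(coeffs.values()))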
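The values come out in whatever order the dict iterates in, which may not match the numbering in the keys. One way to pin the order down is sketched here, under the assumption that each key is a single letter followed by an integer; the sample dict is made up.

```python
import numpy as np

coeffs = {'X10': 1.0, 'X2': 0.2, 'X1': 0.1}

# Plain sorted() is lexicographic, so 'X10' would come before 'X2'.
# Sorting on the integer part keeps the numeric order instead.
keys = sorted(coeffs, key=lambda k: int(k[1:]))
values = np.array([coeffs[k] for k in keys]).reshape(1, -1)   # 1 x n
print(keys)
print(values.shape)
```

The reshape(1, -1) call turns the flat array into a single row, which matches the 1 x n shape asked for in the question.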

In general, when you have an object like coeffs, you can type

help(coeffs)

in the interpreter, to get a list of all it can do.


answered Nov 11 '15 at 18:49

serv-inc


Tnx...it seems obvious. Maybe thats why someone downvoted my question? – GPB Nov 11 '15 at 18:51

What's not so obvious is how the values get ordered in the array. Creating an array from the dict's values above seems to scramble the order of the dict. How is this done? – GPB Nov 11 '15 at 19:13
@GPB: there is no guaranteed order (except for the order always being the same if you do not alter the dict: http://stackoverflow.com/questions/835092/python-dictionary-are-keys-and-values-always-the-same-order). If you need insertion order, use a https://docs.python.org/2/library/collections.html#collections.OrderedDict – serv-inc Nov 11 '15 at 19:20
@GPB: (re downvote) sometimes they are random. What might help is 1) showing research effort (saying what you tried ) 2) your code was edited (proposed by me) to follow naming conventions. that sometimes helps, too 3) 1 upvote is "worth" some downvotes. – serv-inc Nov 11 '15 at 19:22
@GPB: and could you say what you meant by an m x n array? – serv-inc Nov 11 '15 at 19:27

- re: 2) "Showing Research Effort" is a tough one to prove, eh? How long was i searching for this? A good 30 mins. I guess that comment implies I'm thicker than most. Not sure what editing was done. – GPB Nov 11 '15 at 19:37
@GPB: suggested edit is minor, here: http://stackoverflow.com/review/suggested-edits/10172926. We all started at some point. It's similar to learning a language. And just say that you searched and how. – serv-inc Nov 11 '15 at 19:41
It's safer to index a 2d array with a single bracket syntax: `X[i,j] = b[row][j]`. The `[][]` works here, but depends on the 1st [] creating a view as opposed to a copy. – hpaulj Nov 12 '15 at 19:08
