Following on from this question:

Unexpectedly large array created with numpy.ones when setting names

When I multiply

a = np.ones([len(sectors),len(columns)])
a[0,:] *= [1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8] 

It works fine.

When I try

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn", 
            "Attrib", "Select", "Inter", "Total"]
a = np.ones((10,), dtype={"names":columns, "formats":["f8"]*len(columns)})
a[0] *= [1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8]

I get the error

TypeError: cannot convert to an int; scalar object is not a number

I would like to use field-names if possible. What am I doing wrong here?

Many thanks.

Tahnoon Pasha
  • Shouldn't the first parameter for ones, shape, have two elements? You are just creating a 1D array here, and a[0] will be a number. – Ferdinand Beyer Nov 23 '14 at 09:23
  • Hi @FerdinandBeyer - using two elements caused unusual array results per my previous question, hence the approach I've taken is based on the response to that question. – Tahnoon Pasha Nov 23 '14 at 10:34

2 Answers

An element (row) of this a can be modified by assigning it a tuple. Since lists convert easily to and from arrays, we can write:

In [162]: a = np.ones((10,), dtype={"names":columns, "formats":["f8"]*len(columns)})

In [163]: x=[1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8]

In [164]: a[0]=tuple(np.array(x)*list(a[0]))

In [165]: a
Out[165]: 
array([(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8),
       ...], dtype=[('Port Wt', '<f8'), ('Bench Wt', '<f8'),...

More generally you could write

a[i] = tuple(foo(list(a[i])))

Multiple values ('rows') of a can be set with a list of tuples.
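
As a small sketch of the list-of-tuples form (the slice a[:2] and the name new_rows are illustrative choices of mine, not something from the question), two rows can be scaled in a single assignment:

```python
import numpy as np

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn",
           "Attrib", "Select", "Inter", "Total"]
a = np.ones((10,), dtype={"names": columns, "formats": ["f8"] * len(columns)})
x = np.array([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8])

# Build one tuple per row, then assign the whole list to a slice of a.
# new_rows is a hypothetical name used only for this illustration.
new_rows = [tuple(x * list(a[i])) for i in range(2)]
a[:2] = new_rows
```

Rows 0 and 1 now hold the scaled values, and the remaining rows are left untouched.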

An earlier SO structured array question (https://stackoverflow.com/a/26183332/901925) suggests another solution: create a partner 2d array that shares the same data buffer.

In [311]: a1 = np.empty((10,8))  # conventional 2d array

In [312]: a1.data = a.data   # share a's data buffer

In [313]: a1[0] *= x   # do math on a1

In [314]: a1
Out[314]: 
array([[ 1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8],
       ...
       [ 1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ]])

In [315]: a
Out[315]: 
array([(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8),
       ...
       (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)], 
      dtype=[('Port Wt', '<f8'), ('Bench Wt', '<f8'), ('Port Retn', '<f8'), ('Bench Retn', '<f8'), ('Attrib', '<f8'), ('Select', '<f8'), ('Inter', '<f8'), ('Total', '<f8')])

By sharing the data buffer, changes made to a1 affect a as well.
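
Assigning to .data directly is something I would treat with care, since newer NumPy releases may warn about it or reject it. A sketch of the same idea using view instead, assuming every field is 'f8' and the dtype has no padding between fields:

```python
import numpy as np

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn",
           "Attrib", "Select", "Inter", "Total"]
a = np.ones((10,), dtype={"names": columns, "formats": ["f8"] * len(columns)})
x = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8]

# Reinterpret the 10 records of 8 float64 fields as a (10, 8) float array.
# This only works because all fields share one type and the dtype is packed.
a1 = a.view(np.float64).reshape(len(a), len(columns))

# In-place math on a1 writes through to a, because both use the same memory.
a1[0] *= x
```

Here a1 is a view, not a copy, so a[0] reflects the multiplication.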

It might be better to view 2d a1 as the primary array, and a as a structured view. a could be constructed on the fly, as needed to display the data, access columns by name, or write to a csv file.
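
A minimal sketch of that reversed arrangement, again assuming all fields are 'f8' (the names dt and a1 are only for illustration):

```python
import numpy as np

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn",
           "Attrib", "Select", "Inter", "Total"]
dt = np.dtype({"names": columns, "formats": ["f8"] * len(columns)})

# The plain 2d array is the primary object; do the arithmetic here.
a1 = np.ones((10, len(columns)))
a1[0] *= [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8]

# Build the structured view only when named access is wanted.
# a1.view(dt) has shape (10, 1); reshape drops the trailing axis.
a = a1.view(dt).reshape(len(a1))
select = a["Select"]  # column access by name, still sharing memory with a1
```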

hpaulj

The rows of your array a are not NumPy arrays; the closest thing to them is probably a tuple:

>>> import numpy as np
>>> columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn", 
...            "Attrib", "Select", "Inter", "Total"]
>>> a = np.ones((10,), dtype={"names":columns, "formats":["f8"]*len(columns)})
>>> type(a[0,0])
IndexError: too many indices
>>> type(a[0][0])
numpy.float64
>>> type(a[0])
numpy.void
>>> 

By contrast, the columns of a are ndarrays, and you can multiply them by a list of floats of the correct length (not the number of columns but the number of rows):

>>> type(a['Select'])
numpy.ndarray
>>> a['Select']*[1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-fc8dc4596098> in <module>()
----> 1 a['Select']*[1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8]

ValueError: operands could not be broadcast together with shapes (10,) (8,) 
>>> a['Select']*[1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,       0,0]
array([ 1.1,  1.2,  1.3,  1.4,  1.5,  1.6,  1.7,  1.8,  0. ,  0. ])
>>> 
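
The multiplication above returns a new array and leaves a unchanged. If the goal is to update the column itself, an in-place operation on the field should do it, since a['Select'] is a view into a. A short sketch (the factors values are made up for illustration):

```python
import numpy as np

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn",
           "Attrib", "Select", "Inter", "Total"]
a = np.ones((10,), dtype={"names": columns, "formats": ["f8"] * len(columns)})

# One factor per row (10 of them), not one per column.
factors = np.linspace(1.0, 1.9, len(a))

# a["Select"] is a view of the field, so *= modifies a directly.
a["Select"] *= factors
```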

Edit

In response to a comment from OP: «is it not possible to apply a function to a row in a named array of fields (or tuple) in numpy?»

The only way that I know of is

>>> a[0] = tuple(b*a[c][0] for b, c in zip([1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8],columns))
>>> print(a)
[(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
 (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)]
>>> 

but I'm not the most skilled numpy expert around... maybe one of the least skilled indeed
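
For what it's worth, the same row update can be wrapped in a small helper. This is only a sketch: apply_to_row is a hypothetical function of my own, not part of numpy, and it assumes every field is numeric.

```python
import numpy as np

def apply_to_row(arr, i, func):
    """Replace row i of a structured array with func applied to its values.

    Hypothetical helper: func receives the row as a 1-D float array and
    must return a sequence with one value per field.
    """
    values = np.array(list(arr[i]), dtype=float)
    arr[i] = tuple(func(values))

columns = ["Port Wt", "Bench Wt", "Port Retn", "Bench Retn",
           "Attrib", "Select", "Inter", "Total"]
a = np.ones((10,), dtype={"names": columns, "formats": ["f8"] * len(columns)})
x = np.array([1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8])

# Scale row 0 element-wise by x; other rows are untouched.
apply_to_row(a, 0, lambda v: v * x)
```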

gboffi
  • Thanks @gboffi, does this mean that it's not possible to apply a function to a row in a named array of fields (or tuple) in numpy? Or is there another way to do what I'm trying to do? – Tahnoon Pasha Nov 23 '14 at 10:32
  • @TahnoonPasha I've edited my answer to tell you what I know is possible. I cannot exclude that other methods, both more efficient and more elegant, are possible, but what I have added to my answer is the best that I know of. IMHO when you have a named ndarray you work with named columns, which contain homogeneous values. If you want to modify a row of your data (typically an _event_) you should do it before creating the named array. – gboffi Nov 23 '14 at 13:34