I want to create an array with 3 columns: the first a string, the other two integers used for calculations. More rows will then be added with the append function (below).

The problem is that all columns seem to be coded as strings rather than just the first one. How do I get the correct data type for the numbers?

import numpy as np

a = np.array([["String", 1, 2]])
a = np.append(a, [["another string", 3, 4]], axis=0)
JLX

2 Answers

To hold mixed data types like this, we could use object as the dtype before appending or stacking -

import numpy as np

a = np.array([["String", 1, 2]], dtype=object)
b = [["another string", 3, 4]]
a = np.vstack((a, np.asarray(b, dtype=object)))

Sample run -

In [40]: a = np.array([["String",1,2]], dtype=object)

In [41]: b = [["another string", 3, 4]]

In [42]: np.vstack((a,np.asarray(b,object)))
Out[42]: 
array([['String', 1, 2],
       ['another string', 3, 4]], dtype=object)
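
To make the difference concrete, here is a minimal sketch contrasting the default behaviour from the question with the object dtype approach. The variable names `plain`, `mixed` and `total` are only illustrative.

```python
import numpy as np

# Without an explicit dtype, NumPy converts every element to a common string type.
plain = np.array([["String", 1, 2]])
print(plain.dtype.kind)      # 'U' means a unicode string dtype
print(type(plain[0, 1]))     # a NumPy string scalar, not an integer

# With dtype=object, each cell keeps its original Python type.
mixed = np.array([["String", 1, 2]], dtype=object)
mixed = np.vstack((mixed, np.asarray([["another string", 3, 4]], dtype=object)))
print(type(mixed[1, 1]))     # a plain Python int

# Integer cells can be used in arithmetic directly.
total = mixed[0, 1] + mixed[1, 2]
print(total)
```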
Divakar

When collecting values iteratively, it is usually best to collect them in a list and make the array afterwards. For example, making a list with your data:

In [371]: alist = [("String", 1, 2)]
In [372]: alist.append(("another string", 3, 4))
In [373]: alist
Out[373]: [('String', 1, 2), ('another string', 3, 4)]

For many purposes that list is useful on its own, e.g. alist[0] to get a row, or [i[0] for i in alist] to get the first column.
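
If only the numbers are needed for calculations, one possible approach (a sketch, not the only way) is to split the list into a plain list of labels and a purely numeric array. The names `labels`, `numbers` and `row_sums` are made up for illustration.

```python
import numpy as np

# Collect rows in an ordinary list first.
alist = [("String", 1, 2)]
alist.append(("another string", 3, 4))

# Keep the strings in a list and put only the numbers in an array.
labels = [row[0] for row in alist]
numbers = np.array([row[1:] for row in alist])   # shape (2, 2), integer dtype

# The numeric part now supports ordinary vectorized calculations.
row_sums = numbers.sum(axis=1)
print(labels)
print(row_sums)
```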

To make an array, one option is a structured array. Because I collected the values as a list of tuples, I can do:

In [374]: np.array(alist, dtype='U20,int,int')
Out[374]: 
array([('String', 1, 2), ('another string', 3, 4)], 
      dtype=[('f0', '<U20'), ('f1', '<i4'), ('f2', '<i4')])
In [375]: _['f1']
Out[375]: array([1, 3])

We access fields of such an array by field name. The array itself is 1d, with shape (2,).
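
The default field names f0, f1, f2 can be replaced with descriptive ones by spelling out the dtype. A small sketch, assuming the hypothetical field names `label`, `x` and `y`:

```python
import numpy as np

alist = [("String", 1, 2), ("another string", 3, 4)]

# Field names here are illustrative; pick whatever fits the data.
dt = np.dtype([("label", "U20"), ("x", int), ("y", int)])
arr = np.array(alist, dtype=dt)

# Each field is a regular array of its own type, so integer math works.
sums = arr["x"] + arr["y"]
print(arr.shape)        # the array is 1d, one record per row
print(sums)
print(arr[0]["label"])  # index a record, then a field
```

The list of tuples matters here: NumPy treats each tuple as one record when building a structured array.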

If instead we make an object dtype array:

In [376]: np.array(alist, dtype=object)
Out[376]: 
array([['String', 1, 2],
       ['another string', 3, 4]], dtype=object)
In [377]: _.shape
Out[377]: (2, 3)
In [378]: __[:,1]
Out[378]: array([1, 3], dtype=object)

With this we can access rows and columns. But beware that we don't get the fast numpy calculation benefits with an object array, especially one with mixed types.
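
One way to get numeric speed back, sketched here under the assumption that columns 1 and 2 hold only integers, is to cast the numeric columns to an integer dtype before calculating:

```python
import numpy as np

a = np.array([("String", 1, 2), ("another string", 3, 4)], dtype=object)

# Arithmetic on object columns works, but runs element by element in Python
# and the result is still an object array.
obj_sum = a[:, 1] + a[:, 2]
print(obj_sum.dtype)

# Casting the numeric columns gives a regular integer array.
nums = a[:, 1:].astype(int)
fast_sum = nums.sum(axis=1)
print(fast_sum)
```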

hpaulj