Numpy: genfromtxt forming tuples

Question

Here is my menu.csv:

Item,Price
Curry Rice,3.5
Pork Chop,6
Seafood Soup,5
Salad,2.8

Here is my code:

import numpy as np
menu_items = np.genfromtxt("menu.csv", delimiter=',',names=True)
print(menu_items)

What I get:

[(nan, 3.5) (nan, 6.2) (nan, 3. ) (nan, 2.8)]

When I use dtype=None:

[(b'Curry Rice', 3.5) (b'Pork Chop', 6.2) (b'Seafood Soup', 3. )
 (b'Salad', 2.8)]

What I want:

[(Curry Rice, 3.5) (Pork Chop, 6.2) (Seafood Soup, 3. ) (Salad, 2.8)]

Any help is appreciated

Possible duplicate of [How to use numpy.genfromtxt when first column is string and the remaining columns are numbers?](https://stackoverflow.com/questions/12319969/how-to-use-numpy-genfromtxt-when-first-column-is-string-and-the-remaining-column) — Ari Cooper-Davis, Nov 10 '19 at 15:23
The 'b' indicates bytestring values, a `S` dtype. In Py3 `unicode` is the standard string type. Try adding `encoding=None` to your `genfromtxt` call — hpaulj, Nov 10 '19 at 16:27

hpaulj · Answer 1 · 2019-11-10T21:04:44.977

With your sample file:

In [349]: cat stack58789967.txt
Item,Price
Curry Rice,3.5
Pork Chop,6
Seafood Soup,5
Salad,2.8

In [350]: np.genfromtxt('stack58789967.txt', delimiter=',', names=True, dtype=None)
/usr/local/bin/ipython3:1: VisibleDeprecationWarning: Reading unicode 
   strings without specifying the encoding argument is deprecated. Set the 
   encoding, use None for the system default.
  #!/usr/bin/python3
Out[350]: 
array([(b'Curry Rice', 3.5), (b'Pork Chop', 6. ), (b'Seafood Soup', 5. ),
       (b'Salad', 2.8)], dtype=[('Item', 'S12'), ('Price', '<f8')])

In [351]: np.genfromtxt('stack58789967.txt', delimiter=',', names=True, dtype=None, encoding=None)
Out[351]: 
array([('Curry Rice', 3.5), ('Pork Chop', 6. ), ('Seafood Soup', 5. ),
       ('Salad', 2.8)], dtype=[('Item', '<U12'), ('Price', '<f8')])

'S12' is a bytestring dtype, one byte per character; this was the Py2 norm, and it is why the values display with a b prefix. 'U12' is a unicode dtype, 4 bytes per character; this is the Py3 norm.
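
The size difference between the two dtypes can be checked directly. This is a minimal sketch; the two-element array is made up for illustration, and decoding as UTF-8 is an assumption about how the file was encoded.

```python
import numpy as np

# 12-character bytestring vs. 12-character unicode dtype
bytes_dt = np.dtype('S12')
unicode_dt = np.dtype('U12')
print(bytes_dt.itemsize)    # 12 (one byte per character)
print(unicode_dt.itemsize)  # 48 (four bytes per character)

# An already-loaded bytestring array can be decoded after the fact.
# UTF-8 is assumed here; use whatever encoding the file really has.
raw = np.array([b'Curry Rice', b'Salad'], dtype='S12')
decoded = np.char.decode(raw, 'utf-8')
print(decoded)
```

Decoding afterwards is only a fallback; passing `encoding=None` (or an explicit encoding) to `genfromtxt` avoids the bytestrings in the first place.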

The 'tuples' in the display are not a list of Python tuples; they mark the records of a structured array.

The array is 1d, and fields are accessed by name:

In [352]: _.shape
Out[352]: (4,)
In [353]: __['Item']
Out[353]: array(['Curry Rice', 'Pork Chop', 'Seafood Soup', 'Salad'], dtype='<U12')
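
For a self-contained version of the same idea, here is a sketch that reads the sample data from an in-memory string instead of a file (the `csv_text` and variable names are just for illustration) and shows a few common ways to work with the resulting structured array:

```python
import io
import numpy as np

# Same content as menu.csv, held in memory so the example runs anywhere
csv_text = "Item,Price\nCurry Rice,3.5\nPork Chop,6\nSeafood Soup,5\nSalad,2.8\n"
menu = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                     names=True, dtype=None, encoding=None)

print(menu.shape)        # 1d: one record per data row
print(menu.dtype.names)  # field names taken from the header line

# A whole column, selected by field name
print(menu['Price'])

# Records filtered with a boolean mask built from one field
cheap = menu[menu['Price'] < 4]
print(cheap['Item'])

# Plain Python tuples, if that is what the rest of the code expects
rows = menu.tolist()
for item, price in rows:
    print(f"{item}: {price:.2f}")
```

`tolist()` is the step that turns the records into ordinary Python tuples with `str` and `float` members; until then the data is still a NumPy structured array.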

nicdelillo · Answer 2 · 2019-11-10T15:57:59.550

Welcome!

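I think your question looks very similar to [How to use numpy.genfromtxt when first column is string and the remaining columns are numbers?](https://stackoverflow.com/questions/12319969/how-to-use-numpy-genfromtxt-when-first-column-is-string-and-the-remaining-column), which has been answered extensively. Have a look there, and also check the dtype option for np.genfromtxt in the NumPy documentation.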

edited Nov 10 '19 at 15:57

answered Nov 10 '19 at 15:25

score 0 · Answer 3 · answered Nov 10 '19 at 15:25

By default, numpy.genfromtxt() assumes that every column holds floats, so text that cannot be parsed as a number becomes nan. You can ask it to infer the data type of each column instead by passing the keyword argument dtype=None.

menu_items = np.genfromtxt("menu.csv", delimiter=',', names=True, dtype=None)
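
The difference between the default and `dtype=None` can be seen side by side. This is a small sketch that uses an in-memory copy of the sample file (the `csv_text`, `as_float` and `inferred` names are illustrative); `encoding=None` is added so that the strings come back as `str` and not as bytestrings.

```python
import io
import numpy as np

csv_text = "Item,Price\nCurry Rice,3.5\nPork Chop,6\nSeafood Soup,5\nSalad,2.8\n"

# Default: every column is parsed as float, so the item names become nan
as_float = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)
print(as_float['Item'])

# dtype=None: the type of each column is inferred from its contents
inferred = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                         names=True, dtype=None, encoding=None)
print(inferred.dtype)
print(inferred['Item'])
```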

Ari Cooper-Davis

Hi, when I use dtype=None, I have b' in front of the menu items (I edited the page for clarification) – C_WJ Nov 10 '19 at 15:28
