I am reading a CSV file like this:

import numpy as np
features = np.genfromtxt('train.csv', delimiter=',', usecols=(1, 2))

It outputs data as:

[[-1. -1.] [ 1. -1.] [-1. 1.] [ 1. -1.]]

Notice the dot after each 1 and -1.

train.csv

0,-1,-1
-1,1,-1
0,-1,1
1,1,-1
-1,1,-1
0,1,-1
Volatil3
  • Can you show a record from `train.csv`? – Jan Sep 20 '16 at 11:22
  • There has to be a value somewhere in there that makes numpy convert to floats. – Ma0 Sep 20 '16 at 11:25
  • @Jan Question updated – Volatil3 Sep 20 '16 at 11:26
  • @Ev.Kounis I tried this and it seems to have worked: `features = np.genfromtxt('train.csv',delimiter=',',usecols=(1,2),dtype=int)` – Volatil3 Sep 20 '16 at 11:26
  • So, your problem is now solved? Also see this: http://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy. Apparently, casting to a numpy array at the end is more efficient. – Ma0 Sep 20 '16 at 11:28
  • @Volatil3 Of course, if you set the dtype to int, the floats will go away; see http://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html#choosing-the-data-type – user69453 Sep 20 '16 at 11:29
  • In the link provided by @user69453 it says: **"Note that dtype=float is the default for genfromtxt."** – Ma0 Sep 20 '16 at 11:30

1 Answer

As stated in the comments, np.genfromtxt converts your data to floats by default (see the default dtype argument in the function signature). To force integer output, specify dtype=int in genfromtxt. The np.int alias sometimes seen in older code was just the builtin int and has been removed from recent NumPy releases, so plain int is the safer spelling:

features = np.genfromtxt('train.csv', delimiter=',', usecols=(1, 2), dtype=int)
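
To see the difference without touching the file on disk, here is a minimal sketch that feeds the rows from the question through an in-memory buffer. The names csv_text, as_float, as_int and converted are made up for illustration, and io.StringIO merely stands in for train.csv.

```python
import io
import numpy as np

# Same rows as train.csv in the question, kept in memory so the
# sketch is self-contained (assumption: the file has no header row).
csv_text = """0,-1,-1
-1,1,-1
0,-1,1
1,1,-1
-1,1,-1
0,1,-1
"""

# Default behaviour: every field is parsed as a float.
as_float = np.genfromtxt(io.StringIO(csv_text), delimiter=',', usecols=(1, 2))

# Ask for integers at parse time.
as_int = np.genfromtxt(io.StringIO(csv_text), delimiter=',', usecols=(1, 2), dtype=int)

# Alternative: parse as float, then convert afterwards.
converted = as_float.astype(int)

print(as_float.dtype)  # a floating-point dtype
print(as_int.dtype)    # a platform-dependent integer dtype
print(as_int)
```

Both routes give the same integer array for this data; passing dtype=int simply avoids the intermediate float array.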
jadsq