Say I have a CSV file, file.csv, in this format:

dfaefew,432,1
vzcxvvz,300,1
ewrwefd,432,0

How can I import the second column as one NumPy array and the third column as another, like this:

second = np.array([432, 300, 432])
third = np.array([1, 1, 0])
mkrieger1
user3692521

2 Answers

numpy.genfromtxt() works well here:

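import numpy as np

# The text in the first column cannot be parsed as a float, so it is read as nan.
data = np.genfromtxt('file.csv', delimiter=',')
second = data[:, 1]  # second column, as floats
third = data[:, 2]   # third column, as floats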

>>> second
array([ 432.,  300.,  432.])

>>> third
array([ 1.,  1.,  0.])
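
The arrays above come back as floats, while the question asks for integers. A minimal sketch of one way to handle that, assuming the numeric columns contain only whole numbers; io.StringIO is used here as a stand-in for file.csv so the example runs on its own:

```python
import io
import numpy as np

# In-memory stand-in for file.csv (an assumption for illustration only).
text = "dfaefew,432,1\nvzcxvvz,300,1\newrwefd,432,0\n"

data = np.genfromtxt(io.StringIO(text), delimiter=",")

# Column 0 holds text, which cannot be parsed as a float, so every entry is nan.
first_is_nan = np.isnan(data[:, 0]).all()

# Slice the numeric columns and cast them to integers.
second = data[:, 1].astype(int)
third = data[:, 2].astype(int)
```

Casting after the fact is only safe when no value in those columns is missing, since nan cannot be represented as an integer.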
Anoop
    genfromtxt works better than loadtxt in my use case, and I had to add dtype=None because the data I was reading had a mix of data types. Just FYI. – Nikhil Gupta Sep 21 '19 at 10:59

You can use numpy.loadtxt:

In [15]: !cat data.csv
dfaefew,432,1
vzcxvvz,300,1
ewrwefd,432,0

In [16]: second, third = np.loadtxt('data.csv', delimiter=',', usecols=(1,2), unpack=True, dtype=int)

In [17]: second
Out[17]: array([432, 300, 432])

In [18]: third
Out[18]: array([1, 1, 0])
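
To illustrate what usecols and unpack contribute to the loadtxt call, here is a small sketch. It reads from an io.StringIO object instead of data.csv, which is an assumption made only to keep the example self-contained:

```python
import io
import numpy as np

# In-memory stand-in for data.csv (an assumption for illustration only).
text = "dfaefew,432,1\nvzcxvvz,300,1\newrwefd,432,0\n"

# usecols=(1, 2) skips the text column, so dtype=int applies cleanly.
# Without unpack, the result has one row per line of the file.
table = np.loadtxt(io.StringIO(text), delimiter=",", usecols=(1, 2), dtype=int)

# With unpack=True the result is transposed, so each selected column
# can be assigned to its own name.
second, third = np.loadtxt(
    io.StringIO(text), delimiter=",", usecols=(1, 2), dtype=int, unpack=True
)
```

Here table has shape (3, 2), while second and third are each one-dimensional arrays of length 3.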

Or numpy.genfromtxt:

In [19]: second, third = np.genfromtxt('data.csv', delimiter=',', usecols=(1,2), unpack=True, dtype=None)

The only change in the arguments is that I used dtype=None, which tells genfromtxt to infer the data type from the values that it finds in the file.
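
As a rough sketch of that inference when the text column is read too: with dtype=None and columns of different types, genfromtxt returns a structured array whose fields get the default names f0, f1, f2. The example assumes a NumPy version that accepts the encoding argument, and uses io.StringIO in place of data.csv:

```python
import io
import numpy as np

# In-memory stand-in for data.csv (an assumption for illustration only).
text = "dfaefew,432,1\nvzcxvvz,300,1\newrwefd,432,0\n"

# dtype=None makes genfromtxt pick a type per column;
# encoding="utf-8" makes the text column come back as str rather than bytes.
records = np.genfromtxt(io.StringIO(text), delimiter=",", dtype=None, encoding="utf-8")

# Each column is a named field of the structured array.
labels = records["f0"]  # text column
second = records["f1"]  # inferred as an integer column
third = records["f2"]   # inferred as an integer column
```

Selecting fields by name avoids the nan values that appear when the text column is forced into a float array.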

Warren Weckesser