-1

I am using Python to read a CSV file. The file has two columns separated by ','. I can only read one column: when I try to build a list for the second column, I get an empty list.

My code is as follows:

import csv
with open('exdata1.csv') as inputData:
    data = csv.reader(inputData, delimiter=',')
    xVal = [row[0] for row in data]
    yVal = [row[1] for row in data]

The output is like this:

>>> xVal
['6.1101', '5.5277', '8.5186', '7.0032', '5.8598', '8.3829', '7.4764', '8.5781', '6.4862', '5.0546', '5.7107', '14.164', '5.734', '8.4084', '5.6407', '5.3794', '6.3654', '5.1301', '6.4296', '7.0708', '6.1891', '20.27', '5.4901', '6.3261', '5.5649', '18.945', '12.828', '10.957', '13.176', '22.203', '5.2524', '6.5894', '9.2482', '5.8918', '8.2111', '7.9334', '8.0959', '5.6063', '12.836', '6.3534', '5.4069', '6.8825', '11.708', '5.7737', '7.8247', '7.0931', '5.0702', '5.8014', '11.7', '5.5416', '7.5402', '5.3077', '7.4239', '7.6031', '6.3328', '6.3589', '6.2742', '5.6397', '9.3102', '9.4536', '8.8254', '5.1793', '21.279', '14.908', '18.959', '7.2182', '8.2951', '10.236', '5.4994', '20.341', '10.136', '7.3345', '6.0062', '7.2259', '5.0269', '6.5479', '7.5386', '5.0365', '10.274', '5.1077', '5.7292', '5.1884', '6.3557', '9.7687', '6.5159', '8.5172', '9.1802', '6.002', '5.5204', '5.0594', '5.7077', '7.6366', '5.8707', '5.3054', '8.2934', '13.394', '5.4369']
>>> yVal
[]

I am not sure what exactly to google for this issue. Any tutorial which deals with this?

Martijn Pieters
Morpheus

3 Answers

3

csv.reader() objects read data from the underlying file object, and file objects have file positions that move from start to end as you read. If you want to read again you need to rewind the file pointer to the start:

xVal = [row[0] for row in data]
inputData.seek(0)
yVal = [row[1] for row in data]
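The exhaustion behaviour is easy to reproduce without a file on disk. The sketch below uses `io.StringIO` with made-up sample rows (an assumption standing in for `exdata1.csv`) to show that a second pass over the reader yields nothing until the underlying file object is rewound:

```python
import csv
import io

# Hypothetical in-memory stand-in for exdata1.csv (made-up values).
sample = io.StringIO("6.1101,17.592\n5.5277,9.1302\n8.5186,13.662\n")

data = csv.reader(sample, delimiter=",")

x_val = [row[0] for row in data]    # consumes every row
y_empty = [row[1] for row in data]  # reader is exhausted: nothing left to read

sample.seek(0)                      # rewind the underlying file object
y_val = [row[1] for row in data]    # the same reader starts from the top again

print(x_val)
print(y_empty)
print(y_val)
```

The middle list comes back empty for the same reason `yVal` did in the question: the position in the file was already at the end.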

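However, you'd be better off reading just once and transposing the rows to columns (on Python 3, wrap the `zip()` call in `list()` before slicing, because `zip()` returns an iterator there):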

xVal, yVal = zip(*data)[:2]
Martijn Pieters
  • could you please explain `xVal, yVal = zip(*data)[:2]` ? – Morpheus Jan 02 '16 at 22:59
  • @Morpheus: I just edited in a link to a post that explains this. Basically, you apply all the rows as separate arguments to the `zip()` function, which then pairs up the first value of each row as one list, the second value of each row as another, etc. – Martijn Pieters Jan 02 '16 at 23:00
  • @Morpheus: the resulting list of columns is then sliced to only take the first two columns (in case you have more than 2) and assigns those two columns to the variables `xVal` and `yVal`. – Martijn Pieters Jan 02 '16 at 23:00
  • Thanks a lot! Just saw the link. – Morpheus Jan 02 '16 at 23:00
  • You could expand this out to `columns = zip(*data)`, then `xVal = columns[0]` and `yVal = columns[1]`. But why use 3 lines when you have tuple assignment.. – Martijn Pieters Jan 02 '16 at 23:02
  • Suppose I have 4 columns, then the [:2] will become [:4]? Are we saying that assign first column to xVal, assign second column to yVal till you reach index 3? – Morpheus Jan 02 '16 at 23:06
  • @Morpheus: yes, you can slice it to more columns, so `[:4]` gives you the first four and you can assign those to four different variables. *Or* you could just use a single `columns` variable and refer to `columns[0]` and `columns[1]`, etc. – Martijn Pieters Jan 02 '16 at 23:07
  • I am getting a type error when I try `xVal = columns[0]`: `TypeError: 'zip' object is not subscriptable`. I think `zip` returns a different kind of object in Python 3. – Morpheus Jan 02 '16 at 23:16
  • Hey! You have to convert it into a list. So the final form is `xVal, yVal = list(zip(*data))[:2]` – Morpheus Jan 02 '16 at 23:23
  • @Morpheus: right, yes, `zip()` in Python 3 returns an iterator, not a list. Use `xVal = next(columns); yVal = next(columns)`, or use `columns = list(zip(*data))`, or use an `itertools.islice()` object to do the slicing. – Martijn Pieters Jan 02 '16 at 23:23
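The three Python 3 options mentioned in the last comment can be sketched side by side. This is only an illustration: the three-column sample text and the `read_rows()` helper are hypothetical, introduced so that each option gets a fresh reader.

```python
import csv
import io
from itertools import islice

# Hypothetical three-column sample (made-up values), held in memory.
text = "1,2,3\n4,5,6\n7,8,9\n"


def read_rows():
    """Return a fresh csv.reader over the sample text (illustrative helper)."""
    return csv.reader(io.StringIO(text))


# Option 1: materialise all columns as a list, then slice the list.
x1, y1 = list(zip(*read_rows()))[:2]

# Option 2: pull columns one at a time from the zip iterator.
columns = zip(*read_rows())
x2 = next(columns)
y2 = next(columns)

# Option 3: take the first two columns with itertools.islice().
x3, y3 = islice(zip(*read_rows()), 2)

print(x1, y1)
```

In every case the columns come back as tuples of strings, so wrap them in `list()` or convert the values with `float()` if that is what the rest of the code expects.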
1

I would suggest using pandas to read the CSV as a DataFrame; you can then quickly access each column as a pandas Series, which is backed by a NumPy array. Code example:

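import pandas as pd

# The file has no header row, so supply the column names explicitly
df = pd.read_csv('exdata1.csv', header=None, names=['One', 'Two'])
print(df)

# Each column is a pandas Series; call .to_numpy() on it for a NumPy array
print(df.One)
print(df.Two)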
Severin Pappadeux
0

You can replace the line `data = csv.reader(inputData, delimiter=',')` in your code with `data = list(csv.reader(inputData, delimiter=','))`.

This creates a list of rows that you can reuse as many times as you want.
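A minimal sketch of that approach, again using made-up in-memory sample data in place of `exdata1.csv`. The `float()` conversion is an optional extra that the question did not ask for:

```python
import csv
import io

# Hypothetical in-memory stand-in for exdata1.csv (made-up values).
sample = io.StringIO("6.1101,17.592\n5.5277,9.1302\n")

# Read every row once and keep the result in a list.
rows = list(csv.reader(sample, delimiter=","))

# The list can be iterated any number of times, unlike the reader itself.
x_val = [float(row[0]) for row in rows]
y_val = [float(row[1]) for row in rows]

print(x_val)
print(y_val)
```

The trade-off is that the whole file is held in memory, which is fine for a small data set like this one.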

Sidahmed