My code is below:

file  = open('traintag1.csv', 'r')
csv_reader = csv.reader(file)
data = [x[-1] for x in csv_reader]
print len(data)
target = [x[-2] for x in csv_reader]
print len(target)

The result is len(data) = 430, which is correct, but len(target) = 0, although it should also be 430, the same as len(data). Why are the lengths different?

Also, is there any way to read a CSV file by column?

The file contains data like this:

7765,1256,http://hshihwih.com,0
12453,18978,http://shjhjkshd.com,1
Stephen Rauch
L.Elizabeth

3 Answers

You can try something like:

import csv

file = open('traintag1.csv', 'r')
csv_reader = csv.reader(file)
# Read each row once, keeping the last and second-to-last fields together
data, target = zip(*[(x[-1], x[-2]) for x in csv_reader])
print(len(data))
print(len(target))

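This code builds a list of (x[-1], x[-2]) pairs in a single pass over the reader, and then uses zip(*...) to transpose those pairs into two independent sequences. Note that zip returns tuples rather than lists.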
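
The same transposing trick also covers the "read by column" part of the question. Here is a minimal sketch, assuming Python 3 and using an in-memory io.StringIO as a stand-in for the real file (the two rows are the sample from the question; the names sample, rows and columns are only for illustration):

```python
import csv
import io

# Stand-in for open('traintag1.csv', newline=''); rows are the sample from the question.
sample = io.StringIO(
    "7765,1256,http://hshihwih.com,0\n"
    "12453,18978,http://shjhjkshd.com,1\n"
)

rows = list(csv.reader(sample))  # read every row exactly once
columns = list(zip(*rows))       # transpose: one tuple per column

data = columns[-1]               # last column
target = columns[-2]             # second-to-last column

print(len(data), len(target))
print(columns[0])
```

Keep in mind that zip stops at the shortest row, so this is only safe when every row has the same number of fields.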

Stephen Rauch

csv_reader is a single-pass iterator. Once you have iterated over it, it is exhausted, so the second list comprehension gets no values.

Try this simple code:

import csv

file  = open('traintag1.csv','r')
csv_reader = csv.reader(file)

target = []
data = []
for x in csv_reader:
    data.append(x[-1])
    target.append(x[-2])

print(len(data))
print(len(target))

In this code, both data and target are filled in a single loop over the reader.

Let me know if you have any questions.
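
If you would rather keep the two list comprehensions from the question, another option is to store the rows in a list first, since a list can be iterated as many times as you like. A rough sketch, assuming Python 3 and an in-memory stand-in for the file (sample and rows are illustrative names):

```python
import csv
import io

# Stand-in for the real file; rows are the sample from the question.
sample = io.StringIO(
    "7765,1256,http://hshihwih.com,0\n"
    "12453,18978,http://shjhjkshd.com,1\n"
)

rows = list(csv.reader(sample))      # consume the reader once, keep the rows

data = [row[-1] for row in rows]     # first pass over the list
target = [row[-2] for row in rows]   # second pass works, the list is reusable

print(len(data), len(target))
```

This holds the whole file in memory, which should be fine for a few hundred rows.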

Karthikeyan KR

Issue:

You are facing this problem because csv_reader is an iterator.

An iterator is an object that has a "next" method (next() in Python 2, __next__() in Python 3). When you execute csv_reader = csv.reader(file), it creates csv_reader as an iterator. Each call to csv_reader.next() gives you one row at a time. Once the rows run out, the reader cannot be restarted.

See the session below:

C:\Users\dinesh\Desktop>python
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import csv
>>> file  = open('a.csv','r')
>>> csv_reader = csv.reader(file)
>>>
>>> dir(csv_reader)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'dialect', 'line_num', 'next']
>>>
>>> csv_reader.next()
['7765', '1256', 'http://hshihwih.com', '0']
>>>
>>> csv_reader.next()
['12453', '18978', 'http://shjhjkshd.com', '1']
>>>
>>> csv_reader.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In your code, the first list comprehension (for data) consumes the whole iterator, so nothing is left when the second one (for target) runs, for the reason explained above.
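
The session above is Python 2. For reference, here is roughly how the same behaviour looks in Python 3, where the built-in next() is used instead of the .next() method. This is only a sketch, with an in-memory io.StringIO standing in for a.csv:

```python
import csv
import io

# Stand-in for open('a.csv', newline=''); rows are the sample from the question.
sample = io.StringIO(
    "7765,1256,http://hshihwih.com,0\n"
    "12453,18978,http://shjhjkshd.com,1\n"
)
reader = csv.reader(sample)

first = next(reader)    # first row
second = next(reader)   # second row

try:
    next(reader)        # no rows left
    exhausted = False
except StopIteration:
    exhausted = True

# Any further loop over the exhausted reader yields nothing.
leftover = [row[-2] for row in reader]

print(first, second, exhausted, leftover)
```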

Solution:

Collect both columns in a single loop, as below:

import csv

file  = open('a.csv','r')
csv_reader = csv.reader(file)
data = []
target = []
for x in csv_reader:
    data.append(x[-1])
    target.append(x[-2])

print(data)
print(len(data))
print(target)
print(len(target))
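
If you really do need two separate passes, one possible workaround is to rewind the underlying file and create a fresh reader, because it is the file position that has run to the end. A minimal sketch, assuming Python 3 and an in-memory stand-in for the file (with a real file you would call seek(0) on the object returned by open):

```python
import csv
import io

# Stand-in for the real file; rows are the sample from the question.
sample = io.StringIO(
    "7765,1256,http://hshihwih.com,0\n"
    "12453,18978,http://shjhjkshd.com,1\n"
)

data = [row[-1] for row in csv.reader(sample)]    # first pass reads to the end

sample.seek(0)                                    # rewind to the start of the file
target = [row[-2] for row in csv.reader(sample)]  # new reader, second pass

print(len(data), len(target))
```

This reads the file twice, so the single loop above is usually the simpler choice.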
Dinesh Pundkar