How can I work with a CSV file containing rows like these, read with the csv module,

['20150101','2','1']
['20150102','10','3']
['20150103','4','2']
['20150104','5','4']
['20150105','12','6']
....

using a basic for loop, len(), and range(), while keeping it in (or close to) its original form, without converting it to a list?

What I have tried: I called len() on the open file and on the csv reader, and got two different errors:

TypeError: object of type '_io.TextIOWrapper' has no len() # without using csv module
TypeError: object of type '_csv.reader' has no len() # using csv module

Even with a basic loop like:

for i, n in enumerate(r):
    print(r[n])

I get this error:

TypeError: '_csv.reader' object is not subscriptable
rook
Dorky
  • Have you read the [csv](https://docs.python.org/3/library/csv.html) documentation? A point to ponder, how can the program know the length of the csv file short of reading it in? Perhaps the operations that you are proposing for this object are not a natural fit to its structure. – Mike Satteson Feb 06 '15 at 14:31
  • It's possible to iterate a csv file (or a `csv.reader` object), but not access elements in one via a subscript or get its length with `len()`. The latter two could be done by reading the entire file into memory (turning it into a string). – martineau Feb 06 '15 at 14:33
  • Well, if csv file is not iterable, then how did those hedge fund companies iterate huge csv files for backtesting? – Dorky Feb 06 '15 at 14:36
  • I said they _are_ iterable. – martineau Feb 06 '15 at 14:37
  • @ martineau, but iterating it as a string would defeat the whole purpose of being as efficient as possible. If the file is massive, then it would be a massive string. What about the pandas module? I see that pandas has something called a DataFrame. Can a DataFrame iterate over csv files without turning them into a list or string? – Dorky Feb 06 '15 at 14:39
  • Do you really need or want to know in advance how many rows there are? If you want each row number as you go, `enumerate` will work fine on an iterator. What is the actual problem you're trying to solve? And what are your constraints - processing speed, memory, ...? – jonrsharpe Feb 06 '15 at 14:48
  • @ jonrsharpe, at the moment my one and only constraint is to iterate the csv file. The processing speed and memory is secondary, but that doesn't mean I should write a script that will take seconds just to make a few calculations. I don't know if I strictly need to know how many rows in advance. But I do know I need to do calculation that involves "looking back" at the previous row to do so, like "calculate data from current row with data from previous row" in which I will need row[current] and row[current-1]. – Dorky Feb 06 '15 at 15:08
  • Yes, I believe I could do it with a list if I tried, but I need to analyze more than 10 years of daily stock market prices, and when I opened the same csv file in Notepad++ it had over 2 million rows. So if I were desperate enough to use a list, I would end up with a massive list of over 2 million entries. I doubt that would be efficient. – Dorky Feb 06 '15 at 15:11
  • If you only ever need two consecutive rows, look at e.g. http://stackoverflow.com/questions/5434891/iterate-a-list-as-pair-current-next-in-python, which provides an efficient `itertools`-based way of doing this. Just because you don't know how to do it, doesn't mean Python can't! – jonrsharpe Feb 06 '15 at 17:00
  • @ jonrsharpe, I still harbor the hope that Python can. I hope I will not be disappointed that I have to achieve what I want to achieve with very verbose and inefficient scripting. That link you provided me does not work. Yes, I can do iteration if it is a list, but I can't do the same if it is a csv file, without turning it into a list. And I find it funny and weird that Python can turn some data into csv file, but cannot iterate from the same csv file. – Dorky Feb 07 '15 at 01:36
  • *"does not work"* isn't really enough information to help. *"cannot iterate from the same csv"* - not clear what you mean there, Python *can* iterate over a file, csv or otherwise. A list is a sequence, but you can iterate over iterables, too, like file handles. Also, don't leave a space after the @ if you want people to know you're responding. – jonrsharpe Feb 07 '15 at 09:02
  • @jonrsharpe Then how can I iterate a csv file (like those for historical stock prices and volume) with a for loop (or any other loop function that can iterate) that I can get the data of (i - 1) for i in file? I mean, whenever "for i in file" iterates, I don't want data from current i but data from previous i (or i - 1). How can I do that without using list? – Dorky Feb 07 '15 at 13:49
  • Have you actually *read* that previous question?! [This answer](http://stackoverflow.com/a/5434936/3001761) provides a function that *does exactly that*, from [the `itertools` recipes](https://docs.python.org/3/library/itertools.html#itertools-recipes). I don't understand why you still have a problem. – jonrsharpe Feb 07 '15 at 13:57

1 Answer

Hopefully this clarifies how it works. If your data is in a CSV file, csv.reader iterates over it one row at a time, and each row is a list of strings, so len() of a row is 3 here (three fields per row). The reader itself has no length, so to get the length of a value you need to index the field you want within a specific row. Any quote or bracket characters stored in the file count toward that length.

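import csv

# Open the file once; newline='' is what the csv docs recommend in Python 3.
with open('test.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)

    # Each row is a list of strings, one per field.
    for row in reader:
        var1 = row[0]
        var2 = row[1]
        var3 = row[2]
        print(len(var1), len(var2), len(var3))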

For the sample rows in the question, this prints:

11 3 4
11 4 4
11 3 4
11 3 4
11 4 4
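
For the "current row versus previous row" calculation discussed in the comments, the reader never needs to become a list. Below is a minimal sketch based on the `pairwise` recipe from the itertools documentation. It assumes the file holds plain comma-separated values (no brackets or quotes) and that the second column is the number to compare; `daily_changes` is a hypothetical helper name, and the `io.StringIO` object only stands in for a real open file.

```python
import csv
import io
from itertools import tee


def pairwise(iterable):
    """Lazily yield overlapping pairs: (r0, r1), (r1, r2), ..."""
    prev_it, curr_it = tee(iterable)
    next(curr_it, None)  # advance the second copy by one item
    return zip(prev_it, curr_it)


def daily_changes(lines):
    """Yield (date, change in the second column versus the previous row).

    Hypothetical helper: the column layout is an assumption.
    """
    reader = csv.reader(lines)
    for prev_row, row in pairwise(reader):
        yield row[0], int(row[1]) - int(prev_row[1])


# Stand-in for open('test.csv', newline=''); values taken from the question.
sample = io.StringIO("20150101,2,1\n20150102,10,3\n20150103,4,2\n")
for date, change in daily_changes(sample):
    print(date, change)
```

Only two rows are held in memory at any time, so this should scale to a file with millions of rows. If a row number is needed as well, wrapping the loop in `enumerate` works on any iterator, and Python 3.10 and later ship `itertools.pairwise` directly.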