You can't "split a file", but you can read it line by line, no matter how big it is. E.g.:

    import collections

    btcDatear = []
    btcPricear = []
    btcVolumear = []
    howfarback = 20000
    try:
        with open('.btceUSD.csv', 'r') as sourceCode:
            # a deque with a maximum length keeps only the last howfarback lines
            lastNlines = collections.deque(sourceCode, howfarback)
        for eachline in lastNlines:
            splitLine = eachline.split(',')
            btcDate = splitLine[0]
            btcPrice = splitLine[1]
            btcVolume = splitLine[2]
            btcDatear.append(float(btcDate))
            btcPricear.append(float(btcPrice))
            btcVolumear.append(float(btcVolume))
    except Exception as e:
        print("failed raw data: " + str(e))

Building a deque with a maximum length of howfarback is the best way to keep the last N lines of a file that you can only read line by line from the start. The with statement ensures the file is properly closed no matter what; the rest of the logic is the same as in your code. It would be better to use the standard library csv module, but one bit of learning at a time :-).
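To see why the bounded deque does the job, here is a tiny sketch that uses an in-memory io.StringIO as a stand-in for a big file (the ten fake lines and the name fake_file are made up for illustration; the snippet assumes Python 3):

```python
import collections
import io

# A stand-in for a large file: ten lines, read strictly front to back.
fake_file = io.StringIO("".join("line %d\n" % i for i in range(10)))

# With a maximum length of 3, each new line pushes the oldest one out,
# so at most three lines are held in memory at any time.
last_three = collections.deque(fake_file, 3)

print(list(last_three))  # ['line 7\n', 'line 8\n', 'line 9\n']
```

The whole file is still read once from the start, but memory use stays bounded by the maximum length rather than by the file size.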
There may be tricks (subtly exploiting the fact that the CSV file is likely to be seekable) to get "the last N lines" faster -- on Unixy systems, the tail system command is very good at that. If this straightforward approach is too slow for you, ask again and we'll discuss that :-) [and/or how the csv module is best used...]
Added: come to think of it, no need to belabor "tail tricks", as they're well explained at "Get last n lines of a file with Python, similar to tail" -- the question is by a Python guru, Armin Ronacher, so you can be pretty confident of the quality of his code, and the answers and long discussion are interesting.
So if this simple approach takes too long, study Armin's code and his respondents'... very tricky, but it can be truly useful.
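Just to give the flavor of the seek-based idea, here is a rough sketch. It is not Armin's code: the function name tail_lines and the block_size parameter are made up for illustration, and it assumes a seekable file with ordinary newline-terminated lines. It reads fixed-size blocks backwards from the end until it has seen enough newlines:

```python
import os

def tail_lines(path, n, block_size=4096):
    """Return the last n lines of the file at path, as bytes without line endings."""
    if n <= 0:
        return []
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b''
        # Read blocks backwards until more than n newlines are buffered
        # (which guarantees n complete lines) or the file start is reached.
        while position > 0 and data.count(b'\n') <= n:
            step = min(block_size, position)
            position -= step
            f.seek(position)
            data = f.read(step) + data
    return data.splitlines()[-n:]
```

A call such as tail_lines('.btceUSD.csv', howfarback) would return byte strings, so they would need decoding before being split and converted to floats. The gain over the deque approach is that only the end of the file is read.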
So we might as well focus on the use of the csv module, after an import csv at the start to be sure -- rewriting only the changing part...:

    for fields in csv.reader(iter(lastNlines)):
        btcDate, btcPrice, btcVolume = fields[:3]

with all the rest as before. csv.reader takes care of CSV parsing (you may not need the subtleties such as dealing with quoted/escaped commas, but you pay no extra there!-) and leaves your code more concise and elegant.
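Putting the pieces together, a self-contained version might look like the sketch below. The function name load_last_rows and the choice to skip rows with fewer than three fields are my own assumptions, not something from your code, and the newline='' argument assumes Python 3:

```python
import collections
import csv

def load_last_rows(path, howfarback):
    """Parse the last howfarback rows of a three-column CSV into lists of floats."""
    dates, prices, volumes = [], [], []
    # newline='' is what the csv docs suggest for files given to csv.reader.
    with open(path, 'r', newline='') as source:
        last_lines = collections.deque(source, howfarback)
    for fields in csv.reader(last_lines):
        if len(fields) < 3:
            continue  # assumption: silently skip blank or short rows
        date, price, volume = fields[:3]
        dates.append(float(date))
        prices.append(float(price))
        volumes.append(float(volume))
    return dates, prices, volumes
```

You would then call it as btcDatear, btcPricear, btcVolumear = load_last_rows('.btceUSD.csv', 20000), wrapping the call in try/except as before if you want to report failures.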