
I would like to iterate over every 1000 rows of a text file. I used to do something similar with a database: there I first wrote a new id for every 1000 rows and then iterated over it. Now I would like to do the same with the text file itself. Is there some pythonic way to do it? I have only come this far.

import pandas as pd

input_file = 'text.csv'
my_input = pd.read_csv(input_file, sep=';')
length = my_input.shape[0]
start = 0
end = 1000

# for the length of the whole document take the lines in range(start, end)
while start < length:
    chunk = my_input.iloc[start:end]
    # do stuff with chunk
    start += 1000
    end += 1000
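For reference, the database trick mentioned above translates directly to pandas. A minimal sketch (the block-id expression is an illustration, not code from the thread): label every 1000 rows with the same id via integer division and group on it.

import numpy as np
import pandas as pd

my_input = pd.read_csv('text.csv', sep=';')

# give rows 0-999 id 0, rows 1000-1999 id 1, etc., then iterate block by block
for block_id, chunk in my_input.groupby(np.arange(len(my_input)) // 1000):
    # do stuff with chunk
    ...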
  • First you need to decide if you wish to read the file as it is, read it as a csv file, or work with its dataframe representation. – DeepSpace Feb 09 '17 at 12:08
  • @DeepSpace I need some of the attributes of every line, so at some point I would need the dataframe, I guess. But maybe it is possible to first read the 1000 lines and then create a dataframe from which I can read the attributes? (See the sketch after this exchange.) – student Feb 09 '17 at 12:17
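A minimal sketch of that idea (not from the original thread): pandas' read_csv accepts a chunksize parameter, which makes it return an iterator of dataframes with that many rows each, so each 1000-line dataframe is only built after its lines are read.

import pandas as pd

# read the csv 1000 lines at a time; each chunk is a regular DataFrame
for chunk in pd.read_csv('text.csv', sep=';', chunksize=1000):
    for index, row in chunk.iterrows():
        # read the attributes of every line here
        ...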

1 Answer


It seems to work with the blaze library.

import blaze as bz
import pandas as pd

input_file = 'text.csv'
my_input = pd.read_csv(input_file, sep=';', names=['a', 'b', 'c'])

# odo splits the dataframe into chunks of 1000 rows each
for chunk in bz.odo(my_input, target=bz.chunks(pd.DataFrame), chunksize=1000):
    for index, row in chunk.iterrows():
        variable1 = row['a']
        variable2 = row['b']
        # do stuff
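If I read odo correctly, bz.chunks(pd.DataFrame) only names the target type, so each chunk in the loop is an ordinary pandas DataFrame and chunk.iterrows() yields the usual (index, Series) pairs, exactly as it would on the full frame.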