For me, the most straightforward way to solve this is with generators.
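def tokens(filename):
    # Yield every whitespace-separated integer in the file, one at a time.
    with open(filename) as infile:
        for line in infile:
            for item in line.split():
                yield int(item)

def ballots(tokens):
    # Collect tokens into a list; a 0 token ends the current ballot.
    ballot = []
    for t in tokens:
        if t:
            ballot.append(t)
        else:
            yield ballot
            ballot = []

t = tokens("datafile.txt")
for b in ballots(t):
    print(b)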
I see @katrielalex posted a generator-using solution while I was posting mine. The difference between ours is that I'm using two separate generators, one for the individual tokens in the file and one for the specific data structure you wish to parse. The former is passed to the latter as a parameter, the basic idea being that you can write a function like ballots() for each of the data structures you wish to parse. You can either iterate over everything yielded by the generator, or call next() on either generator to get the next token or ballot (be prepared for a StopIteration exception when you run out, or else write the generators to generate a sentinel value such as None when they run out of real data, and check for that).
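Here's a rough sketch of the next() approach, in case it helps. The sample data is made up and written to a temporary file just for the demonstration; tokens() and ballots() are the same as above. Note that the built-in next() also accepts a default value, which it returns instead of raising StopIteration, so you can get the sentinel behavior without changing the generators at all.

```python
import os
import tempfile


def tokens(filename):
    with open(filename) as infile:
        for line in infile:
            for item in line.split():
                yield int(item)


def ballots(tokens):
    ballot = []
    for t in tokens:
        if t:
            ballot.append(t)
        else:
            yield ballot
            ballot = []


# Hypothetical sample data: two ballots, each terminated by a 0.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("3 1 2 0\n2 3 0\n")

t = tokens(path)
b = ballots(t)

first = next(b)    # pulls tokens until the first 0
second = next(b)   # pulls tokens until the second 0

# Option 1: catch StopIteration when the data runs out.
try:
    next(b)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: pass a default to next() and check for the sentinel.
third = next(b, None)

os.remove(path)

print(first)
print(second)
print(exhausted)
print(third)
```

With the sample data above, the first two calls return the two ballots, the third call raises StopIteration, and the call with a default returns None.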
It would be pretty straightforward to wrap the whole thing in a class. In fact...
class Parser(object):

    def __init__(self, filename):
        # The token generator is created once and shared by all parsing methods.
        def tokens(filename):
            with open(filename) as infile:
                for line in infile:
                    for item in line.split():
                        yield int(item)
        self.tokens = tokens(filename)

    def ballots(self):
        # Collect tokens into a list; a 0 token ends the current ballot.
        ballot = []
        for t in self.tokens:
            if t:
                ballot.append(t)
            else:
                yield ballot
                ballot = []

p = Parser("datafile.txt")
for b in p.ballots():
    print(b)
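One thing to keep in mind with the class version: self.tokens is a single generator created in __init__, so every call to ballots() reads from the same stream and picks up wherever the previous consumer stopped. A small sketch of that behavior, again using a temporary file with made-up data:

```python
import os
import tempfile


class Parser(object):

    def __init__(self, filename):
        def tokens(filename):
            with open(filename) as infile:
                for line in infile:
                    for item in line.split():
                        yield int(item)
        self.tokens = tokens(filename)

    def ballots(self):
        ballot = []
        for t in self.tokens:
            if t:
                ballot.append(t)
            else:
                yield ballot
                ballot = []


# Hypothetical sample data: three ballots, each terminated by a 0.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("3 1 2 0\n2 3 0\n1 0\n")

p = Parser(path)
first = next(p.ballots())   # consumes tokens up to the first 0
rest = list(p.ballots())    # a new ballots() call continues from there
again = list(p.ballots())   # the token stream is exhausted by now

os.remove(path)

print(first)
print(rest)
print(again)
```

So if you need to go through the file a second time, create a new Parser rather than calling ballots() again on the old one.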