I have a strange problem. I have a file of the format:
START
1
2
STOP
lllllllll
START
3
5
6
STOP
and I want to read the lines between START and STOP as blocks, and use my_f to process each block.
import itertools

def block_generator(file):
    with open(file) as lines:
        for line in lines:
            # strip() so the trailing newline does not break the comparison
            if line.strip() == 'START':
                # lazily yields lines up to (and consuming) the next STOP
                block = itertools.takewhile(lambda x: x.strip() != 'STOP', lines)
                yield block
and in my main function I tried to use map() to get the work done. It worked.
blocks = block_generator(file)
map(my_f, blocks)
will actually give me what I want. But when I tried the same thing with multiprocessing.Pool.map(), it gave me an error saying takewhile() expected 2 arguments but got 0:
import multiprocessing

blocks = block_generator(file)
p = multiprocessing.Pool(4)
p.map(my_f, blocks)
Is this a bug?
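Update: as I understand it now, the blocks yielded above are lazy takewhile iterators tied to the open file, and Pool.map pickles each item to send it to a worker process, which is where the reconstruction fails. A minimal sketch of a workaround (my own paraphrase, not necessarily the accepted answer verbatim) is to materialize each block into a plain list before yielding it:

import itertools

def block_generator(file):
    with open(file) as lines:
        for line in lines:
            if line.strip() == 'START':
                # yield a plain, picklable list instead of a lazy
                # takewhile iterator bound to the open file
                yield [l.strip() for l in
                       itertools.takewhile(lambda x: x.strip() != 'STOP', lines)]

Plain lists pickle fine, so p.map(my_f, blocks) runs. One caveat: Pool.map turns its whole iterable into a list before dispatching, so with over 1,000,000 blocks p.imap(my_f, blocks, chunksize=100) may be easier on memory (the chunksize is arbitrary).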
- The file has more than 1,000,000 blocks, each with fewer than 100 lines.
- I accepted the answer from untubu.
- But maybe I will simply split the file and use n instances of my original script without multiprocessing to process them, then cat the results together. That way I can never be wrong as long as the script works on a small file (a rough sketch is below).
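For the record, a minimal sketch of that splitting step. The split_file helper and the part_%d.txt naming are placeholders I made up; it deals whole START/STOP blocks round-robin into n part files (dropping the junk lines between blocks), and each part can then be processed by a separate instance of the original script:

import itertools

def split_file(file, n):
    # Hypothetical helper: deal whole START/STOP blocks round-robin
    # into n part files (part_0.txt .. part_{n-1}.txt).
    parts = [open('part_%d.txt' % i, 'w') for i in range(n)]
    try:
        block_no = 0
        with open(file) as lines:
            for line in lines:
                if line.strip() == 'START':
                    out = parts[block_no % n]
                    out.write(line)  # keep the START marker
                    for block_line in itertools.takewhile(
                            lambda x: x.strip() != 'STOP', lines):
                        out.write(block_line)
                    out.write('STOP\n')  # takewhile consumed the STOP line
                    block_no += 1
    finally:
        for f in parts:
            f.close()

One thing to keep in mind with this approach: round-robin dealing reorders the blocks across parts, so simply cat-ing the per-part outputs only makes sense if the result does not depend on block order.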