This is a question regarding how Storm's max spout pending works. I currently have a spout that reads a file and emits a tuple for each line in the file (I know Storm is not the best solution for dealing with files but I do not have a choice for this problem).
I set topology.max.spout.pending to 50k to throttle how many tuples enter the topology for processing. However, the setting appears to have no effect: every record in the file is emitted each time. My guess is that this is caused by a loop in my nextTuple() method that emits all records in the file.
My question is: Does Storm simply stop calling nextTuple() for the spout task once topology.max.spout.pending is reached? Does this mean I should emit only one tuple each time the method is called?
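
To make my guess concrete, here is a toy model in plain Java. It uses no Storm classes: ToySpout, emitAll, emitOne and peakPending are hypothetical names invented for illustration, and the driver loop only encodes my assumption that the pending count is checked between calls to nextTuple() and never during one. The "ack every second tick" rule is likewise an arbitrary stand-in for downstream processing.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class MaxPendingSketch {

    /** Hypothetical stand-in for a spout; NOT Storm's real spout interface. */
    interface ToySpout {
        boolean hasMore();
        int nextTuple(); // returns how many tuples this single call emitted
    }

    /** Emits every remaining line inside one nextTuple() call (my current loop). */
    static ToySpout emitAll(List<String> lines) {
        final Iterator<String> it = lines.iterator();
        return new ToySpout() {
            public boolean hasMore() { return it.hasNext(); }
            public int nextTuple() {
                int emitted = 0;
                while (it.hasNext()) { it.next(); emitted++; }
                return emitted;
            }
        };
    }

    /** Emits at most one line per nextTuple() call. */
    static ToySpout emitOne(List<String> lines) {
        final Iterator<String> it = lines.iterator();
        return new ToySpout() {
            public boolean hasMore() { return it.hasNext(); }
            public int nextTuple() {
                if (!it.hasNext()) return 0;
                it.next();
                return 1;
            }
        };
    }

    /**
     * Assumed behaviour: the caller skips nextTuple() while the number of
     * un-acked tuples is at the cap, and checks only between calls.
     * Returns the highest pending count seen.
     */
    static int peakPending(ToySpout spout, int maxPending) {
        int pending = 0;
        int peak = 0;
        for (long tick = 0; spout.hasMore() || pending > 0; tick++) {
            if (spout.hasMore() && pending < maxPending) {
                pending += spout.nextTuple();
                peak = Math.max(peak, pending);
            }
            // Simulated downstream: one tuple is acked every second tick.
            if (tick % 2 == 1 && pending > 0) pending--;
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> lines = Collections.nCopies(10, "line");
        System.out.println(peakPending(emitAll(lines), 3)); // prints 10
        System.out.println(peakPending(emitOne(lines), 3)); // prints 3
    }
}
```

If the assumption holds, the looping version overshoots the cap on its very first call, while the one-tuple-per-call version never exceeds it. As far as I understand, the real setting also only counts tuples emitted with a message ID, so an unanchored emit would not be throttled either way.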