I really like Python generators. In particular, I find that they are just the right tool for connecting to REST endpoints: my client code only has to iterate over the generator that is connected to the endpoint. However, I am finding one area where Python's generators are not as expressive as I would like. Typically, I need to filter the data I get out of the endpoint. In my current code, I pass a predicate function to the generator; it applies the predicate to each item it handles and yields the item only if the predicate returns True.
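To make the predicate-passing pattern concrete, here is a minimal sketch. The names `datasource`, `records`, and `predicate`, as well as the sample data, are hypothetical stand-ins for illustration, not the real endpoint code:

```python
def datasource(records, predicate=lambda record: True):
    """Yield only the records for which predicate(record) is true.

    `records` is a hypothetical stand-in for the decoded JSON items
    that the real generator would pull from the REST endpoint.
    """
    for record in records:
        if predicate(record):
            yield record


# Hypothetical sample data in place of a real endpoint response.
sample = [{"type": "external", "id": 1}, {"type": "internal", "id": 2}]

for record in datasource(sample, lambda r: r["type"] == "external"):
    print(record)
```

This works, but the filtering logic is tangled up with the data source instead of being a separate stage.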
I would like to move toward composition of generators, like data_filter(datasource()). Here is some demonstration code that shows what I have tried. It is pretty clear why it does not work; what I am trying to figure out is the most expressive way of arriving at the solution:
# Mock of a REST endpoint: in the actual code, the generator is
# connected to a REST endpoint that returns dictionaries (parsed from JSON).
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    for d in mock_data:
        yield d
# Mock of a filter: simplification, in reality I am filtering on some
# aspect of the data, like data['type'] == "external"
def data_filter(d):
    if len(d) < 8:
        yield d
# First try:
# for w in data_filter(mock_datasource()):
#     print(w)
# >> TypeError: object of type 'generator' has no len()
# Second try:
# for w in (data_filter(d) for d in mock_datasource()):
#     print(w)
# I don't get words out, but rather generator objects such as
# <generator object data_filter at 0x101106a40>
# Using a predicate to filter works, but is not the expressive
# composition I am after
for w in (d for d in mock_datasource() if len(d) < 8):
    print(w)
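For comparison, the kind of composition I am after might look something like the sketch below. It assumes the filter is rewritten to accept a whole iterable rather than a single item, which is a change from the `data_filter(d)` signature above. This is one possible shape, not necessarily the most expressive one:

```python
def mock_datasource():
    mock_data = ["sanctuary", "movement", "liberty", "seminar",
                 "formula", "short-circuit", "generate", "comedy"]
    yield from mock_data


def data_filter(source):
    """Consume any iterable and yield only items shorter than 8 characters."""
    for d in source:
        if len(d) < 8:
            yield d


# The filter now wraps the source, so the two stages compose directly.
for w in data_filter(mock_datasource()):
    print(w)
# prints: liberty, seminar, formula, comedy (one per line)
```

Because each stage takes an iterable and returns a generator, further stages can be stacked the same way. The built-in `filter(lambda d: len(d) < 8, mock_datasource())` gives the same lazy result without a custom generator.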