I'm looking for a way to run a regex over a (long) iterable of "characters" in Python. (Python doesn't have a character type, so it's really an iterable of length-one strings, but that amounts to the same thing.)
The re module only allows searching over strings (or buffers), as far as I can tell.
I could implement it myself, but that seems a little silly.
Alternatively, I could convert the iterable to a string and run the regex over that, but that gets hideously inefficient. As a worst-case example, re.search(".a", "".join('a' for a in range(10**8))) peaks at over 900 MB of RAM (private working set) on my x64 machine and takes ~12 seconds, even though the regex only needs to look at the first two characters of the iterable.
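For anyone who wants to reproduce the effect without tying up a machine, here is a scaled-down sketch of that measurement. It assumes tracemalloc is an acceptable stand-in for the working-set figure (it only counts allocations made through Python's allocators, so the numbers won't match what the OS reports), and measure and n are just names made up for illustration:

```python
import re
import time
import tracemalloc


def measure(n):
    """Join n one-character strings, then search the result.

    Returns (match, elapsed_seconds, peak_traced_bytes).
    """
    chars = ('a' for _ in range(n))  # stand-in for the real iterable
    tracemalloc.start()
    start = time.perf_counter()
    # The whole iterable is consumed and joined before the regex sees anything.
    match = re.search(".a", "".join(chars))
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return match, elapsed, peak


match, elapsed, peak = measure(10**6)  # hypothetical smaller size than 10**8
print(match.span(), f"{elapsed:.3f}s", f"{peak / 1e6:.1f} MB peak (traced)")
```

The match always ends after the second character, yet the traced peak grows with n, which is exactly the cost I'd like to avoid.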