Is there a Python class equivalent to Ruby's StringScanner class? I could hack something together, but I don't want to reinvent the wheel if this already exists.
7 Answers
Interestingly, there's an undocumented Scanner class in the re module:
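import re
# Each callback receives the scanner object and the matched text.
def s_ident(scanner, token): return token
def s_operator(scanner, token): return "op%s" % token
def s_float(scanner, token): return float(token)
def s_int(scanner, token): return int(token)
# Patterns are tried in order, so the float rule must precede the int rule.
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", s_ident),
    (r"\d+\.\d*", s_float),
    (r"\d+", s_int),
    (r"=|\+|-|\*|/", s_operator),
    (r"\s+", None),  # None: consume whitespace without emitting a token
])
print(scanner.scan("sum = 3*foo + 312.50 + bar"))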
Following the discussion, it looks like it was left in as experimental code or as a starting point for others.
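For what it's worth, scan returns a pair: the list of values produced by the callbacks, and whatever part of the input could not be matched. A small sketch of that, assuming the class keeps behaving as it does in current CPython (it is undocumented, so that is not guaranteed). The lexicon here is a trimmed-down, made-up variant of the one above:

```python
import re

# Trimmed-down lexicon; the lambdas are illustrative only.
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", lambda sc, tok: tok),
    (r"\d+", lambda sc, tok: int(tok)),
    (r"=", lambda sc, tok: "op" + tok),
    (r"\s+", None),  # None: match but emit nothing
])

# "?" matches no rule, so scanning stops there and the rest is handed back.
tokens, remainder = scanner.scan("x = 1 ? 2")
print(tokens)
print(remainder)
```

A non-empty remainder is the only signal that the input was not fully consumed, so it is worth checking.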

There is nothing exactly like Ruby's StringScanner in Python. It is of course easy to put something together:
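import re
class Scanner(object):
    def __init__(self, s):
        self.s = s
        self.offset = 0  # current position in the string
    def eos(self):
        # True once the whole string has been consumed.
        return self.offset == len(self.s)
    def scan(self, pattern, flags=0):
        # Accept either a pattern string or a compiled pattern.
        if isinstance(pattern, str):
            pattern = re.compile(pattern, flags)
        # Match only at the current offset; advance past the match on success.
        match = pattern.match(self.s, self.offset)
        if match is not None:
            self.offset = match.end()
            return match.group(0)
        return None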
Here is an example of using it interactively:
>>> s = Scanner("Hello there!")
>>> s.scan(r"\w+")
'Hello'
>>> s.scan(r"\s+")
' '
>>> s.scan(r"\w+")
'there'
>>> s.eos()
False
>>> s.scan(r".*")
'!'
>>> s.eos()
True
>>>
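Ruby's StringScanner also has methods such as check (match without advancing) and scan_until (consume everything up to and including the next match). A rough sketch of how those could be added, assuming the same offset-based design as above. The method names mirror Ruby's, but the behaviour here is only my approximation of it:

```python
import re

class Scanner:
    def __init__(self, s):
        self.s = s
        self.offset = 0

    def scan(self, pattern):
        # Match at the current offset and advance past the match.
        match = re.compile(pattern).match(self.s, self.offset)
        if match is None:
            return None
        self.offset = match.end()
        return match.group(0)

    def check(self, pattern):
        # Like scan, but leave the offset untouched (a lookahead).
        match = re.compile(pattern).match(self.s, self.offset)
        return None if match is None else match.group(0)

    def scan_until(self, pattern):
        # Search forward from the offset; return everything up to and
        # including the match, and advance past it.
        match = re.compile(pattern).search(self.s, self.offset)
        if match is None:
            return None
        start, self.offset = self.offset, match.end()
        return self.s[start:self.offset]


s = Scanner("key: value")
peeked = s.check(r"\w+")     # looks at "key" without consuming it
key = s.scan_until(r":\s*")  # consumes through the colon and the space
value = s.scan(r"\w+")       # consumes "value"
print(peeked, repr(key), value)
```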
However, for the work I do, I tend to just write those regular expressions in one go and use groups to extract the needed fields. For something more complicated, I would write a one-off tokenizer or look to PyParsing or PLY to tokenize for me. I don't see myself using something like StringScanner.
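To illustrate the "one regular expression with groups" approach, here is a minimal sketch. The "name = number" input format and the ASSIGNMENT pattern are made up for the example:

```python
import re

# Hypothetical input format: "name = number".
ASSIGNMENT = re.compile(
    r"(?P<name>[a-zA-Z_]\w*)\s*=\s*(?P<value>\d+(?:\.\d*)?)"
)

match = ASSIGNMENT.match("total = 312.50")
name = match.group("name")            # the identifier on the left
value = float(match.group("value"))   # the number on the right
print(name, value)
```

Everything is extracted in a single match call, so there is no scanner state to carry around.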

Looks like a variant on re.split(pattern, string).
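For comparison, a quick sketch of what re.split gives you. Note that it processes the whole string in one call rather than advancing step by step, so it only covers the simpler cases:

```python
import re

text = "Hello there!"

# Separators are dropped by default.
words = re.split(r"\s+", text)

# A capturing group in the pattern keeps the separators in the result.
with_seps = re.split(r"(\s+)", text)

print(words)
print(with_seps)
```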

https://pypi.python.org/pypi/scanner/
This seems to be a more maintained and feature-complete solution, but it uses Oniguruma directly.

Maybe look into the built-in module tokenize. It looks like you can pass a string into it using the StringIO module (io.StringIO in Python 3).
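A minimal sketch of that idea for Python 3, using io.StringIO to supply the readline callable that tokenize.generate_tokens expects. Keep in mind that tokenize only understands Python source syntax, so this helps only when the input happens to look like Python:

```python
import io
import token
import tokenize

source = "sum = 3*foo + 312.50 + bar"
readline = io.StringIO(source).readline

# Keep only names, numbers and operators; skip NEWLINE/ENDMARKER bookkeeping.
wanted = {token.NAME, token.NUMBER, token.OP}
tokens = [
    (token.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(readline)
    if tok.type in wanted
]

for kind, text in tokens:
    print(kind, text)
```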

Today there is a project by Mark Watkinson that implements StringScanner in Python:
http://asgaard.co.uk/p/Python-StringScanner

Are you looking for regular expressions in Python? Check this link from the official docs:
