There is no equivalent of fscanf
or Java's Scanner
. The simplest solution is to require the user to use newline separeted input instead of space separated input, you can then read line by line and convert the lines to the correct type.
If you want the user to provide more structured input then you probably should create a parser for the user input. There are some nice parsing libraries for python, for example pyparsing. There is also a scanf
module, even though the last update is of 2008.
If you don't want to have external dependencies then you can use regexes to match the input sequences. Certainly regexes require to work on strings, but you can easily overcome this limitation reading in chunks. For example something like this should work well most of the time:
import re
FORMATS_TYPES = {
'd': int,
'f': float,
's': str,
}
FORMATS_REGEXES = {
'd': re.compile(r'(?:\s|\b)*([+-]?\d+)(?:\s|\b)*'),
'f': re.compile(r'(?:\s|\b)*([+-]?\d+\.?\d*)(?:\s|\b)*'),
's': re.compile(r'\b(\w+)\b'),
}
FORMAT_FIELD_REGEX = re.compile(r'%(s|d|f)')
def scan_input(format_string, stream, max_size=float('+inf'), chunk_size=1024):
"""Scan an input stream and retrieve formatted input."""
chunk = ''
format_fields = format_string.split()[::-1]
while format_fields:
fields = FORMAT_FIELD_REGEX.findall(format_fields.pop())
if not chunk:
chunk = _get_chunk(stream, chunk_size)
for field in fields:
field_regex = FORMATS_REGEXES[field]
match = field_regex.search(chunk)
length_before = len(chunk)
while match is None or match.end() >= len(chunk):
chunk += _get_chunk(stream, chunk_size)
if not chunk or length_before == len(chunk):
if match is None:
raise ValueError('Missing fields.')
break
text = match.group(1)
yield FORMATS_TYPES[field](text)
chunk = chunk[match.end():]
def _get_chunk(stream, chunk_size):
try:
return stream.read(chunk_size)
except EOFError:
return ''
Example usage:
>>> s = StringIO('1234 Hello World -13.48 -678 12.45')
>>> for data in scan_input('%d %s %s %f %d %f', s): print repr(data)
...
1234
'Hello'
'World'
-13.48
-678
12.45
You'll probably have to extend this, and test it properly but it should give you some ideas.